Foreword


Health and biomedicine are in the midst of revolutionary change. 
Health care, mental health, and public health are converging as discov- 
ery science reveals these traditional “silos” share biologic pathways and 
collaborative management demonstrates better outcomes. Health care 
reimbursement is increasingly framed in terms of paying for outcomes 
achieved through value-based purchasing and population health man- 
agement. Individuals are more engaged in their health and wellness 
decisions, using personal biomedical monitoring devices and testing ser- 
vices and engaging in citizen science. Systems biology is revealing the 
complex interactions among a person’s genome, microbiome, immune 
system, neurologic system, social factors, and environment. Novel bio- 
markers and therapeutics exploit these interactions. 

These advances are fueled by digitization and generation of data at 
an unprecedented scale. The volume of health care data has multiplied 
8 times since 2013 and is projected to grow at a compound annual rate 
of 36% between 2018 and 2025¹. The rate of growth of biomedical
research data is comparable². When you consider recent estimates that
socioeconomics, health behaviors, and environment—factors outside of 
the domain of health care and biomedicine—contribute as much as 80% 
to health outcomes³, the variety and scale of health-related data are
breathtaking. 

Biomedical informatics provides the scientific basis for making sense 
of these data—methods and tools to structure, mine, visualize, and rea- 
son with data and information. Biomedical informatics also provides 
the scientific basis for incorporating data and information into effective 
workflows—techniques to link people, process, and technology into sys- 
tems; methods to evaluate systems and technology components; and 
methods to facilitate system-level change. 

Biomedical informatics grew out of efforts to understand biomedical 
reasoning⁴, such as artificial intelligence; to develop medical systems,
such as multiphasic screening⁵; and to write computer programs to solve
clinical problems, such as diagnosis and treatment of acid-base disorders⁶.
By the late 1970s, “medical informatics” was used interchangeably
with “computer applications in medical care”. As computer programs 
were written for various allied health disciplines, nursing informatics, 
dental informatics, and public health informatics emerged. The 1980s 
saw the emergence of computational biology for applications such as 
scientific visualization and bioinformatics to support tasks such as 
DNA sequence analysis. 

Biomedical Informatics: Computer Applications in Health Care and 
Biomedicine provided the first comprehensive guide to the field with its 
first edition in 1990. That edition and the subsequent three have served 
as the core syllabus for introductory courses in informatics and as a 
reference source for those seeking advanced training or working in the 
field. The fifth edition carries on the tradition with new topics, a
comprehensive glossary, reading lists, and citations.

I encourage people who are considering formal education in
biomedical informatics to use this book to sample the field. The book’s 
framework provides a guide for educators from junior high to graduate 
school as they design introductory courses in biomedical informatics. It 
is the basic text for students entering the field. 

With digitization and data driving change across the health and bio- 
medicine ecosystem, everyone in the ecosystem will benefit from reading 
Biomedical Informatics and using it as a handbook to guide their work. 
The following is a sample of questions readers can turn to the book to 
explore: 
• Practicing health professionals: How do I recognize an information
  need? How do I quickly scan and filter information to answer a
  question? How do I sense the fitness of the information to answer my
  question? How do I configure my electronic health record to focus
  my attention and save time? How do I recognize when to override
  decision support? How do I analyze data from my practice to identify
  learning and improvement opportunities? How do I engage with
  patients outside of face-to-face encounters?

• Quality improvement teams: How might we detect if the outcome
  we are trying to improve is changing in the desired direction? Are
  data available in our operational systems that are fit for that purpose?
  What combination of pattern detection algorithm, workflow process,
  decision support, and training might work together to change the
  outcome? How can we adapt operational processes and systems to
  test the change and to scale if it proves effective?

• Discovery science teams: How do data about biological systems
  differ from data about physical systems? How do we decide when to
  use integrative analytic approaches and when to use reductionist
  approaches? How much context do we need to keep about data we
  create and how do we structure the metadata? How do we optimize
  compute and storage platforms? How might we leverage electronic
  health record-derived phenotype to generate hypotheses?

• Artificial intelligence researchers or health “app” developers: What
  health outcome am I trying to change? Do I need a detection,
  prediction, or classification algorithm? What sources of data might
  be fit for that purpose? What type of intervention might change the
  outcome? Who would be the best target for the intervention? What is
  the best place in their workflow to incorporate the intervention?

• Health system leaders: How do we restructure team roles and
  electronic health record workflows to reduce clinician burnout and
  improve care quality? How do we take advantage of technology-enabled
  self-management and virtual visits to increase adherence
  and close gaps in care? How do we continuously evaluate evidence
  and implement or de-implement guidelines and decision support
  across our system? How do we leverage technology to deploy
  context-sensitive just-in-time learning across our system?

• Health policy makers: How might we enhance health information
  privacy and security and reduce barriers to using data for population
  health, health care quality improvement, and discovery? To what
  degree is de-identification a safeguard? What combination of legislative
  mandate, executive action, and industry-driven innovation will
  accelerate health data interoperability and business agility? How
  might federal and state governments enable communities to access
  small area data to inform their collective action to improve community
  health and well-being?


You have taken the first step in exploring these frontiers by picking up 
this book. Enjoy! 


Preface to the Fifth Edition


The world of biomedical research and health care has changed remark- 
ably in the 30 years since the first edition of this book was published. So 
too has the world of computing and communications and thus the 
underlying scientific issues that sit at the intersections among biomedi- 
cal science, patient care, public health, and information technology. It is 
no longer necessary to argue that it has become impossible to practice 
modern medicine, or to conduct modern biological research, without 
information technologies. Since the initiation of the Human Genome 
Project three decades ago, life scientists have been generating data at a 
rate that defies traditional methods for information management and 
data analysis. 

Health professionals also are constantly reminded that a large per- 
centage of their activities relates to information management—for 
example, obtaining and recording information about patients, consult- 
ing colleagues, reading and assessing the scientific literature, planning 
diagnostic procedures, devising strategies for patient care, interpreting 
results of laboratory and radiologic studies, or conducting case-based 
and population-based research. Artificial intelligence, “big data,” and 
data science are having unprecedented impact on the world, with the 
biomedical field a particularly active and visible component of such 
activity. 

It is complexity and uncertainty, plus society’s overriding concern for 
patient well-being, and the resulting need for optimal decision making, 
that set medicine and health apart from many other information- 
intensive fields. Our desire to provide the best possible health and health 
care for our society gives a special significance to the effective organiza- 
tion and management of the huge bodies of data with which health 
professionals and biomedical researchers must deal. It also suggests the 
need for specialized approaches and for skilled scientists who are knowl- 
edgeable about human biology, clinical care, information technologies, 
and the scientific issues that drive the effective use of such technologies 
in the biomedical context. 


Information Management in Biomedicine 


The clinical and research influence of biomedical-computing systems is 
remarkably broad. Clinical information systems, which provide com- 
munication and information-management functions, are now installed 
in essentially all health care institutions. Physicians can search entire 
drug indexes in a few seconds, using the information provided by a com- 
puter program to anticipate harmful side effects or drug interactions. 
Electrocardiograms (ECGs) are typically analyzed initially by computer 
programs, and similar techniques are being applied for interpretation of 
pulmonary-function tests and a variety of laboratory and radiologic 
abnormalities. Devices with embedded processors routinely monitor 
patients and provide warnings in critical-care settings, such as the 
intensive-care unit (ICU) or the operating room. Both biomedical
researchers and clinicians regularly use computer programs to search 
the medical literature, and modern clinical research would be severely 
hampered without computer-based data-storage techniques and statisti- 
cal analysis systems. Machine learning methods and artificial intelli- 
gence are generating remarkable results in medical settings. These have 
attracted attention not only from the news media, patients, and clini- 
cians but also from health system leaders and from major corporations 
and startup companies that are offering new approaches to patient care 
and health information management. Advanced decision-support tools 
also are emerging from research laboratories, are being integrated with 
patient-care systems, and are beginning to have a profound effect on the 
way medicine is practiced. 

Despite this extensive use of computers in health care settings and 
biomedical research, and a resulting expansion of interest in learning 
more about biomedical computing, many life scientists, health-science 
students, and professionals have found it difficult to obtain a compre- 
hensive and rigorous, but nontechnical, overview of the field. Both 
practitioners and basic scientists are recognizing that thorough prepara- 
tion for their professional futures requires that they gain an understand- 
ing of the state of the art in biomedical computing, of the current and 
future capabilities and limitations of the technology, and of the way in 
which such developments fit within the scientific, social, and financial 
context of biomedicine and our health care system. In turn, the future 
of the biomedical-computing field will be largely determined by how 
well health professionals and biomedical scientists are prepared to guide 
and to capitalize upon the discipline’s development. 

This book is intended to meet this growing need for such well- 
equipped professionals. The first edition appeared in 1990 (published by 
Addison-Wesley) and was used extensively in courses on medical infor- 
matics throughout the world (in some cases with translations to other 
languages). It was updated with a second edition (published by Springer) 
in 2000, responding to the remarkable changes that occurred during the 
1990s, most notably the Human Genome Project and the introduction 
of the World Wide Web with its impact on adoption and acceptance of 
the Internet. The third edition (again published by Springer) appeared 
in 2006, reflecting the ongoing rapid evolution of both technology and 
health- and biomedically related applications, plus the emerging govern- 
ment recognition of the key role that health information technology 
would need to play in promoting quality, safety, and efficiency in patient 
care. With that edition the title of the book was changed from Medical 
Informatics to Biomedical Informatics, reflecting (as is discussed in 
Chap. 1) both the increasing breadth of the basic discipline and the
evolving new name for academic units, societies, research programs, and 
publications in the field. The fourth edition (published by Springer in 
2014) followed the same conceptual framework for learning about the 
science that underlies applications of computing and communications 
technology in biomedicine and health care, for understanding the state 
of the art in computer applications in clinical care and biology, for cri- 
tiquing existing systems, and for anticipating future directions that the 
field may take. 

In many respects, the fourth edition was very different from its predecessors,
however. Most importantly, it reflected the remarkable changes
in computing and communications that continued to occur, most nota- 
bly in communications, networking, and health information technology 
policy, and the exploding interest in the role that information technol- 
ogy must play in systems integration and the melding of genomics with 
innovations in clinical practice and treatment. Several new chapters 
were introduced and most of the remaining ones underwent extensive 
revision. 

In this fifth edition, we have found that two previous single-chapter 
topics have expanded to warrant two complementary chapters, specifi- 
cally Cognitive Science (split into Cognitive Informatics and Human- 
Computer Interaction, Usability, and Workflow) and Consumer Health 
Informatics and Personal Health Records (split into Personal Health 
Informatics and mHealth and Applications). There is a new chapter on 
precision medicine, which has emerged in the past 6 years as a unique 
area of special interest. Those readers who are familiar with the first 
four editions will find that the organization and philosophy are essen- 
tially unchanged (although bioinformatics, as a set of methodologies, is 
now considered a “recurrent theme” rather than an “application”), but 
the content is either new or extensively updated.¹

This book differs from other introductions to the field in its broad 
coverage and in its emphasis on the field’s conceptual underpinnings 
rather than on technical details. Our book presumes no health- or 
computer-science background, but it does assume that you are inter- 
ested in a comprehensive domain summary that stresses the underlying 
concepts and that introduces technical details only to the extent that 
they are necessary to meet the principal goal. Recent specialized texts 
are available to cover the technical underpinnings of many topics in this 
book; many are cited as suggested readings throughout the book, or are 
cited in the text for those who wish to pursue a more technical exposure 
to a topic. 


Overview and Guide to Use of This Book 


This book is written as a text so that it can be used in formal courses, but 
we have adopted a broad view of the population for whom it is intended. 
Thus, it may be used not only by students of medicine and of the other 
health professions but also as an introductory text by future biomedical 
informatics professionals, as well as for self-study and for reference by 
practitioners, including those who are pursuing formal board certification
in clinical informatics (as is discussed in more detail later in this
“Preface”). The book is probably too detailed for use in a 2- or 3-day
continuing-education course, although it could be introduced as a reference
for further independent study.

1 As with the first four editions, this book has tended to draw both its examples and
its contributors from North America. There is excellent work in other parts of the
world as well, although variations in health care systems, and especially financing,
do tend to change the way in which systems evolve from one country to the next.
The basic concepts are identical, however, so the book is intended to be useful in
educational programs in other parts of the world as well.

Our principal goal in writing this text is to teach concepts in biomedical 
informatics—the study of biomedical information and its use in decision 
making—and to illustrate them in the context of descriptions of represen- 
tative systems that are in use today or that taught us lessons in the past. 
As you will see, biomedical informatics is more than the study of comput- 
ers in biomedicine, and we have organized the book to emphasize that 
point. Chapter 1 first sets the stage for the rest of the book by providing
a glimpse of the future, defining important terms and concepts, describ- 
ing the content of the field, explaining the connections between biomedi- 
cal informatics and related disciplines, and discussing the forces that have 
influenced research in biomedical informatics and its integration into 
clinical practice and biological research. 

Broad issues regarding the nature of data, information, and knowl- 
edge pervade all areas of application, as do concepts related to optimal 
decision making. Chapters 2 and 3 focus on these topics but mention
computers only in passing. They serve as the foundation for all that follows.
Chapters 4 and 5 on cognitive science issues enhance the discussions
in Chaps. 2 and 3, pointing out that decision making and
behavior are deeply rooted in the ways in which information is processed 
by the human mind. Key concepts underlying system design, human- 
computer interaction, patient safety, educational technology, and deci- 
sion making are introduced in these chapters. 

Chapter 6 introduces the central notions of software engineering
that are important for understanding the applications described later. 
We have dropped a chapter from previous editions that dealt broadly 
with system architectures, networking, and computer-system design. 
This topic is more about engineering than informatics, it changes rapidly,
and there are excellent books to which students can turn if they
need more information.

Chapter 7 summarizes the issues of standards development, focusing
in particular on data exchange and issues related to sharing of clinical
data. This important and rapidly evolving topic warrants inclusion
given the evolution of the health information exchange, institutional 
system integration challenges, federal government directives, and the 
increasingly central role of standards in enabling clinical systems to 
have their desired influence on health care practices. 

Chapter 8 addresses a topic of increasing practical relevance in
both the clinical and biological worlds: natural language understanding 
and the processing of biomedical texts. The importance of these meth- 
ods is clear when one considers the amount of information contained in 
free-text notes or reports (either dictated and transcribed or increasingly 
created using speech-understanding systems) or in the published bio- 
medical literature. Even with efforts to encourage structured data entry 
in clinical systems, there will likely always be an important role for tech- 
niques that allow computer systems to extract meaning from natural 
language documents. 

Chapter 9 recognizes that bioinformatics is not just an application
area but rather a fundamental area of study. The chapter introduces
many of the concepts and analytical tools that underlie modern computational
approaches to the management of human biological data, especially
in areas such as genomics and proteomics. Applications of
bioinformatics related to human health and disease later appear in a
chapter on “Translational Bioinformatics” (Chap. 26).

Chapter 10 is a comprehensive introduction to the conceptual underpinnings
of biomedical and clinical image capture, analysis, interpretation,
and use. This overview of the basic issues and imaging modalities serves as
background for Chap. 22, which deals with imaging applications issues,
highlighted in the world of radiological imaging and image management 
(e.g., in picture archiving and communication systems). 

Chapter 11 considers personal health informatics not as a set of
applications (which are covered in Chap. 19), but as introductory
concepts that relate to this topic, such as notions of the digital self and 
the digital divide, patient-generated health data, and how a focus on the 
patient (or on healthy individuals) affects both the person and the field 
of biomedical informatics. 

Chapter 12 addresses the key legal and ethical issues that have
arisen when health information systems are considered. Then, in
Chap. 13, the challenges associated with technology assessment and
with the evaluation of clinical information systems are introduced. 

Chapters 14-28 (which include two new chapters in this edition,
one on mHealth and another on precision medicine) survey
many of the key biomedical areas in which informatics methods are 
being used. Each chapter explains the conceptual and organizational 
issues in building that type of system, reviews the pertinent history, and 
examines the barriers to successful implementations. 

Chapter 29 reprises and updates a chapter that was new in the
fourth edition, providing a summary of the rapidly evolving policy 
issues related to health information technology. Although the emphasis 
is on US government policy, there is some discussion of issues that 
clearly generalize both to states (in the USA) and to other countries. 

The book concludes in Chap. 30 with a look to the future: a
vision of how informatics concepts, computers, and advanced commu- 
nication devices one day may pervade every aspect of biomedical 
research and clinical practice. Rather than offering a single point of 
view developed by a group of forward thinkers, as was offered in the 
fourth edition, we have invited seven prominent and innovative thinkers 
to contribute their own views. We integrate these seven perspectives
(representing clinical medicine, nursing, health policy, translational
bioinformatics, academic informatics, the information technology
industry, and the federal government) into a chapter in which the editors
first consider how an analysis of the past helps to inform the future of
this dynamic field and then synthesize the contributors' views.


The Study of Computer Applications in Biomedicine


The actual and potential uses of computers in health care and biomedicine
form a remarkably broad and complex topic. However, just as you
do not need to understand how a telephone or an ATM works
to make good use of it and to tell when it is functioning poorly, we
believe that technical biomedical-computing skills are not needed by 
health workers and life scientists who wish simply to become effective 
users of evolving information technologies. On the other hand, such 
technical skills are of course necessary for individuals with career com- 
mitment to developing information systems for biomedical and health 
environments. Thus, this book will neither teach you to be a program- 
mer nor show you how to fix a broken computer (although it might 
motivate you to learn how to do both). It also will not tell you about 
every important biomedical-computing system or application; we shall 
use an extensive bibliography included with each chapter to direct you 
to a wealth of literature where review articles and individual project 
reports can be found. We describe specific systems only as examples that 
can provide you with an understanding of the conceptual and organiza- 
tional issues to be addressed in building systems for such uses. Examples 
also help to reveal the remaining barriers to successful implementations. 
Some of the application systems described in the book are well estab- 
lished, even in the commercial marketplace. Others are just beginning to 
be used broadly in biomedical settings. Several are still largely confined 
to the research laboratory. 

Because we wish to emphasize the concepts underlying this field, we 
generally limit the discussion of technical implementation details. The 
computer-science issues can be learned from other courses and other 
textbooks. One exception, however, is our emphasis on the details of 
decision science as they relate to biomedical problem solving (Chaps.
3 and 24). These topics generally are not presented in computer-science 
courses, yet they play a central role in the intelligent use of biomedical 
data and knowledge. Sections on medical decision making and 
computer-assisted decision support accordingly include more technical 
detail than you will find in other chapters. 

All chapters include an annotated list of “Suggested Readings” to 
which you can turn if you have a particular interest in a topic, and 
there is a comprehensive set of references with each chapter. We use 
boldface print to indicate the key terms of each chapter; the definitions 
of these terms are included in the “Glossary” at the end of the book. 
Because many of the issues in biomedical informatics are conceptual, 
we have included “Questions for Discussion” at the end of each chap- 
ter. You will quickly discover that most of these questions do not have 
“right” answers. They are intended to illuminate key issues in the field 
and to motivate you to examine additional readings and new areas of 
research. 

It is inherently limiting to learn about computer applications solely 
by reading about them. We accordingly encourage you to complement 
your studies by seeing real systems in use—ideally by using them your- 
self. Your understanding of system limitations and of what you would 
do to improve a biomedical-computing system will be greatly enhanced 
if you have had personal experience with representative applications. Be 
aggressive in seeking opportunities to observe and use working systems. 

In a field that is changing as rapidly as biomedical informatics is, it is 
difficult ever to feel that you have knowledge that is completely current. 
However, the conceptual basis for study changes much more slowly than
do the detailed technological issues. Thus, the lessons you learn from 
this volume will provide you with a foundation on which you can 
continue to build in the years ahead. 


The Need for a Course in Biomedical Informatics 


A suggestion that new courses are needed in the curricula for students 
of the health professions is generally not met with enthusiasm. If any- 
thing, educators and students have been clamoring for reduced lecture 
time, for more emphasis on small group sessions, and for more free time 
for problem solving and reflection. Yet, in recent decades, many studies 
and reports have specifically identified biomedical informatics, includ- 
ing computer applications, as an area in which new educational oppor- 
tunities need to be developed so that physicians and other health 
professionals will be better prepared for clinical practice. As early as 
1984, the Association of American Medical Colleges (AAMC) recom- 
mended the formation of new academic units in biomedical informatics 
in our medical schools, and subsequent studies and reports have contin- 
ued to stress the importance of the field and the need for its inclusion in 
the educational environments of health professionals. 

The reason for this strong recommendation is clear: The practice of 
medicine is inextricably entwined with the management of information. In 
the past, practitioners handled medical information through resources 
such as the nearest hospital or medical-school library; personal collec- 
tions of books, journals, and reprints; files of patient records; consulta- 
tion with colleagues; manual office bookkeeping; and (all-too-often 
flawed) memorization. Although these techniques continue to be vari- 
ably valuable, information technology is offering new methods for find- 
ing, filing, and sorting information: online bibliographic retrieval 
systems, including full-text publications; personal computers, laptops, 
tablets, and smart phones, with database software to maintain personal 
information and commonly used references; office-practice and clinical 
information systems and EHRs to capture, communicate, and preserve 
key elements of the health record; information retrieval and consulta- 
tion systems to provide assistance when an answer to a question is 
needed rapidly; practice-management systems to integrate billing and 
receivable functions with other aspects of office or clinic organization; 
and other online information resources that help to reduce the pressure 
to memorize in a field that defies total mastery of all but its narrowest 
aspects. With such a pervasive and inevitable role for computers in clin- 
ical practice, and with a growing failure of traditional techniques to deal 
with the rapidly increasing information-management needs of practitio- 
ners, it has become obvious to many people that an essential topic has 
emerged for study in schools and clinical training programs (such as 
residencies) that train medical and other health professionals. 

What is less clear is how the subject should be taught in medical 
schools or other health professional degree programs, and to what 
extent it should be left for postgraduate education. We believe that topics
in biomedical informatics are best taught and learned in the context
of health-science training, which allows concepts from both the health 
sciences and informatics science to be integrated. Biomedical-computing 
novices are likely to have only limited opportunities for intensive study 
of the material once their health-professional training has been com- 
pleted, although elective opportunities for informatics rotations are now 
offered to residents in many academic medical centers. 

The format of biomedical informatics education has evolved as fac- 
ulty members have been hired to carry out informatics research and to 
develop courses at more health-science schools, and as the emphasis on 
lectures as the primary teaching method continues to diminish. Com- 
puters will be used increasingly as teaching tools and as devices for com- 
munication, problem solving, and data sharing among students and 
faculty. Indeed, the recent COVID-19 pandemic has moved many tradi- 
tional medical teaching experiences from the classroom to online teach- 
ing environments using video conferencing and on-demand access to 
course materials. Such experiences do not teach informatics (unless that 
is the topic of the course), but they have rapidly engaged both faculty 
and students in technology-intensive teaching and learning experiences. 
The acceptance of computing, and dependence upon it, has already 
influenced faculty, trainees, and curriculum committees. This book is 
designed to be used in a traditional introductory course, whether taught 
online or in a classroom, although the “Questions for Discussion” also 
could be used to focus conversation in small seminars and working 
groups. Integration of biomedical informatics topics into clinical expe- 
riences has also become more common. The goal is increasingly to pro- 
vide instruction in biomedical informatics whenever this field is most 
relevant to the topic the student is studying. This aim requires educa- 
tional opportunities throughout the years of formal training, supple- 
mented by continuing-education programs after graduation. 

The goal of integrating biomedicine and biomedical informatics is to 
provide a mechanism for increasing the sophistication of health profes- 
sionals, so that they know and understand the available resources. They 
also should be familiar with biomedical computing’s successes and fail- 
ures, its research frontiers, and its limitations, so that they can avoid 
repeating the mistakes of the past. Study of biomedical informatics also 
should improve their skills in information management and problem 
solving. With a suitable integration of hands-on computer experience, 
computer-mediated learning, courses in clinical problem solving, and 
study of the material in this volume, health-science students will be well 
prepared to make effective use of computational tools and information 
management in health care delivery. 


The Need for Specialists in Biomedical Informatics 


As mentioned, this book also is intended to be used as an introductory 
text in programs of study for people who intend to make their profes- 
sional careers in biomedical informatics. If we have persuaded you that 
a course in biomedical informatics is needed, then the requirement for
trained faculty to teach the courses will be obvious. Some people might
argue, however, that a course on this subject could be taught by a com- 
puter scientist who had an interest in biomedical computing, or by a 
physician or biologist who had taken a few computing courses. Indeed, 
in the past, most teaching—and research—has been undertaken by fac- 
ulty trained primarily in one of the fields and later drawn to the other. 
Today, however, schools have come to realize the need for professionals 
trained specifically at the interfaces among biomedicine, biomedical 
informatics, and related disciplines such as computer science, statistics, 
cognitive science, health economics, and medical ethics. 

This book outlines a first course for students training for careers in 
the biomedical informatics field. We specifically address the need for an 
educational experience in which computing and information-science 
concepts are synthesized with biomedical issues regarding research, 
training, and clinical practice. It is the integration of the related disci- 
plines that originally was lacking in the educational opportunities avail- 
able to students with career interests in biomedical informatics. Schools 
are establishing such courses and training programs in growing num- 
bers, but their efforts have been constrained by a lack of faculty who 
have a broad familiarity with the field and who can develop curricula for 
students of the health professions as well as of informatics itself. 

The increasing introduction of computing techniques into biomedi- 
cal environments requires that well-trained individuals be available not 
only to teach students but also to design, develop, select, and manage 
the biomedical-computing systems of tomorrow. There is a wide range 
of context-dependent computing issues that people can appreciate only 
by working on problems defined by the health care setting and its con- 
straints. The field’s development has been hampered because there are 
relatively few trained personnel to design research programs, to carry 
out the experimental and developmental activities, and to provide aca- 
demic leadership in biomedical informatics. A frequently cited problem 
is the difficulty a health professional (or a biologist) and a technically 
trained computer scientist experience when they try to communicate 
with one another. The vocabularies of the two fields are complex and 
have little overlap, and there is a process of acculturation to biomedicine 
that is difficult for computer scientists to appreciate through distant 
observation. Thus, interdisciplinary research and development projects 
are more likely to be successful when they are led by people who can 
effectively bridge the biomedical and computing fields. Such profession- 
als often can facilitate sensitive communication among program person- 
nel whose backgrounds and training differ substantially. 

Hospitals and health systems have begun to learn that they need such 
individuals, especially with the increasing implementation of, and 
dependence upon, EHRs and related clinical systems. The creation of a 
Chief Medical Information Officer (CMIO) position has now become a common
innovation. As the concept became popular, however, questions arose 
about how to identify and evaluate candidates for such key institutional 
roles. The need for some kind of suitable certification process became 
clear—one that would require individuals to demonstrate both formal 
training and the broad skills and knowledge that were required. Thus,
the American Medical Informatics Association (AMIA) and its members
began to develop plans for a formal certification program. For physicians,
the most meaningful approach was to create a formal medical
subspecialty in clinical informatics. Working with the American Board 
of Preventive Medicine and the parent organization, the American 
Board of Medical Specialties (ABMS), AMIA helped to obtain approval 
for a subspecialty board that would allow medical specialists, with 
board certification in any ABMS specialty (such as pediatrics, internal 
medicine, radiology, pathology, preventive medicine) to pursue subspe- 
cialty board certification in clinical informatics. This proposal was ulti- 
mately approved by the ABMS in 2011, and the board examination was 
first administered in 2013.² After a period during which currently active
clinical informatics physician experts could sit for their clinical infor- 
matics boards, board eligibility now requires a formal fellowship in 
clinical informatics. This is similar to the fellowship requirement for 
other subspecialties such as cardiology, nephrology, and the like. Many 
health care institutions now offer formal clinical informatics fellowships 
for physicians who have completed a residency in one of the almost 30 
ABMS specialties. These individuals are now often turning to this vol- 
ume as a resource to help them to prepare for their board examinations. 

It is exciting to be working in a field that is maturing and that is hav- 
ing a beneficial effect on society. There is ample opportunity remaining 
for innovation as new technologies evolve and fundamental computing 
problems succumb to the creativity and hard work of our colleagues. In 
light of the increasing sophistication and specialization required in 
computer science in general, it is hardly surprising that a new discipline 
should arise at that field’s interface with biomedicine. This book is dedi- 
cated to clarifying the definition and to nurturing the effectiveness of 
that discipline: biomedical informatics. 


Edward H. Shortliffe 
New York, NY, USA 


James J. Cimino 
Birmingham, AL, USA 


Michael F. Chiang 
Bethesda, MD, USA 
June 2020 


2 AMIA is currently developing a Health Informatics Certification program (AHIC) 
for individuals who seek professional certification in health-related informatics but 
are not physicians or are otherwise not eligible to take the ABMS board
certification exam. https://www.amia.org/ahic (Accessed June 10, 2020).

colleagues Larry Fagan and Gio Wiederhold, and we decided to com-
informatics. As it turned out, none of us predicted the enormity of the
task we were about to undertake. Our challenge was to create a multiau- 
thored textbook that captured the collective expertise of leaders in the 
field yet was cohesive in content and style. The concept for the book was 
first developed in 1982. We had begun to teach a course on computer 
applications in health care at Stanford’s School of Medicine and had 
quickly determined that there was no comprehensive introductory text 
on the subject. Despite several published collections of research descrip- 
tions and subject reviews, none had been developed to meet the needs of 
a rigorous introductory course. The thought of writing a textbook was 
daunting due to the diversity of topics. None of us felt that he was suf- 
ficiently expert in the full range of important subjects for us to write the 
book ourselves. Yet we wanted to avoid putting together a collection of 
disconnected chapters containing assorted subject reviews. Thus, we 
decided to solicit contributions from leaders in the pertinent fields but 
to provide organizational guidelines in advance for each chapter. We 
also urged contributors to avoid writing subject reviews but, instead, to 
focus on the key conceptual topics in their field and to pick a handful of 
examples to illustrate their didactic points. 

As the draft chapters began to come in, we realized that major edit- 
ing would be required if we were to achieve our goals of cohesiveness 
and a uniform orientation across all the chapters. We were thus delighted 
when, in 1987, Leslie Perreault, a graduate of our informatics training 
program, assumed responsibility for reworking the individual chapters 
to make an integral whole and for bringing the project to completion. 
The final product, published in 1990, was the result of many compro- 
mises, heavy editing, detailed rewriting, and numerous iterations. We 
were gratified by the positive response to the book when it finally 
appeared, and especially by the students of biomedical informatics who 
have often come to us at scientific meetings and told us about their 
appreciation of the book. 

As the 1990s progressed, however, we began to realize that, despite 
our emphasis on basic concepts in the field (rather than a survey of 
existing systems), the volume was beginning to show its age. A great deal 
had changed since the initial chapters were written, and it became clear 
that a new edition would be required. The original editors discussed the 
project and decided that we should redesign the book, solicit updated 
chapters, and publish a new edition. Leslie Perreault by this time was a 
busy Director at First Consulting Group in New York City and would 
not have as much time to devote to the project as she had when we did 
the first edition. With trepidation, in light of our knowledge of the work 
that would be involved, we embarked on the new project. 

As before, the chapter authors did a marvelous job, trying to meet 
our deadlines, putting up with editing changes that were designed to
bring a uniform style to the book, and contributing excellent chapters
that nicely reflected the changes in the field during the preceding decade. 

No sooner had the second edition appeared in print in 2000 than we 
started to get inquiries about when the next update would appear. We 
began to realize that the maintenance of a textbook in a field such as 
biomedical informatics was nearly a constant, ongoing process. By this 
time I had moved to Columbia University and the initial group of edi- 
tors had largely disbanded to take on other responsibilities, with Leslie 
Perreault no longer available. Accordingly, as plans for a third edition 
began to take shape, my Columbia colleague Jim Cimino joined me as 
the new associate editor, whereas Drs. Fagan, Wiederhold, and Per- 
reault continued to be involved as chapter authors. Once again the 
authors did their best to try to meet our deadlines as the third edition 
took shape. This time we added several chapters, attempting to cover 
additional key topics that readers and authors had identified as being 
necessary enhancements to the earlier editions. We were once again 
extremely appreciative of all the authors’ commitment and for the excel- 
lence of their work on behalf of the book and the field. 

Predictably, it was only a short time after the publication of the third 
edition in 2006 that we began to get queries about a fourth edition. We 
resisted for a year or two, but it became clear that the third edition was 
becoming rapidly stale in some key areas and that there were new topics 
that were not in the book and needed to be added. With that in mind we, 
in consultation with Grant Weston from Springer’s offices in London, 
agreed to embark on a fourth edition. Progress was slowed by my pro- 
fessional moves (to Phoenix, Arizona, then Houston, Texas, and then 
back to New York) with a very busy 3-year stint as President and CEO 
of the American Medical Informatics Association. Similarly, Jim 
Cimino left Columbia to assume new responsibilities at the NIH Clini- 
cal Center in Bethesda, MD. With several new chapters in mind, and the 
need to change authors of some of the existing chapters due to retire- 
ments (this too will happen, even in a young field like informatics), we 
began working on the fourth edition, finally completing the effort with 
publication in early 2014. 

Now, seven years later, we are completing the fifth edition of the vol- 
ume. It was not long after the publication of the fourth edition that we 
began to get requests for a new edition that would include many of the 
new and emerging topics that had not made it into the 2014 publication. 
With the introduction of new chapters, major revisions to previous 
chapters, and some reordering of authors or introduction of new ones, 
we have attempted to assure that this new edition will fill the necessary 
gaps and engage our readers with its currency and relevance. As Jim 
Cimino (now directing the Informatics Institute at the University of 
Alabama in Birmingham) and I considered the development of this edi- 
tion, we realized that we were not getting any younger and it would be 
wise to craft a succession plan so that others could handle the inevitable 
requests for a sixth and subsequent editions. We were delighted when 
Michael Chiang agreed to join us as an associate editor, coauthoring 
three chapters and becoming fully involved in the book’s philosophy 
and the editing tasks involved. Michael was a postdoctoral informatics
trainee at Columbia when we were both there on the faculty. A well-known
pediatric ophthalmologist, he is now balancing his clinical career
book into the future as Jim and I (both of whom view the book as a
Michael was named director of the National Eye Institute at NIH,
gist, researcher, and informatician.

For this edition we owe particular gratitude to Elektra McDermott, 
our developmental editor, whose rigorous attention to detail has been 
crucial given the size and the complexity of the undertaking. At Springer 
we have been delighted to work once again with Grant Weston, Execu- 
tive Editor in their Medicine and Life Sciences division, who has been 
extremely supportive despite our missed deadlines. And I want to offer 
my sincere personal thanks to Jim Cimino, who has been a superb and 
talented collaborator in this effort for the last three editions. Without his 
hard work and expertise, we would still be struggling to complete the 
massive editing job associated with this now very long manuscript.

Learning Objectives

After reading this chapter, you should know the answers to these questions:

- Why is information and knowledge management a central issue in
  biomedical research, clinical practice, and public health?
- What are integrated information management environments, and how are
  they affecting the practice of medicine, the promotion of health, and
  biomedical research?
- What do we mean by the terms biomedical informatics, medical computer
  science, medical computing, clinical informatics, nursing informatics,
  bioinformatics, public health informatics, and health informatics?
- What is translational research, why is it being heavily promoted and
  supported, how does it depend on translational bioinformatics and
  clinical research informatics, and how do these all relate to
  precision medicine?
- Why should health professionals, life scientists, and students of the
  health professions learn about biomedical informatics concepts and
  informatics applications?
- How has the development of modern computing technologies and the
  Internet changed the nature of biomedical computing?
- How is biomedical informatics related to clinical practice, public
  health, biomedical engineering, molecular biology, decision science,
  information science, and computer science?
- How does information in clinical medicine and health differ from
  information in the basic sciences?
- How can changes in computer technology and the financing of health
  care influence the integration of biomedical computing into clinical
  practice?


1.1 The Information Revolution 
Comes to Medicine 


After scientists had developed the first digital 
computers in the 1940s, society was told that 
these new machines would soon be serving 
routinely as memory devices, assisting with 
calculations and with information retrieval. 
Within the next decade, physicians and other 
health professionals had begun to hear about 
the dramatic effects that such technology 
would have on clinical practice. 

More than seven decades of remarkable 
progress in computing have followed those 
early predictions, and many of the original 
prophecies have come to pass. Stories regarding
the “information revolution”, “artificial
intelligence”, and “big data” fill our newspa- 
pers and popular magazines, and today’s chil- 
dren show an uncanny ability to make use of 
computers (including their handheld mobile 
versions) as routine tools for study, communi- 
cation, and entertainment. Similarly, clinical 
workstations have been available on hospital 
wards and in outpatient offices for decades, 
and in some settings have been supplanted by 
mobile tablets with wireless connectivity. 

Not long ago, the health care system was 
perceived as being slow to understand infor- 
mation technology and slow to exploit it for its 
unique practical and strategic functionalities. 
This is no longer the case. The enormous tech- 
nological advances of the last four decades— 
personal computers and graphical interfaces, 
laptop machines, new methods for human- 
computer interaction, innovations in mass 
storage of data (both locally and in the 
“cloud”), mobile devices, personal health- 
monitoring devices, the Internet, wireless com- 
munications, social media, and more—have all 
combined to make use of computers by health 
workers and biomedical scientists part of 
today’s routine. This new world is already 
upon us, but its greatest influence is yet to 
come as today’s prominent innovations such as
electronic health records and decision-support
software are further refined. This book will 
teach you about our present resources and 
accomplishments, and about gaps that need to 
be addressed in the years ahead. 

When one considers today’s penetration of 
computers and communication into our daily 
lives, it is remarkable that the first personal 
computers were introduced as recently as the 
late 1970s; local area networking has been 
available only since the 1980s; the World Wide 
Web dates only to the early 1990s; and smart 
phones, social networking, tablet computers, 
wearable devices, and wireless communication 
are even more recent. This dizzying rate of 
change, combined with equally pervasive and 
revolutionary changes in almost all interna- 
tional health care systems, makes it difficult 
for public-health planners and health- 
institutional managers to try to deal with both 
issues at once. 

As new technologies have been introduced 
and adopted in health settings, unintended 
consequences have emerged, such as ransom- 
ware and other security challenges that can 
compromise the protection and privacy of 
patient data. Yet many observers now believe 
that rapid changes in both technology and 
health systems are inextricably related. We 
can see that planning for the new health care 
environments of the coming decades requires 
a deep understanding of the role that infor- 
mation technology is likely to play in those 
environments. 

What might that future hold for the typical 
practicing clinician? As we discuss in detail in
Chap. 14, no applied clinical computing
topic is gaining more attention currently than 
is the issue of electronic health records 
(EHRs). Health care organizations have 
largely replaced their paper-based recording 
systems, recognizing that they need to have 
digital systems in place that create opportuni- 
ties to facilitate patient care that is safe and 
effective, to answer questions that are crucially
important for strategic planning, to support
a better understanding of how they and
their providers compare with other organiza- 
tions in their local or regional competitive 
environment, and to support reporting to 
regulatory agencies. 

In the past, administrative and financial 
data were the major elements required for 
planning, but in recent years comprehensive 
clinical data have also become important for 
institutional self-analysis and strategic plan- 
ning. Furthermore, the inefficiencies and frus- 
trations associated with the use of paper-based 
medical records are well accepted (Dick and 
Steen 1991 (Revised 1997)), especially when 
inadequate access to clinical information is 
one of the principal barriers that clinicians 
encounter when trying to increase their effi- 
ciency in order to meet productivity goals for 
their practices. 


1.1.1 Integrated Access to Clinical 
Information 


Encouraged by health information technology 
(HIT) vendors (and by the US government, as 
is discussed later), most health care institu- 
tions have or are developing integrated 
computer-based information-management 
environments. These underlie a clinical world 
in which computational tools assist not only 
with patient-care matters (e.g., reporting 
results of tests, allowing direct entry of orders 
or patient information by clinicians, facilitat- 
ing access to transcribed reports, and in some 
cases supporting telemedicine applications or 
decision-support functions) but also with 
administrative and financial topics (e.g., 
tracking of patients within the hospital, man- 
aging materials and inventory, supporting 
personnel functions, and managing the pay- 
roll), with research (e.g., analyzing the out- 
comes associated with treatments and 
procedures, performing quality assurance, 
supporting clinical trials, and implementing 
various treatment protocols), with access to
scholarly information (e.g., accessing digital
libraries, supporting bibliographic search, and 
providing access to drug information data- 
bases), and even with office automation (e.g., 
providing access to spreadsheets and 
document-management software). The key 
idea, however, is that at the heart of the evolv- 
ing integrated environments lies an electronic 
health record that is intended to be accessible, 
confidential, secure, acceptable to clinicians 
and patients, and integrated with other types 
of useful information to assist in planning 
and problem solving. 


1.1.2 Today’s Electronic Health 
Record (EHR) Environment 


The traditional paper-based medical record is 
now recognized as being woefully inadequate 
for meeting the needs of modern medicine. It 
arose in the nineteenth century as a highly 
personalized “lab notebook” that clinicians 
could use to record their observations and 
plans so that they could be reminded of perti- 
nent details when they next saw the same 
patient. There were no regulatory require- 
ments, no assumptions that the record would 
be used to support communication among 
varied providers of care, and few data or test 
results to fill up the record’s pages. The record 
that met the needs of clinicians a century or so 
ago struggled mightily to adjust over the 
decades and to accommodate to new require- 
ments as health care and medicine changed. 
Today the inability of paper charts to serve 
the best interests of the patient, the clinician, 
and the health system is no longer questioned 
(see Chaps. 14 and 16).

Most organizations have found it challeng- 
ing (and expensive) to move to a paperless, 
electronic clinical record. This observation 
forces us to ask the following questions: 
“What is a health record in the modern world? 
Are the available products and systems well 
matched with the modern notions of a com- 
prehensive health record? Do they meet the 
needs of individual users as well as the health 
systems themselves? Are they efficient, easy to 
use, and smoothly integrated into clinical 
workflow? How should our concept of the
comprehensive health record evolve in the
future, as technology creates unprecedented 
opportunities for innovation?” 

The complexity associated with automat- 
ing clinical-care records is best appreciated if 
one analyzes the processes associated with the 
creation and use of such records rather than 
thinking of the record as a physical object 
(such as the traditional paper chart) that can 
be moved around as needed within the institu- 
tion. For example, on the input side 
(Fig. 1.1), an electronic version of the
paper chart requires the integration of pro- 
cesses for data capture and for merging infor- 
mation from diverse sources. 

The contents of the paper record were tra- 
ditionally organized chronologically—often a 
severe limitation when a clinician sought to 
find a specific piece of information that could 
occur almost anywhere within the chart. To be 
useful, the electronic record system has to 
make it easy to access and display needed 
data, to analyze them, and to share them 
among colleagues and with secondary users 
of the record who are not involved in direct 
patient care (Fig. 1.2). Thus, the EHR, as
an adaptation of the paper record, is best 
viewed not as an object, or a product, but 
rather as a set of processes that an organiza- 
tion puts into place, supported by technology 
(Fig. 1.3).

Implementing electronic records is inher- 
ently a systems-integration task. It accordingly 
requires a custom-tailored implementation at 
each institution, given the differences in exist- 
ing systems and practices that must be suit- 
ably integrated. Joint development and local 
adaptation are crucial, which implies that the 
institutions that purchase such systems must 
have local expertise that can oversee and facil- 
itate an effective implementation process, 
including elements of process re-engineering 
and cultural change that are inevitably 
involved. 

Experience has shown that clinicians are 
“horizontal” users of information technology 
(Greenes and Shortliffe 1990). Rather than 
becoming “power users” of a narrowly defined 
software package, they tend to seek broad 
functionality across a wide variety of systems 
and resources. Thus, routine use of comput- 
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O Fig.1.1 Inputs to the clinical-care record. The tradi- 
tional paper record was created by a variety of organiza- 
tional processes that captured varying types of 
information (notes regarding direct encounters between 
health professionals and patients, laboratory or radio- 


ers, and of EHRs, is most easily achieved 
when the computing environment offers a 
critical mass of functionality that makes the 
system both smoothly integrated with work- 
flow and useful for essentially every patient 
encounter. 

The arguments for automating clinical- 
care records are summarized in > Chaps. 2 
and 14 and in the now classic Institute of 
Medicine’s report on computer-based patient 
records (CPRs) (Dick and Steen 1991 (Revised 
1997).' One argument that warrants emphasis 
is the importance of the EHR in supporting 
clinical trials—experiments in which data 
from specific patient interactions are pooled 


1 The Institute of Medicine, part of the National 
Academy of Sciences, is now known as the National 
Academy of Medicine. 


logic results, reports of telephone calls or prescriptions, 
and data obtained directly from patients). The paper 
record thus was a merged collection of such data, gener- 
ally organized in chronological order 


and analyzed in order to learn about the safety 
and efficacy of new treatments or tests and to 
gain insight into disease processes that are not 
otherwise well understood. Medical research- 
ers were constrained in the past by clumsy 
methods for identifying patients who met 
inclusion criteria for clinical trials as well as 
acquiring the data needed for the trials, gener- 
ally relying on manual capture of information 
onto datasheets that were later transcribed 
into computer databases for statistical analy- 
sis (B Fig. 1.4). The approach was labor- 
intensive, fraught with opportunities for error, 
and added to the high costs associated with 
randomized prospective research protocols. 
The use of EHRs has offered many advan- 
tages to those carrying out clinical research 
(see Chap. 27). Most obviously, it helps to
eliminate the manual task of extracting data
from charts or filling out specialized datasheets.
The data needed for a study can often
be derived directly from the EHR, thus making
much of what is required for research data
collection simply a by-product of routine clinical
record keeping (Fig. 1.5). Other advantages
accrue as well. For example, the record
environment can help to ensure compliance
with a research protocol, pointing out to a clinician
when a patient is eligible for a study or
when the protocol for a study calls for a specific
management plan given the currently
available data about that patient. We are also
seeing the development of novel authoring
environments for clinical trial protocols that
can help to ensure that the data elements
needed for the trial are compatible with the
local EHR’s conventions for representing
patient descriptors.


Fig. 1.2 Outputs from the clinical-care record. Once
information was collected in the traditional paper chart,
it needed to be provided to a wide variety of potential
users of the information that it contained. These users
included health professionals and the patients themselves,
as well as “secondary users” (represented here by
the individuals in business suits) who had valid reasons
for accessing the record but who were not involved with
direct patient care. Numerous providers are typically
involved in a patient’s care, so the chart also served as a
means for communicating among them. The traditional
mechanisms for displaying, analyzing, and sharing
information from such records resulted from a set of
processes that often varied substantially across several
patient-care settings and institutions
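
As a minimal sketch of how a record environment might point out to a clinician that a patient is eligible for a study, the following Python fragment checks a patient record against a protocol's entry criteria. The `Patient` and `TrialCriteria` structures, their field names, and the criteria themselves are hypothetical and chosen only for illustration; they do not correspond to any particular EHR product or real protocol.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Patient:
    patient_id: str
    birth_date: date
    diagnoses: set          # hypothetical local diagnosis labels
    enrolled_trials: set    # identifiers of trials the patient is already in


@dataclass
class TrialCriteria:
    trial_id: str
    min_age: int
    max_age: int
    required_diagnosis: str
    excluded_diagnoses: set


def age_on(birth_date, today):
    """Age in whole years on the given day."""
    years = today.year - birth_date.year
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def eligibility_reminder(patient, trial, today):
    """Return a reminder string if the patient appears eligible, else None."""
    if trial.trial_id in patient.enrolled_trials:
        return None  # already enrolled, so no reminder is needed
    age = age_on(patient.birth_date, today)
    if not trial.min_age <= age <= trial.max_age:
        return None
    if trial.required_diagnosis not in patient.diagnoses:
        return None
    if patient.diagnoses & trial.excluded_diagnoses:
        return None  # at least one exclusion criterion applies
    return (f"Patient {patient.patient_id} may be eligible for "
            f"{trial.trial_id}; please review the protocol.")
```

In this sketch the reminder is only a prompt for review; the decision to enroll remains with the clinician and the patient.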


Note that Fig. 1.5 represents a study at a
single institution and often for a limited subset 
of the patients who receive care there. Yet 
much research is carried out with very large 
numbers of patients, such as within a regional 
health care system, statewide, or nationally. 
Accordingly, the size of research datasets can 
get very large, but analyzing across them intro- 
duces challenges related to data exchange and 
the standardization of the ways in which indi- 
vidual data elements are defined, identified, or 
stored (see Chap. 8). Retrospective studies
on data collected in the past typically cannot 
assume a prior standardization of the elements 
that will be needed, thereby requiring analyses 
that infer relationships among specific descrip- 
tors in different institutions represented in dif- 
ferent ways. When the number of data elements
is large, and the population being studied is
also vast, the challenges are often described as
“big data” analytics (James et al. 2013).


Fig. 1.3 Complex processes demanded of the record.
As shown in Figs. 1.1 and 1.2, the paper chart evolved to
become the incarnation of a complex set of organizational
processes, which both gathered information to be shared
and then distributed that information to those who had
valid reasons for accessing it. Yet paper-based documents
were severely limited in meeting the diverse requirements for
data collection and information access that are implied by
this diagram. These deficiencies accounted in large part for
the effort to create today’s electronic health records


Another theme in the changing world of
health care is the increasing investment in the
creation of standard order sets, clinical guidelines,
and clinical pathways (see Chap. 24),
generally in an effort to reduce practice variability
and to develop consensus approaches to
recurring management problems. Several government
and professional organizations, as
well as individual provider groups, have
invested heavily in guideline development,
often putting an emphasis on using clear evidence
from the literature, rather than expert
opinion alone, as the basis for the advice.
Despite the success in creating such evidence-based
guidelines, there is a growing recognition
that we need better methods for delivering the
decision logic to the point of care. Guidelines
that appear in monographs or journal articles
tend to sit on shelves, unavailable when the 
knowledge they contain would be most valu- 
able to practitioners. Computer-based tools for 
implementing such guidelines, and integrating 
them with the EHR, present a means for mak- 
ing high-quality advice available in the routine 
clinical setting. Many organizations are 
accordingly integrating decision-support tools 
with their EHR systems (see Chaps. 14 and
24), and there are highly visible commercial
efforts underway to provide computer-based
diagnostic decision support to practitioners.²
2 https://ehrintelligence.com/news/top-clinical-decision-support-system-cdss-companies-by-ambulatory-inpatient;
https://www.ibm.com/watson/health/. (Accessed 5/29/19).


Fig. 1.4 Traditional data collection for clinical trials.
Until the introduction of EHRs and similar systems,
the gathering of research data for clinical studies was
typically a manual task. Physicians who cared for
patients enrolled in trials, or their research assistants,
would be asked to fill out special datasheets for later
transcription into computer databases. Alternatively,
data managers were often hired to abstract the relevant
data from the paper chart. The trials were generally
designed to define data elements that were required and
the methods for analysis, but it was common for the process
of collecting those data in a structured format to be
left to manual processes at the point of patient care


Fig. 1.5 Role of electronic health records (EHRs) in
supporting clinical trials. With the introduction of EHR
systems, the collection of much of the research data for
clinical trials can become a by-product of the routine care
of the patients. Research data may be analyzed directly
from the clinical data repository, or a secondary research
database may be created by downloading information from
the online patient records. The manual processes in
Fig. 1.4 are thereby largely eliminated. In addition, the
interaction of the physician with the EHR permits two-way
communication, which can greatly improve the quality and
efficiency of the clinical trial. Physicians can be reminded
when their patients are eligible for an experimental protocol,
and the computer system can also remind the clinicians
of the rules that are defined by the research protocol,
thereby increasing compliance with the experimental plan


There are at least five major issues that
have consistently constrained our efforts to
build effective EHRs: (1) the need for standards
in the area of clinical terminology; (2)
concerns regarding data privacy, confidential- 
ity, and security; (3) challenges in data entry 
by physicians; (4) difficulties associated with 
the integration of record systems with other 
information resources in the health care setting;
and (5) designing and delivering systems
that are efficient, acceptable to clinicians, and
intuitive to use. The first of these issues is discussed
in detail in Chap. 7, and privacy is
one of the central topics in Chap. 12. Issues
of direct data entry by clinicians are discussed
in Chaps. 2 and 14 and throughout many
other chapters as well. Chapter 15 examines
the fourth topic, focusing on recent trends
in networked data integration, and offers solu- 
tions for the ways in which the EHR can be 
better joined with other relevant information 
resources and clinical processes, especially 
within communities where patients may have 
records with multiple providers and health 
care systems (Yasnoff et al. 2013). Finally, 
issues of the interface between computers and 
clinicians (or other users), with a cognitive 
emphasis, are the subject of Chap. 5.


1.1.3 Anticipating the Future 
of Electronic Health Records 


One of the first instincts of software develop- 
ers is to create an electronic version of an 
object or process from the physical world. 
Some familiar notion provides the inspiration 
for a new software product. Once the software 
version has been developed, however, human 
ingenuity and creativity often lead to an evo- 
lution that extends the software version far 
beyond what was initially contemplated. The 
computer can thus facilitate paradigm shifts 
in how we think about such familiar concepts. 

Consider, for example, the remarkable dif- 
ference between today’s office automation 
software and the typewriter, which was the 
original inspiration for the development of 
“word processors”. Although the early word 
processors were designed largely to allow 
users to avoid retyping papers each time a 
minor change was made to a document, the 


document-management software of today 
bears little resemblance to a typewriter. 
Consider all the powerful desktop-publishing 
facilities, integration of figures, spelling cor- 
rection, grammar aids, “publishing” online, 
collaboration on individual documents by 
multiple users, etc. Similarly, today’s spread- 
sheet programs bear little resemblance to the 
tables of numbers that we once created on 
graph paper. To take an example from the 
financial world, consider automatic teller 
machines (ATMs) and their facilitation of 
today’s worldwide banking in ways that were 
never contemplated when the industry 
depended on human bank tellers. 

It is accordingly logical to ask what the 
health record will become after it has been 
effectively implemented on computer systems 
and new opportunities for its enhancement 
become increasingly clear to us. It is clear that 
EHRs a decade from now will be remarkably 
different from the antiquated paper folders 
that used to dominate our health care envi- 
ronments. We might similarly predict that the 
state of today’s EHR is roughly comparable 
to the status of commercial aviation in the 
1930s. By that time air travel had progressed 
substantially from the days of the Wright 
Brothers, and air travel was becoming com- 
mon. But 1930s air travel seems archaic by 
modern standards, and it is logical to assume 
that today’s EHRs, albeit much better than 
both paper records and the early computer- 
based systems of the 1960s and 1970s, will be 
greatly improved and further modernized in 
the decades ahead. 

If people had failed to use the early air- 
planes for travel, the quality and efficiency of 
airplanes and air travel would not have 
improved as they have. A similar point can be 
made about the importance of committing 
to the use of EHRs today, even though we 
know that they need to be much better in the 
future. We must also commit to assuring that 
those improvements are made, which sug- 
gests a dynamic interaction and interdepen- 
dency among the researchers who address 
limitations in EHRs and their underlying 
methods and philosophy, the EHR companies
that currently exist or will arise in the
future, and the users who identify require- 
ments and areas for improvement. These 
companies must look to creative researchers, 
both within their own companies and in aca- 
demia, who will forge the changes that will 
encourage EHR users to embrace and appre- 
ciate the technology much more than they 
often do today. 


1.2 Communications Technology
and Health Data Integration


An obvious opportunity for changing the role 
and functionality of clinical-care records in 
the digital age is the power and ubiquity of 
the Internet. The Internet began in the late 1960s as a
U.S. research activity funded by the Advanced 
Research Projects Agency (ARPA) of the 
Department of Defense. Initially known as 
the ARPANET, the network began as a novel 
mechanism for allowing a handful of defense- 
related mainframe computers, located mostly 
at academic institutions or in the research 
facilities of military contractors, to share data 
files with each other and to provide remote 
access to computing power at other locations. 
The notion of electronic mail arose soon 
thereafter, and machine-to-machine electronic 
mail exchanges quickly became a major com- 
ponent of the network’s traffic. As the tech- 
nology matured, its value for nonmilitary 
research activities was recognized, and by 
1973 the first medically related research com- 
puter had been added to the network 
(Shortliffe 1998a, 2000). 

During the 1980s, the technology began 
to be developed in other parts of the world, 
and the National Science Foundation took 
over the task of running the principal high- 
speed backbone network in the United 
States. Hospitals, mostly academic centers, 
began to be connected to what had by then 
become known as the Internet, and in a 
major policy move it was decided to allow 
commercial organizations to join the net- 
work as well. By April 1995, the Internet in 
the United States had become a fully com- 


mercialized operation, no longer depending 
on the U.S. government to support even the 
major backbone connections. Today, the 
Internet is ubiquitous, worldwide, and accessible
through mobile wireless devices, and it has
provided the invisible but mandatory infra- 
structure for social, political, financial, sci- 
entific, corporate, and entertainment 
ventures. Many people point to the Internet 
as a superb example of the facilitating role 
of federal investment in promoting innova- 
tive technologies. The Internet is a major 
societal force that arguably would never 
have been created if the research and devel- 
opment, plus the coordinating activities, had 
been left to the private sector. 

The explosive growth of the Internet did 
not occur until the late 1990s, when the World 
Wide Web (which had been conceived initially 
by the physics community as a way of using 
the Internet to share preprints with photo- 
graphs and diagrams among researchers) was 
introduced and popularized. Navigating the 
Web is highly intuitive, requires no special 
training, and provides a mechanism for access 
to multimedia information that accounts for 
its remarkable growth as a worldwide phe- 
nomenon. It is also accessible by essentially all 
digital devices—computers, tablets, smart 
phones, and a plethora of personal monitors 
and “smart home” tools—which is a tribute to 
its design and its compatibility with newer 
networking technologies, such as Bluetooth 
and Wi-Fi. 

The societal impact of this communica- 
tions phenomenon cannot be overstated, 
especially given the international connectivity 
that has grown phenomenally in the past two 
decades. Countries that once were isolated 
from information that was important to citi- 
zens, ranging from consumers to scientists to 
those interested in political issues, are now 
finding new options for bringing timely infor- 
mation to the desktop machines and mobile 
devices of individuals with an Internet con- 
nection. 

There has in turn been a major upheaval 
in the telecommunications industry, with 
companies that used to be in different businesses
(e.g., cable television, Internet services,
and telephone) now finding that their activi- 
ties and technologies have merged. In the 
United States, legislation was passed in 1996 
to allow new competition to develop and new 
industries to emerge. We have subsequently 
seen the merging of technologies such as 
cable television, telephone, networking, and 
satellite communications. High-speed lines 
into homes and offices are widely available, 
wireless networking is ubiquitous, and inex- 
pensive mechanisms for connecting to the 
Internet without using conventional comput- 
ers (e.g., using cell phones or set-top boxes) 
have also emerged. The impact on everyone 
has been great and hence it is affecting the 
way that individuals seek health-related infor- 
mation while also enhancing how patients 
can gain access to their health care providers 
and to their clinical data. 

The Internet also has exhibited unin- 
tended consequences, especially in the world 
of social media, which has created opportuni- 
ties for promoting political unrest, social 
shaming, and dissemination of falsehoods. In 
the world of health care, the Internet has cre- 
ated opportunities for attacks on personal 
privacy, even while facilitating socially valu- 
able exchanges of data among institutions 
and individuals. Many of these practical, 
legal, and ethical challenges are the subject of
Chap. 12.

Just as individual hospitals and health care 
systems have come to appreciate the impor- 
tance of integrating information from multi- 
ple clinical and administrative systems within 
their organizations (see Chap. 16), health
planners and governments now appreciate the
need to develop integrated information
resources that combine clinical and health
data from multiple institutions within regions,
and ultimately nationally (see Chaps. 15
and 18). As you will see, the Internet and the
role of digital communications have therefore
become a major part of modern medicine and
health. Although this topic recurs in essen- 
tially every chapter in this book, we introduce 
it in the following sections because of its 
importance to modern technical issues and 
policy directions. 


1.2.1 A Model of Integrated 
Disease Surveillance³


To emphasize the role that the nation’s net- 
working infrastructure is playing in integrat- 
ing clinical data and enhancing care delivery, 
consider one example of how disease surveil- 
lance, prevention, and care are increasingly 
being influenced by information and commu- 
nications technology. The goal is to create an 
information-management infrastructure that 
will allow all clinicians, regardless of practice 
setting (hospitals, emergency rooms, small 
offices, community clinics, military bases, 
multispecialty groups, etc.) to use EHRs in 
their practices both to assist in patient care 
and to provide patients with counsel on illness 
prevention. The full impact of this use of elec- 
tronic resources will occur when data from all 
such records are pooled in regional and 
national registries or surveillance databases 
(Fig. 1.6), mediated through secure con-
nectivity with the Internet. The challenge, of 
course, is to find a way to integrate data from 
such diverse practice settings, especially since 
there are multiple vendors and system devel- 
opers active in the marketplace, competing to 
provide value-added capabilities that will 
excite and attract the practitioners for whom 
their EHR product is intended. 

The need to pool and integrate clinical data
from such diverse resources and systems
highlights the practical issues that must be
addressed to achieve such functionality.
Interestingly, most of
the barriers are logistical, political, and finan- 
cial rather than technical in nature: 
= Encryption of data: Concerns regarding
privacy and data protection require that
Internet transmission of clinical 
information occur only if those data are 
encrypted, with an established mechanism 
for identifying and authenticating 
individuals before they are allowed to 
decrypt the information for surveillance or 
research use. 


3 This section is adapted from a discussion that origi- 
nally appeared in (Shortliffe and Sondik 2004). 


4 https://www.theverge.com/2019/4/4/18293817/cybersecurity-hospitals-health-care-scan-simulation
(Accessed 5/29/19).


[Figure 1.6 diagram labels: Provider, EHR, Internet, Regional and
national registries and surveillance databases, Protocols and
guidelines for standards of care, Different vendors]

Fig. 1.6 A future vision of surveillance databases, in
which clinical data are pooled in regional and national
registries or repositories through a process of data submission
that occurs over the Internet (with attention to
privacy and security concerns as discussed in the text).
When information is effectively gathered, pooled, and
analyzed, there are significant opportunities for feeding
back the results of derived insights to practitioners at
the point of care. Thus the arrows indicate a bidirectional
process. See also Chap. 15


= Protection of stored clinical data: Even
when data are stored within an institution,
there are opportunities for attack over the
Internet, which can be an affront to patient
privacy or, equally seriously, an
opportunity for installing malware within
an institution, resulting in rogue uses of
data or even a lockout of valid users from
crucially important functions or data.
Cybersecurity has accordingly become a
major topic of concern for health care
institutions and other practice settings.⁴

= HIPAA-compliant policies: The privacy
and security rules that resulted from the
1996 Health Insurance Portability and
Accountability Act (HIPAA) do not
prohibit the pooling and use of such data,
but they do lay down policy rules and
technical security practices that must be
part of the solution in achieving the vision
we are discussing here.

= Standards for data transmission and
sharing: Sharing data over networks
requires that all developers of EHRs and
clinical databases adopt a single set of
standards for communicating and
exchanging information. The major 
enabling standard for such sharing, 
Health Level 7 (HL7), was introduced 
decades ago and, after years of work, has
been widely adopted, implemented,
and utilized. However, a uniform 
“envelope” for digital communication, 
such as HL7, does not assure that the 
contents of such messages will be 
understood or standardized. The pooling 
and integration of data requires the 
adoption of standards for clinical 
terminology and potentially for the 
schemas used to store clinical information 
in databases. Thus true interoperability 
of such systems requires additional 
standards to be adopted, many of which
are discussed in Chap. 7.

= Quality control and error checking: Any 
system for accumulating, analyzing, and 
utilizing clinical data from diverse sources 
must be complemented by a rigorous 
approach to quality control and error 
checking. It is crucial that users have faith 
in the accuracy and comprehensiveness of 
the data that are collected in such 
repositories, because policies, guidelines, 
and a variety of metrics can be derived 
over time from such information.

= Regional and national registries and
surveillance databases: Any adoption of the
model in Fig. 1.6 will require mechanisms
for creating, funding, and maintaining the
regional and national databases or registries
that are involved (see Chap. 15). The
growing amount of data that can be
gathered in this way is naturally viewed as
part of the “big data” problem that has 
characterized modern data science. The 
role of state and federal governments in 
gathering and curating such databases will 
need to be clarified, and the political issues 
addressed (including the concerns of some 
members of the populace that any 
government role in managing or analyzing 
their health data may have societal 
repercussions that threaten individual 
liberties, employability, and the like). 
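
The distinction drawn above between a shared "envelope" and shared content, together with the need for quality control, can be illustrated with a small Python sketch. It assumes two hypothetical sites that label the same laboratory concepts with different local codes; the code maps, concept names, and plausibility ranges are invented for illustration and are not drawn from any real terminology or reference values. Units are assumed to be identical across sites, which would not hold in general.

```python
# Hypothetical local-code-to-shared-concept maps, one per submitting site.
SITE_CODE_MAPS = {
    "site_a": {"GLU": "glucose", "HGB": "hemoglobin"},
    "site_b": {"glucose_serum": "glucose", "hb": "hemoglobin"},
}

# Illustrative plausibility limits only; not clinical reference ranges.
PLAUSIBLE_RANGE = {"glucose": (0.0, 2000.0), "hemoglobin": (0.0, 30.0)}


def normalize(site, record):
    """Map one local record to the shared vocabulary.

    Returns (normalized_record, None) on success or (None, reason) on failure.
    """
    code_map = SITE_CODE_MAPS.get(site)
    if code_map is None:
        return None, f"unknown site {site!r}"
    concept = code_map.get(record.get("code"))
    if concept is None:
        return None, f"unmapped code {record.get('code')!r}"
    value = record.get("value")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None, "non-numeric value"
    low, high = PLAUSIBLE_RANGE[concept]
    if not low <= value <= high:
        return None, "value outside plausible range"
    return {"site": site, "concept": concept, "value": float(value)}, None


def pool(submissions):
    """Split (site, record) pairs into accepted and rejected lists."""
    accepted, rejected = [], []
    for site, record in submissions:
        normalized, reason = normalize(site, record)
        if normalized is None:
            rejected.append((site, record, reason))
        else:
            accepted.append(normalized)
    return accepted, rejected
```

Keeping the rejected records, with a reason for each, rather than silently dropping them is one way to support the error checking that users of a pooled repository would need in order to trust it.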


With the establishment of registries and sur- 
veillance databases, and a robust system of 
Internet integration with EHRs, summary 
information can flow back to providers to 
enhance their decision making at the point of 
care (Fig. 1.6). This assumes standards that
allow such information to be integrated into
the vendor-supplied products that the clinicians
use in their practice settings. These may
be EHRs or their order-entry components
that clinicians use to specify the actions that
they want to have taken for the treatment or
management of their patients (see Chaps.
14 and 16). Furthermore, as is shown in
Fig. 1.6, the databases can help to support
the creation of evidence-based guidelines, or 
clinical research protocols, which can be deliv- 
ered to practitioners through the feedback 
process. Thus one should envision a day when 
clinicians, at the point of care, will receive 
integrated, non-dogmatic, supportive infor- 
mation regarding: 
= Recommended steps for health promotion 
and disease prevention 
= Detection of syndromes or problems, 
either in their community or more widely 
= Trends and patterns of public health
importance, a capability emphasized by
the need for rapidly changing data on cases
and deaths during the COVID-19
pandemic in 2020
= Clinical guidelines, adapted for execution
and integration into patient-specific
decision support rather than simply
provided as text documents
= Opportunities for distributed (community-based)
clinical research, whereby patients
are enrolled in clinical trials and protocol
guidelines are in turn integrated with the
clinicians’ EHR to support protocol-compliant
management of enrolled
patients
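
To make the idea of detecting syndromes or unusual trends from pooled data more concrete, here is a minimal sketch in Python. It flags any day on which the reported case count exceeds the mean of the preceding days by more than a chosen number of standard deviations. The function name, the seven-day baseline, and the two-standard-deviation threshold are assumptions made for illustration; operational surveillance systems use considerably more careful statistical methods.

```python
from statistics import mean, stdev


def flag_unusual_days(daily_counts, baseline_days=7, threshold_sd=2.0):
    """Return indices of days whose count exceeds the baseline limit.

    The limit for each day is the mean of the preceding `baseline_days`
    counts plus `threshold_sd` sample standard deviations of those counts.
    `baseline_days` must be at least 2 so a standard deviation exists.
    """
    if baseline_days < 2:
        raise ValueError("baseline_days must be at least 2")
    flagged = []
    for i in range(baseline_days, len(daily_counts)):
        baseline = daily_counts[i - baseline_days:i]
        limit = mean(baseline) + threshold_sd * stdev(baseline)
        if daily_counts[i] > limit:
            flagged.append(i)
    return flagged
```

A registry could run such a check on counts submitted by many practices and return the flagged days to clinicians as supportive, non-dogmatic information of the kind listed above.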


1.2.2 The Goal: A Learning Health 
System 


We have been stressing the cyclical role of
information: its capture, organization, interpretation,
and ultimate use. You can easily
understand the small cycle that is implied: 
patient-specific data and plans entered into 
an EHR and subsequently made available to 
the same practitioner or others who are 
involved in that patient’s care (Fig. 1.7).
Although this view is a powerful contributor 
to improved data management in the care of 
patients, it fails to include a larger view of the 
societal value of the information that is con- 
tained in clinical-care records. In fact, such 
straightforward use of EHRs for direct 
patient care would not have met some of the
requirements that the US government specified
after 2009 when determining eligibility
for payment of incentives to clinicians or
hospitals who implemented EHRs (see the
discussion of the government HITECH program
in Sect. 1.3).


Fig. 1.7 There is a limited view of the role of EHRs
that sees them as intended largely to support the ongoing
care of the patient whose clinical data are stored in
the record


Fig. 1.8 The ultimate goal is to create a cycle of
information flow, whereby data from local distributed
electronic health records (EHRs) and their associated
clinical datasets are routinely and effortlessly submitted
to registries and research databases. The resulting new
knowledge then can feed back to practitioners at the
point of care, using a variety of computer-supported
decision-support delivery mechanisms. This cycle of
new knowledge, driven by experience, and fed back to
clinicians, has been dubbed a “learning health system”


Consider, instead, an expanded view of
the health surveillance model introduced in
Sect. 1.2.1 (Fig. 1.8). Beginning at the
left of the diagram, clinicians caring for
patients use electronic health records, both to
record their observations and to gain access to
information about the patient. Information
from these records is then stored in local
patient-care clinical databases and forwarded
automatically to regional and national registries
as well as to research databases that can
support retrospective studies (see Chap. 15)
or formal institutional or community-based
clinical trials (see Chap. 27). The analyzed
information from institutional datasets, registries
and research studies can in turn be used
to develop standards for prevention and treatment,
with major guidance from biomedical
research. Researchers can draw information
either directly from the health records or from
the pooled data in registries. The standards
for treatment in turn can be translated into
protocols, guidelines, and educational materi- 
als. This new knowledge and decision-support 
functionality can then be delivered over the 
network back to the clinicians so that the 
information informs patient care, where it is 
integrated seamlessly with EHRs and order- 
entry systems. 
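
The cycle just described, in which local data are pooled and the resulting knowledge is fed back to the point of care, can be sketched in a few lines of Python. The example assumes that each site submits only two aggregate counts for some recommended action; the `SiteSummary` structure, the site names, and the wording of the feedback are hypothetical and serve only to show the direction of information flow.

```python
from dataclasses import dataclass


@dataclass
class SiteSummary:
    site: str
    patients: int         # patients for whom the action was recommended
    received_action: int  # of those, how many actually received it


def pooled_rate(summaries):
    """Fraction of all reported patients who received the action."""
    total = sum(s.patients for s in summaries)
    if total == 0:
        raise ValueError("no patients reported")
    return sum(s.received_action for s in summaries) / total


def feedback(summaries):
    """Build one feedback message per site, comparing local and pooled rates."""
    overall = pooled_rate(summaries)
    messages = {}
    for s in summaries:
        if s.patients == 0:
            messages[s.site] = "no eligible patients reported"
            continue
        local = s.received_action / s.patients
        relation = "below" if local < overall else "at or above"
        messages[s.site] = (f"local rate {local:.0%} is {relation} "
                            f"the pooled rate {overall:.0%}")
    return messages
```

Submitting aggregate counts rather than patient-level records is a design choice made here to keep the sketch simple; a registry that supports research would typically need richer, suitably protected data.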

This notion of a system that allows us to 
learn from what we do, unlocking the experi- 
ence that has traditionally been stored in 
unusable form in paper charts, is gaining wide 
attention now that we can envision an inter- 
connected community of clinicians and insti- 
tutions, building digital data resources using 
EHRs. The concept has been dubbed a learn- 
ing health system and is an ongoing subject 
of study by the National Academy of 
Medicine (Daley 2013), which has published 
a series of reports on the topic.⁵ It is also the
organizing conceptual framework for a
recently created department at the University
of Michigan Medical School⁶ and for a new
scientific journal.⁷


Fig. 1.9 Today the learning health system is increasingly
embracing new forms of massive health-related
data, often from outside the clinical care setting and
derived from population activities that reflect individuals’
health, activities, and attitudes


6 https://medicine.umich.edu/dept/learning-health-sciences
(Accessed 05/03/2020)

7 https://onlinelibrary.wiley.com/journal/23796146
(Accessed 05/03/2020)


Although the learning health system concept
of Fig. 1.8 may at first seem expansive
and all-inclusive, in recent years we have
learned that there are other important inputs
to the health care environment and these can
have important implications for what we learn
by analyzing what both patients and healthy
individuals do. Some of these data sources are
immense and are in line with the recent interest
in “big data” analytics (Fig. 1.9).
Consider, for example, the analysis of huge
datasets associated with full human genome
specifications for individuals and populations.
Another approach for gathering massive
amounts of relevant health-related data is to
monitor the behavior of individuals as they
use online information resources, searching 
for health-related information. Social media 
exchanges (e.g., Twitter, Facebook) have also 
been used to extract health-related informa- 
tion, such as complaints that suggest early 
stages of communicable diseases or expressed 
attitudes towards diseases and treatment. The 
explosive adoption of health monitoring 
devices (e.g., step counters, exercise analyzers, 
cardiac or sleep monitors) has also offered a 
useful source of large-scale information that 
is only beginning to be merged with other 
data in our learning health system. 


1.2.3 Implications of the Internet 
for Patients 


With the penetration of the Internet, patients, 
as well as healthy individuals, have turned to 
the Internet for health information. It is a rare 
North American physician who has not
encountered a patient who comes to an
appointment armed with a question, or a 
stack of printouts, that arose due to medically 
related searches on the net. The companies 
that provide search engines for the Internet 
report that health-related sites are among the 
most popular ones being explored by consum- 
ers. As a result, physicians and other care pro- 
viders have learned that they must be prepared 
to deal with information that patients discover 
on the net and bring with them when they 
seek care from clinicians. Some of the infor- 
mation is timely and excellent; in this sense, 
physicians can often learn about innovations 
from their patients and need to be open to the 
kinds of questions that this enhanced access 
to information will generate from patients in 
their practices. 

On the other hand, much of the health 
information on the Web lacks peer review or is 
purely anecdotal. People who lack medical 
training can be misled by such information, 
just as they have been poorly served in the 
past by printed information in books and 
magazines dealing with fad treatments from 
anecdotal sources. This also creates challenges 
for health care providers, who often feel pres- 
sured to handle more issues in less time due to 
economic pressures. In addition, some sites 
provide personalized advice, sometimes for a 
fee, with all the attendant concerns about the 
quality of the suggestions and the ability to 
give valid advice based on an electronic-mail 
or Web-based interaction. 

In a positive light, communications tech- 
nologies offer clinicians creative ways to inter- 
act with their patients and to provide higher 
quality care. Years ago, medicine adopted the 
telephone as a standard vehicle for facilitating 
patient care, and we now take this kind of 
interaction with patients for granted. If we 
extend the audio channel to include our visual 
sense as well, typically relying on the Internet 
as our communication mechanism, the notion 
of telemedicine emerges (see Chap. 20).
This notion of “medicine at a distance” arose
early in the twentieth century (see Fig. 1.10),
but the technology was too limited for much 
penetration of the idea beyond telephone con- 
versations until the last 30-40 years. The use 
of telemedicine has subsequently grown rapidly,
and there are settings in which it is already
proving to be successful and cost-effective 
(e.g., rural care, international medicine, tele- 
radiology, and video-based care of patients in 
prisons). Similarly, there are now a large num- 
ber of apps (designed for smart phones, tab- 
lets, or desktop machines) that offer 
specialized medical care or advice or assist 
with health data management and communi- 
cation with providers and support groups (see 
Chaps. 11 and 20).


1.2.4 Requirements for Achieving the Vision


Efforts that continue to push the state of the 
art in Internet technology all have significant 
implications for the future of health care 
delivery in general and of EHRs and their 
integration in particular (Shortliffe 1998b, 
2000). But in addition to increasing speed, 
reliability, security, and availability of the 
Internet, there are many other areas that need 
attention if the vision of a learning health sys- 
tem is to be achieved. 


1.2.4.1 Education and Training 


There is a difference between computer liter- 
acy (familiarity with computers and their 
routine uses in our society) and knowledge 
of the role that computing and communica- 
tions technology can and should play in our 
health system. We need to do a better job of 
training future clinicians in the latter area. 
Otherwise we will leave them poorly equipped 
for the challenges and opportunities they will 
face in the rapidly changing practice environ- 
ments that surround them (Shortliffe 2010). 
Not only do they need to feel comfortable 
with the technology itself, but they need to 
understand the profound effect that it has 
had on the practice of medicine—with many 
more changes to come. Medicine, and other 
health professions, are being asked to adapt 
in ways that were not envisioned even a 
decade or two ago. Not all individuals 
embrace such change, but younger clinicians, 
who have grown up with technology in 
almost all aspects of their lives, have high 
expectations for how digital systems and
tools should enhance their professional
experience. What is even more challenging,
perhaps, is that assumptions that they have
made about the field they have entered may
no longer be valid in the coming years, as
some skills are no longer required and new
requirements are viewed as dramatically
different from what health professionals have
had to know in the past.

Fig. 1.10 … television was invented, creative observers were suggesting
how doctors and patients could communicate using
advanced technologies. This 1924 example is from the
cover of a popular magazine and envisions video
enhancements to radio. (Source: “Radio News” 1924)


Furthermore, in addition to the implica- 
tions for education of health professionals 
about computer-related topics, much of the 
future vision we have proposed here can be 
achieved only if educational institutions pro- 
duce a cadre of talented individuals who are 
highly skilled in computing and communica- 
tions technology but also have a deep under- 
standing of the biomedical milieu and of the 
needs of practitioners and other health workers.
Computer science training alone is not
adequate. Fortunately, there are increasing 
numbers of formal training programs in what 
has become known as biomedical informatics 
(see Sect. 1.4) that provide custom-tailored
educational opportunities. Many of the train- 
ees are life science researchers, physicians, 
nurses, pharmacists, and other health profes- 
sionals who see the career opportunities and 
challenges at the intersections of biomedicine, 
information science, computer science, deci- 
sion science, data science, cognitive science, 
and communications technologies. As has 
been clear for three decades (Greenes and 
Shortliffe 1990), however, the demand for 
such individuals far outstrips the supply, both 
for academic and industrial career pathways.⁸,⁹
We need more training programs,¹⁰ expansion
of those that already exist, plus support for 
junior faculty in health science schools who 
may wish to pursue additional training in this 
area. 


1.2.4.2 Organizational and Management Change

Second, as implied above, there needs to be a 
greater understanding among health care 
leaders regarding the role of specialized multi- 
disciplinary expertise in successful clinical 
systems implementation. The health care sys- 
tem provides some of the most complex orga- 
nizational structures in society (Begun et al. 
2003), and it is simplistic to assume that off- 
the-shelf products will be smoothly intro- 
duced into a new institution without major 
analysis, redesign, and cooperative joint-development
efforts. Underinvestment and a
failure to understand the requirements for
process reengineering as part of software
implementation, as well as problems with
technical leadership and planning, account
for many of the frustrating experiences that
health care organizations report in their
efforts to use computers more effectively in
support of patient care and provider
productivity.

8 https://www.hcinnovationgroup.com/policy-value-based-care/staffing-professional-development/news/13024360/report-health-informatics-labor-market-lags-behind-demand-for-workers (Accessed 5/30/2019); https://www.bestvalueschools.com/faq/job-outlook-health-informatics-graduates/ (Accessed 5/30/2019).

9 https://www.burning-glass.com/wp-content/uploads/BG-Health_Informatics_2014.pdf (Accessed 5/30/2019).

10 A directory of some existing training programs is available at http://www.amia.org/education/programs-and-courses (Accessed 5/30/19).

The notion of a learning health system 
described previously is meant to motivate 
your enthusiasm for what lies ahead and to 
suggest the topics that need to be addressed in 
a book such as this one. Essentially all of the 
following chapters touch on some aspect of 
the vision of integrated systems that extend 
beyond single institutions. Before embarking 
on these topics, however, we must emphasize 
two points. First, the cyclical creation of new 
knowledge in a learning health care system 
will become reality only if individual hospi- 
tals, academic medical centers, and national 
coordinating bodies work together to provide 
the standards, infrastructure, and resources 
that are necessary. No individual system 
developer, vendor, or administrator can man- 
date the standards for connectivity, data pool- 
ing, and data sharing implied by a learning 
health care system. A national initiative of 
cooperative planning and implementation for 
computing and communications resources 
within and among institutions and clinics is 
required before practitioners will have routine 
access to the information that they need (see 
Chap. 15). A major federal incentive program
for EHR implementation was a first step
in this direction (see Sect. 1.3). The criteria
that are required for successful EHR imple- 
mentation are sensitive to the need for data 
integration, public-health support, and a 
learning health system. 

Second, although our presentation of the 
learning health system notion has focused on 
the clinician’s view of integrated information 
access, other workers in the field have similar 
needs that can be addressed in similar ways. 
The academic research community has 
already developed and made use of much of 
the technology that needs to be coalesced if 
the clinical user is to have similar access to 
data and information. There is also the 
patient’s view, which must be considered in
the notion of patient-centered health care that 
is now broadly accepted and encouraged 
(Ozkaynak et al. 2013). 


1.3 The US Government Steps In

During the early decades of the evolution of
clinical information systems for use in hospi- 
tals, patient care, and public health, the major 
role of government was in supporting the 
research enterprise as new methods were 
developed, tested, and formally evaluated. 
The topic was seldom mentioned by the 
nation’s leaders, however, even during the 
1990s when the White House was viewed as 
being especially tech savvy. It was accordingly 
remarkable when, in the President’s State of 
the Union address in 2004 (and in each of the 
following years of his administration), 
President Bush called for universal implemen- 
tation of electronic health records within 
10 years. The Secretary of Health and Human 
Services, Tommy Thompson, was similarly 
supportive and, in May 2004, created an entity 
intended to support the expansion of the use 
of EHRs—the Office of the National 
Coordinator for Health Information 
Technology (initially referred to by the full 
acronym ONCHIT, but later shortened sim- 
ply to ONC). 

There was initially limited budget for ONC, 
although the organization served as a conven- 
ing body for EHR-related planning efforts and 
the National Health Information Infrastructure
(see Chaps. 14, 15 and 29). The topic of
EHRs subsequently became a talking point for 
both major candidates during the Presidential 
election in 2008, with strong bipartisan sup- 
port. Then, in early 2009, Congress enacted 
the American Recovery and Reinvestment Act 
(ARRA), also known as the economic “Stimu- 
lus Bill”. One portion of that legislation was 
known as the Health Information Technology 
for Economic and Clinical Health (HITECH) 
Act. It was this portion of the bill that pro- 
vided significant fiscal incentives for health 
systems, hospitals, and providers to implement 
EHRs in their practices and eventual financial 
penalties for lack of implementation. Such 
payments were made available, however, only
when eligible organizations or individual prac- 
titioners implemented EHRs that were “certi- 
fied” as meeting minimal standards and when 
they could document that they were making 
“meaningful use” of those systems. You will 
see references to such certification and mean- 
ingful use criteria in other chapters in this 
volume. 

This volume also offers a discussion of 
HIT policy and the federal government in 
Chap. 29. Although the process of EHR
implementation is approaching completion in 
the US, both in health systems and practices, 
the current status is largely due to this legisla- 
tive program: because of the federal stimulus 
package, large numbers of hospitals, systems, 
and practitioners invested in EHRs and incor- 
porated them into their practices. Furthermore, 
the demand for workers skilled in health infor- 
mation technology grew much more rapidly 
than did the general job market, even within 
health care (Fig. 1.11). It is a remarkable
example of how government policy and invest- 
ment can stimulate major transitions in sys- 
tems such as health care, where many observers 
had previously felt that progress had been 
unacceptably slow (Shortliffe 2005). 


1.4 Defining Biomedical Informatics and Related Disciplines


With the previous sections of this chapter as 
background, let us now consider the scientific 
discipline that is the subject of this volume 
and has led to the development of many of 
the functionalities that need to be brought 
together in the integrated biomedical- 
computing environment of the future. The 
remainder of this chapter deals with biomedi- 
cal informatics as a field and with biomedical 
and health information as a subject of study. 
It provides additional background needed to 
understand many of the subsequent chapters 
in this book. 

Reference to the use of computers in bio- 
medicine evokes different images depending 
on the nature of one’s involvement in the field. 
To a hospital administrator, it might suggest 
the maintenance of clinical-care records using
computers; to a decision scientist, it might 
mean the assistance by computers in disease 
diagnosis; to a basic scientist, it might mean 
the use of computers for maintaining, retriev- 
ing, and analyzing gene-sequencing informa- 
tion. Many physicians immediately think of 
office-practice tools for tasks such as patient 
billing or appointment scheduling, and of 
electronic health record systems for clinical 
documentation. Nurses often think of 
computer-based tools for charting the care 
that they deliver, or decision-support tools 
that assist in applying the most current 
patient-care guidelines. The field includes 
study of all these activities and a great many 
others too. More importantly, it includes the 
consideration of various external factors that 
affect the biomedical setting. Unless you keep 
in mind these surrounding factors, it may be 
difficult to understand how biomedical computing
can help us to tie together the diverse
aspects of health care and its delivery. 

To achieve a unified perspective, we might 
consider four related topics: (1) the concept of 
biomedical information (why it is important 
in biological research and clinical practice and 
why we might want to use computers to pro- 
cess it); (2) the structural features of medicine, 
including all those subtopics to which com- 
puters might be applied; (3) the importance of 
evidence-based knowledge of biomedical and 
health topics, including its derivation and 
proper management and use; and (4) the 
applications of computers and communica- 
tion methods in biomedicine and the scientific 
issues that underlie such efforts. We mention 
the first two topics briefly in this and the next 
chapter, and we provide references in the 
Suggested Readings section for readers who
wish to learn more. The third topic, knowledge
to support effective decision making in
support of human health, is intrinsic to this 
book and occurs in various forms in essentially
every chapter. The fourth topic, however,
is the principal subject of this book.

Computers have captured the imagination
(and attention) of our society. Today’s younger 
individuals have grown up in a world in which 
computers are ubiquitous and useful. Because 
the computer as a machine is exciting, people 
may pay a disproportionate amount of atten- 
tion to it as such—at the expense of consider- 
ing what the computer can do given the 
numbers, concepts, ideas, and cognitive under- 
pinnings of fields such as medicine, health, and 
biomedical research. Computer scientists, phi- 
losophers, psychologists, and other scholars 
increasingly consider such matters as the 
nature of information and knowledge and how 
human beings process such concepts. These 
investigations have been given a sense of time- 
liness (if not urgency) by the simple existence 
of the computer. The cognitive activities of cli- 
nicians in practice probably have received more 
attention over the past three or four decades 
than in all previous history (see Chap. 4).
Again, the existence of the computer and the 
possibilities of its extending a clinician’s cogni- 
tive powers have motivated many of these 
studies. To develop computer-based tools to 
assist with decisions, we must understand 
more clearly such human processes as diagno- 
sis, therapy planning, decision making, and 
problem solving in medicine. We must also 
understand how personal and cultural beliefs 
affect the way in which information is inter- 
preted and decisions are ultimately made. 


1.4.1 Terminology 


Although, starting in the 1960s, a growing 
number of individuals conducting serious 
biomedical research or undertaking clinical 
practice had access to a computer system, 
there was initial uncertainty about what name 
should be used for the biomedical application 
of computer science concepts. The name com- 
puter science was itself new in 1960 and was 
only vaguely defined. Even today, the term
computer science is used more as a matter of 
convention than as an explanation of the 
field’s scientific content. 

In the 1970s we began to use the phrase 
medical computer science to refer to the sub- 
division of computer science that applies the 
methods of the larger field to medical topics. 
As you will see, however, medicine has pro- 
vided a rich area for computer science 
research, and several basic computing insights 
and methodologies have been derived from 
applied medical-computing research. 

The term information science, which is 
occasionally used in conjunction with com- 
puter science, originated in the field of library 
science and is used to refer, somewhat gener- 
ally, to the broad range of issues related to the 
management of both paper-based and elec- 
tronically stored information. Much of what 
information science originally set out to be is 
now drawing evolving interest under the name 
cognitive science. 

Information theory, in contrast, was first 
developed by scientists concerned about the 
physics of communication; it has evolved into 
what may be viewed as a branch of mathemat- 
ics. The results scientists have obtained with 
information theory have illuminated many 
processes in communications technology, but 
they have had little effect on our understand- 
ing of human information processing. 

The terms biomedical computing or bio- 
computation have been used for a number of 
years. They are non-descriptive and neutral, 
implying only that computers are employed 
for some purpose in biology or medicine. 
They are often associated with bioengineering 
applications of computers, however, in which 
the devices are viewed more as tools for a bio- 
engineering application than as a primary 
focus of research. 

In the 1970s, inspired by the French term 
for computer science (informatique), the 
English-speaking community began to use the 
term medical informatics. Those in the field 
were attracted by the word’s emphasis on 
information, which they saw as more central to 
the field than the computer itself, and it gained 
momentum as a term for the discipline, especially
in Europe, during the 1980s. The term is
broader than medical computing (it includes 
such topics as medical statistics, record keep- 
ing, and the study of the nature of medical 
information itself) and deemphasizes the 
computer while focusing instead on the nature 
of the field to which computations are applied. 
Because the term informatics became widely 
accepted in the United States only in the late 
1980s, medical information science was also 
used earlier in North America; this term, 
however, may be confused with library sci- 
ence, and it does not capture the broader 
implications of the European term. As a 
result, the name medical informatics appeared 
by the late 1980s to have become the preferred 
term, even in the United States. Indeed, this is 
the name of the field that we used in the first 
two editions of this textbook (published in 
1990 and 2000), and it is still sometimes used 
in professional, industrial, and academic set- 
tings. However, many observers expressed 
concern that the adjective “medical” is too 
focused on physicians and disease, failing to 
appreciate the relevance of this discipline to 
other health and life-science professionals and 
to health promotion and disease prevention. 
Thus, the term health informatics, or health 
care informatics, gained some popularity, 
even though it has the disadvantage of tending
to exclude applications to biomedical
research (Chaps. 9 and 26) and, as we shall
argue shortly, it tends to focus the field’s name 
on application domains (clinical care, public 
health, and prevention) rather than the basic 
discipline and its broad range of applicability. 

Applications of informatics methods in 
biology and genetics exploded during the 
1990s due to the human genome project¹¹ and
the growing recognition that modern life- 
science research was no longer possible with- 
out computational support and analysis (see 
Chaps. 9 and 26). By the late 1990s, the use
of informatics methods in such work had 
become widely known as bioinformatics and 
the director of the National Institutes of 
Health (NIH) appointed an advisory group 
called the Working Group on Biomedical
Computing. In June 1999, the group provided 
a report¹² recommending that the NIH undertake
an initiative called the Biomedical
Information Science and Technology Initiative 
(BISTI). With the subsequent creation of 
another NIH organization called the 
Bioinformatics Working Group, the visibility 
of informatics applications in biology was 
greatly enhanced. Today bioinformatics is a 
major area of activity at the NIH¹³ and in
many universities and biotechnology compa- 
nies around the world. The explosive growth 
of this field, however, has added to the confu- 
sion regarding the naming conventions we 
have been discussing. In addition, the rela- 
tionship between medical informatics and bio- 
informatics became unclear. As a result, in an 
effort to be more inclusive and to embrace the 
biological applications with which many med- 
ical informatics groups had already been 
involved, the name medical informatics gradu- 
ally gave way to biomedical informatics 
(BMI). Several academic groups have changed 
their names, and a major medical informatics 
journal (Computers and Biomedical Research, 
first published in 1967) was reborn in 2001 as 
The Journal of Biomedical Informatics.¹⁴

Despite this convoluted naming history,
we believe that the broad range of issues in 
biomedical information management does 
require an appropriate name and, beginning 
with the third edition of this book (2006), we 
used the term biomedical informatics for this 
purpose. It has become the most widely 
accepted term for the core discipline and 
should be viewed as encompassing broadly all 
areas of application in health, clinical prac- 
tice, and biomedical research. When we speak 
specifically about computers and their use 
within biomedical informatics activities, we 
use the terms biomedical computer science 
(for the methodologic issues) or biomedical 
computing (to describe the activity itself).
Note, however, that biomedical informatics
has many other component sciences in addition
to computer science. These include the
decision sciences, statistics, cognitive science,
data science, information science, and even
management sciences. We return to this point
shortly when we discuss the basic versus
applied nature of the field when it is viewed as
a basic research discipline.

12 Available at https://acd.od.nih.gov/documents/reports/060399_Biomed_Computing_WG_RPT.htm (Accessed 5/31/2019).

13 See http://www.bisti.nih.gov/. (Accessed 5/31/2019).

14 http://www.journals.elsevier.com/journal-of-biomedical-informatics (Accessed 5/30/19).

Although labels such as these are arbitrary, 
they are by no means insignificant. In the case 
of new fields of endeavor or branches of sci- 
ence, they are important both in designating 
the field and in defining or restricting its con- 
tents. The most distinctive feature of the mod- 
ern computer is the generality of its application. 
The nearly unlimited range of computer uses 
complicates the business of naming the field. 
As a result, the nature of computer science is 
perhaps better illustrated by examples than by 
attempts at formal definition. Much of this 
book presents examples that do just this for 
biomedical informatics as well. 

The American Medical Informatics
Association (AMIA), which was founded in
the late 1980s under the former name for the
discipline, has recognized the confusion
regarding the field and its definition.¹⁵ They
accordingly appointed a working group to
develop a formal definition of the field and to
specify the core competencies that need to be
acquired by students seeking graduate training
in the discipline. The resulting definition,
published in AMIA’s journal and approved by the
full board of the organization, identifies the
focus of the field in a simple sentence and then
adds four clarifying corollaries that refine the
definition and the field’s scope and content
(Box 1.1). We adopt this definition, which is
very similar to the one we offered in previous
editions of this text. It acknowledges that the
emergence of biomedical informatics as a new
discipline is due in large part to rapid advances
in computing and communications technology,
to an increasing awareness that the knowledge
base of biomedicine is essentially
unmanageable by traditional paper-based
methods, and to a growing conviction that the
process of informed decision making is as
important to modern biomedicine as is the
collection of facts on which clinical decisions or
research plans are made.


Box 1.1: Definition of Biomedical
Informatics

Biomedical informatics (BMI) is the
interdisciplinary field that studies and pursues
the effective uses of biomedical data,
information, and knowledge for scientific inquiry,
problem solving, and decision making,
driven by efforts to improve human health.

Scope and breadth of discipline: BMI
investigates and supports reasoning,
modeling, simulation, experimentation, and
translation across the spectrum from
molecules to individuals and to populations,
from biological to social systems, bridging
basic and clinical research and practice
and the health care enterprise.

Theory and methodology: BMI develops,
studies, and applies theories, methods,
and processes for the generation, storage,
retrieval, use, management, and sharing of
biomedical data, information, and knowledge.

Technological approach: BMI builds
on and contributes to computer,
telecommunication, and information sciences and
technologies, emphasizing their application
in biomedicine.

Human and social context: BMI,
recognizing that people are the ultimate users
of biomedical information, draws upon
the social and behavioral sciences to
inform the design and evaluation of
technical solutions, policies, and the evolution
of economic, ethical, social, educational,
and organizational systems.

Reproduced with permission from
(Kulikowski et al. 2012) © Oxford University
Press, 2012.


15 https://www.amia.org/about-amia/science-informatics (Accessed 5/27/19).
1.4.2 Historical Perspective


The modern digital computer grew out of 
developments in the United States and abroad 
during World War II, and general-purpose 
computers began to appear in the marketplace
by the mid-1950s (Fig. 1.12).
Speculation about what might be done with 
such machines (if they should ever become 
reliable) had, however, begun much earlier. 
Scholars, at least as far back as the Middle 
Ages, often had raised the question of whether 
human reasoning might be explained in terms 
of formal or algorithmic processes. Gottfried 
Wilhelm von Leibnitz, a seventeenth-century 
German philosopher and mathematician, 
tried to develop a calculus that could be used 
to simulate human reasoning. The notion of a 
“logic engine” was subsequently worked out 
by Charles Babbage in the mid nineteenth 
century. 

The first practical application of auto- 
matic computing relevant to medicine was 
Herman Hollerith’s development of a 
punched-card data-processing system for the 
1890 U.S. census (Fig. 1.13). His methods
were soon adapted to epidemiologic and pub- 
lic health surveys, initiating the era of electro- 
mechanical punched-card data-processing 
technology, which matured and was widely 
adopted during the 1920s and 1930s. These 
techniques were the precursors of the stored-program
and wholly electronic digital computers,
which began to appear in the late 1940s
(Collen 1995).

Fig. 1.12 The ENIAC. Early computers, such as the
ENIAC, were the precursors of today’s personal computers
(PCs) and handheld calculators. (US Army photo. See
also http://www.computersciencelab.com/ComputerHistory/HistoryPt4.htm (Accessed 5/31/2019))

One early activity in biomedical comput- 
ing was the attempt to construct systems that 
would assist a physician in decision making 
(see Chap. 24). Not all biomedical-computing
programs pursued this course,
however. Many of the early ones instead
investigated the notion of a total hospital
information system (HIS; see Chap. 16).
These projects were perhaps less ambitious in 
that they were more concerned with practical 
applications in the short term; the difficulties 
they encountered, however, were still formi- 
dable. The earliest work on HISs in the United 
States was probably that associated with the 
MEDINET project at General Electric, followed
by work at Bolt, Beranek and Newman in
Cambridge, Massachusetts, and then at the 
Massachusetts General Hospital (MGH) in 
Boston. A number of hospital application 
programs were developed at MGH by Barnett 
and his associates over three decades 
beginning in the early 1960s. Work on similar 
systems was undertaken by Warner at Latter 
Day Saints (LDS) Hospital in Salt Lake City, 
Utah, by Collen at Kaiser Permanente in 
Oakland, California, by Wiederhold at 
Stanford University in Stanford, California,
and by scientists at Lockheed in Sunnyvale,
California.¹⁶

Fig. 1.13 Tabulating machines. The Hollerith Tabulating
Machine was an early data-processing system that
performed automatic computation using punched cards.
(Photograph courtesy of the Library of Congress)

The course of HIS applications bifurcated in 
the 1970s. One approach was based on the con- 
cept of an integrated or monolithic design in 
which a single, large, time-shared computer 
would be used to support an entire collection of 
applications. An alternative was a distributed 
design that favored the separate implementation 
of specific applications on smaller individual 
computers—minicomputers—thereby permit- 
ting the independent evolution of systems in the 
respective application areas. A common 
assumption was the existence of a single shared 
database of patient information. The multi- 
machine model was not practical, however, until 
network technologies permitted rapid and reli- 
able communication among distributed and 
(sometimes) heterogeneous types of machines. 
Such distributed HISs began to appear in the 
1980s (Simborg et al. 1983). 

Biomedical-computing activity broadened 
in scope and accelerated with the appearance 
of the minicomputer in the early 1970s. These 
machines made it possible for individual 
departments or small organizational units to 
acquire their own dedicated computers and to 
develop their own application systems 
(Fig. 1.14). In tandem with the introduction
of general-purpose software tools that
provided standardized facilities to individuals 
with limited computer training (such as the 
UNIX operating system and programming 
environment), the minicomputer put more 
computing power in the hands of more bio- 
medical investigators than did any other sin- 
gle development until the introduction of the 
microprocessor, a central processing unit 
(CPU) contained on one or a few chips 
(Fig. 1.15).

16 The latter system was later taken over and further
developed by the Technicon Corporation (subsequently
TDS Healthcare Systems Corporation).
Later the system was part of the suite of products
available from Eclipsys, Inc. (which in turn was
acquired by Allscripts, Inc. in 2010).

Fig. 1.14 Departmental system. Hospital departments,
such as the clinical laboratory, were able to implement
their own custom-tailored systems when affordable
minicomputers became available. These departments
subsequently used microcomputers to support administrative
and clinical functions. (Copyright 2013 Hewlett-Packard
Development Company, LP. Reproduced from
~1985 original with permission)

Fig. 1.15 Miniature computer. The microprocessor,
or “computer on a chip,” revolutionized the computer
industry in the 1970s. By installing chips in small boxes
and connecting them to a computer terminal, engineers
produced the personal computer (PC)—an innovation
that made it possible for individual users to purchase
their own systems

Everything changed radically in the late
1970s and early 1980s, when the microprocessor
and the personal computer (PC) or microcomputer
became available. Not only could
hospital departments afford minicomputers
but now individuals also could afford microcomputers.
This change enormously broadened
the base of computing in our society and
gave rise to a new software industry. The first 
articles on computers in medicine had appeared 
in clinical journals in the late 1950s, but it was 
not until the late 1970s that the first use of 
computers in advertisements dealing with com- 
puters and aimed at physicians began to appear 
(O Fig. 1.16). Within a few years, a wide range 
of computer-based information-management 
tools were available as commercial products; 
their descriptions began to appear in journals 
alongside the traditional advertisements for 
drugs and other medical products. Today indi- 
vidual physicians find it practical to employ 
PCs in a variety of settings, including for appli- 
cations in patient care or clinical investigation. 

Today we enjoy a wide range of hardware of various sizes, types, prices, and capabilities, all of which will continue to evolve in the decades ahead. The trend of reductions in size and cost of computers with simultaneous increases in power (Fig. 1.17) shows no sign of slowing, although scientists foresee the ultimate physical limitations to the miniaturization of computer circuits.17

Fig. 1.16 Medical advertising. An early advertisement for a portable computer terminal that appeared in general medical journals in the late 1970s. The development of compact, inexpensive peripheral devices and personal computers (PCs) inspired future experiments in marketing directly to clinicians (Reprinted by permission of copyright holder Texas Instruments Incorporated © 1985)

Fig. 1.17 Moore’s Law. Former Intel chairman Gordon Moore is credited with popularizing the “law” that the size and cost of microprocessor chips will halve every 18 months while they double in computing power. This graph shows the exponential growth in the number of transistors that can be integrated on a single microprocessor chip. The trend continues to this day. (Source: Wikipedia: https://en.wikipedia.org/wiki/Transistor_count)

Fig. 1.18 The National Library of Medicine (NLM). The NLM, on the campus of the National Institutes of Health (NIH) in Bethesda, Maryland, is the principal biomedical library for the nation (see Chap. 23). It is also a major source of support for research and training in biomedical informatics, both at NIH and in universities throughout the US. (Photograph courtesy of the National Library of Medicine)
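
The doubling rule quoted in the Moore’s Law caption (Fig. 1.17) can be turned into simple arithmetic. The following is a minimal Python sketch, not a model of any real chip family: the function name `growth_factor` is hypothetical, and the 18-month doubling period is taken from the caption as an assumption rather than a measured constant.

```python
def growth_factor(years: float, doubling_months: float = 18.0) -> float:
    """Return the multiplicative growth after `years` of steady doubling.

    Assumes capacity doubles once every `doubling_months` months (the figure
    quoted in the caption); real hardware has not followed it exactly.
    """
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    doublings = years * 12.0 / doubling_months  # number of doubling periods elapsed
    return 2.0 ** doublings


# Illustrative horizons only: one, two, and ten doubling periods.
for years in (1.5, 3, 15):
    print(years, growth_factor(years))
```

Under this assumption, 15 years corresponds to ten doublings, a factor of 1024, which gives a sense of why the authors expect hardware to keep changing over the decades ahead.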

Progress in biomedical-computing 
research will continue to be tied to the avail- 
ability of funding from either government or 
commercial sources. Because most biomedical- 
computing research is exploratory and is far 
from ready for commercial application, the 
federal government has played a key role in 
funding the work of the last four decades, 
mainly through the NIH and the Agency for Healthcare Research and Quality (AHRQ). The National Library of Medicine (NLM) has assumed a primary role for biomedical informatics, especially with support for basic research in the field (Fig. 1.18). As increasing numbers of applications prove successful
in the commercial marketplace, it is likely that 
more development work will shift to indus- 
trial settings and that university programs will 
focus increasingly on fundamental research 
problems viewed as too speculative for short- 
term commercialization — as has occurred in 
the field of computer science over the past 
several decades. 


17 https://www.sciencedaily.com/releases/2008/01/080112083626.htm; https://arstechnica.com/science/2014/08/are-processors-pushing-up-against-the-limits-of-physics/ (Accessed 5/27/19).


Fig. 1.19 Doctor of the future. By the early 1980s,
advertisements in medical journals (such as this one for 
an antihypertensive agent) began to use computer equip- 
ment as props and even portrayed them in a positive light. 
The suggestion in this photograph seems to be that an 
up-to-date physician feels comfortable using computer- 
based tools in his practice. (Photograph courtesy of ICI 
Pharma, Division of ICI Americas, Inc) 


1.4.3 Relationship to Biomedical 
Science and Clinical Practice 


The exciting accomplishments of biomedical 
informatics, and the implied potential for future 
benefits to medicine, must be viewed in the con- 
text of our society and of the existing health care 
system. As early as 1970, an eminent clinician 
suggested that computers might in time have a 
revolutionary influence on medical care, on 
medical education, and even on the selection cri- 
teria for health-science trainees (Schwartz 1970). 
The subsequent enormous growth in computing 
activity has been met with some trepidation by 
health professionals. They ask where it will all 
end. Will health workers gradually be replaced 
by computers? Will nurses and physicians need 
to be highly trained in computer science or infor- 
matics before they can practice their professions 
effectively? Will both patients and health work- 
ers eventually revolt rather than accept a trend 
toward automation that they believe may 
threaten the traditional humanistic values in 
health care delivery (see Chap. 12) (Shortliffe
1993a)? Will clinicians be viewed as outmoded 
and backward if they do not turn to computa- 
tional tools for assistance with information 
management and decision making (Fig. 1.19)?

Biomedical informatics is intrinsically entwined with the substance of biomedical science. It determines and analyzes the structure of biomedical information and knowledge, whereas biomedical science is constrained by that structure. Biomedical informatics melds the study of data, information, knowledge, decision making, and supporting technologies with analyses of biomedical information and knowledge, thereby addressing specifically the interface between the science of information and knowledge management and biomedical science. To illustrate what we mean by the “structural” features of biomedical information and knowledge, we can contrast the properties of the information and knowledge typical of such fields as physics or engineering with the properties of those typical of biomedicine (see Box 1.2).

Box 1.2: The Nature of Medical Information

This material is adapted from a small portion of a classic book on this topic. It was written by Dr. Scott Blois, who coauthored the introductory chapter to this textbook in its 1st edition, which was published shortly after his death. Dr. Blois was a scholar who directed the informatics program at the University of California San Francisco and served as the first president of the American College of Medical Informatics (ACMI). [Blois, M. S. (1984). Information and medicine: The nature of medical descriptions. Berkeley: University of California Press].

From the material in this chapter, you might conclude that biomedical applications do not raise any unique problems or concerns. On the contrary, the biomedical environment raises several issues that, in interesting ways, are quite distinct from those encountered in most other domains of applied computing. Clinical information seems to be systematically different from the information used in physics, engineering, or even clinical chemistry (which more closely resembles chemical applications generally than it does medical ones). Aspects of biomedical information include an essence of uncertainty (we can never know all about a physiological process), and this results in inevitable variability among individuals. These differences raise special problems and some investigators suggest that biomedical computer science differs from conventional computer science in fundamental ways. We shall explore these differences only briefly here; for details, you can consult Blois’ book on this subject (see Suggested Readings).

Let us examine an instance of what we will 
call a low-level (or readily formalized) science. 
Physics is a natural starting point; in any dis- 
cussion of the hierarchical relationships among 
the sciences (from the fourth-century BC Greek 
philosopher Aristotle to the twentieth-century 
U.S. librarian Melvil Dewey), physics will be 
placed near the bottom. Physics characteristi- 
cally has a certain kind of simplicity, or gener- 
ality. The concepts and descriptions of the 
objects and processes of physics, however, are 
necessarily used in all applied fields, including 
medicine. The laws of physics and the descrip- 
tions of certain kinds of physical processes are 
essential in representing or explaining func- 
tions that we regard as medical in nature. We 
need to know something about molecular phys- 
ics, for example, to understand why water is 
such a good solvent; to explain how nutrient 
molecules are metabolized, we talk about the 
role of electron-transfer reactions. 

Applying a computer (or any formal com- 
putation) to a physical problem in a medical 
context is no different from doing so in a phys- 
ics laboratory or for an engineering applica- 
tion. The use of computers in various low-level 
processes (such as those of physics or chemis- 
try) is similar and is independent of the appli- 
cation. If we are talking about the solvent 
properties of water, it makes no difference 
whether we happen to be working in geology, 
engineering, or medicine. Such low-level processes of physics are particularly receptive to mathematical treatment, so using computers for these applications requires only conventional numerical programming.

In biomedicine, however, there are other 
higher-level processes carried out in more com- 
plex objects such as organisms (one type of 
which is patients). Many of the important 
informational processes are of this kind. 
When we discuss, describe, or record the prop- 
erties or behavior of human beings, we are 
using the descriptions of very high-level objects, whose behavior has no counterpart in physics or in engineering. The person
using computers to analyze the descriptions 
of these high-level objects and processes 
encounters serious difficulties (Blois 1984). 

One might object to this line of argument 
by remarking that, after all, computers are 
used routinely in commercial applications in 
which human beings and situations concern- 
ing them are involved and that relevant com- 
putations are carried out successfully. The 
explanation is that, in these commercial 
applications, the descriptions of human 
beings and their activities have been so highly 
abstracted that the events or processes have 
been reduced to low-level objects. In biomed- 
icine, abstractions carried to this degree 
would be worthless from either a clinical or 
research perspective. 

For example, one instance of a human 
being in the banking business is the customer, 
who may deposit, borrow, withdraw, or invest 
money. To describe commercial activities such 
as these, we need only a few properties; the 
customer can remain an abstract entity. In 
clinical medicine, however, we could not begin 
to deal with a patient represented with such 
skimpy abstractions. We must be prepared to 
analyze most of the complex behaviors that 
human beings display and to describe patients 
as completely as possible. We must deal with 
the rich descriptions occurring at high levels 
in the hierarchy, and we may be hard pressed to encode and process this information using the tools of mathematics and computer science that work so well at low levels. In light of
these remarks, the general enterprise known 
as artificial intelligence (AI) can be aptly 
described as the application of computer sci- 
ence to high-level, real-world problems. 
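
To make the banking contrast above concrete, here is a minimal Python sketch of how little a commercial system needs to know about a person. The class `BankCustomer`, its fields, and its methods are hypothetical names chosen for illustration; they are not drawn from Blois or from any real banking system, and holding amounts as integer cents is simply an assumption that keeps the arithmetic exact.

```python
from dataclasses import dataclass


@dataclass
class BankCustomer:
    """Hypothetical low-level abstraction of a person: an ID and a balance."""

    customer_id: str
    balance_cents: int = 0

    def deposit(self, amount_cents: int) -> None:
        # Only the amount matters; nothing else about the person is recorded.
        if amount_cents <= 0:
            raise ValueError("deposit must be positive")
        self.balance_cents += amount_cents

    def withdraw(self, amount_cents: int) -> None:
        # A crisp rule: the withdrawal is either allowed or it is not.
        if amount_cents <= 0 or amount_cents > self.balance_cents:
            raise ValueError("invalid withdrawal")
        self.balance_cents -= amount_cents


customer = BankCustomer("c-001")
customer.deposit(10_000)
customer.withdraw(2_500)
print(customer.balance_cents)  # 7500
```

Two fields and two operations cover this use case. No comparably small record would capture a patient's history, symptoms, and behavior, which is the difficulty the passage describes.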
Biomedical informatics thus includes com- 
puter applications that range from processing 
of very low-level descriptions, which are little 
different from their counterparts in physics, 
chemistry, or engineering, to processing of 
extremely high-level ones, which are com- 
pletely and systematically different. When we 
study human beings in their entirety (includ- 
ing such aspects as human cognition, self-con- 
sciousness, intentionality, and behavior), we 
must use these high-level descriptions. We will 
find that they raise complex issues to which 
conventional logic and mathematics are less 
readily applicable. In general, the attributes of 
low-level objects appear sharp, crisp, and 
unambiguous (e.g., “length,” “mass”), whereas 
those of high-level ones tend to be soft, fuzzy, 
and inexact (e.g., “unpleasant scent,” “good”). 
Just as we need to develop different meth- 
ods to describe high-level objects, the infer- 
ence methods we use with such objects may 
differ from those we use with low-level ones. 
In formal logic, we begin with the assumption 
that a given proposition must be either true or 
false. This feature is essential because logic is 
concerned with the preservation of truth value 
under various formal transformations. It is 
difficult or impossible, however, to assume 
that all propositions have truth values when 
we deal with the many high-level descriptions 
in medicine or, indeed, in everyday situations. 
Such questions as “Was Woodrow Wilson a 
good president?” cannot be answered with a 
“yes” or “no” (unless we limit the question to 
specific criteria for determining the goodness 
of presidents). Many common questions in 
biomedicine have the same property.

Fig. 1.20 Biomedical informatics as basic science. We view the term biomedical informatics as referring to the basic science discipline in which the development and evaluation of new methods and theories are a primary focus of activity. These core concepts and methods in turn have broad applicability in the health and biomedical sciences. The informatics subfields indicated by the terms across the bottom of this figure are accordingly best viewed as application domains for a common set of concepts and techniques from the field of biomedical informatics. Note that work in biomedical informatics is motivated totally by the application domains that the field is intended to serve (thus the two-headed arrows in the diagram). Therefore the basic research activities in the field generally result from the identification of a problem in the real world of health or biomedicine for which an informatics solution is sought (see text)

Biomedical informatics is perhaps best viewed as a basic biomedical science, with a wide variety of potential areas of application (Fig. 1.20). The analogy with other basic sciences is that biomedical informatics uses the results of past experience to understand, structure, and encode objective and subjective biomedical findings and thus to make them suitable for processing. This approach supports the integration of the findings and their analyses. In turn, the selective distribution of newly created knowledge can aid patient care, health planning, and basic biomedical research.

Biomedical informatics is, by its nature, an experimental science, characterized by posing questions, designing experiments, performing analyses, and using the information gained to design new experiments. One goal is simply to search for new knowledge, called basic research. A second goal is to use this knowledge for practical ends, called applications (applied) research. There is a continuity between these two endeavors (see Fig. 1.20). In biomedical informatics, there is an especially tight coupling between the application areas, broad categories of which are indicated at the bottom of Fig. 1.20, and the identification of basic research tasks that characterize the scientific underpinnings of the field. Research, however, has shown that there can be a very long period of time between the development of new concepts and methods in basic research and their eventual application in the biomedical world (Balas and Boren 2000). Furthermore (see Fig. 1.21), many discoveries are discarded along the way, leaving only a small percentage of basic research discoveries that have a practical influence on the health and care of patients.

Work in biomedical informatics (BMI) is 
inherently motivated by problems encoun- 
tered in a set of applied domains in biomedi- 
cine. The first of these historically has been 
clinical care (including medicine, nursing, 
dentistry, and veterinary care), an area of 
activity that demands patient-oriented infor- 
matics applications. We refer to this area as 
clinical informatics.18 It includes several subtopics and areas of specialized expertise, including patient-care foci such as nursing informatics, dental informatics, and even veterinary informatics. Furthermore, the former name of the discipline, medical informatics, is now reserved for those applied research and practice topics that focus on disease and the role of physicians. As was previously discussed, the term “medical informatics” is no longer used to refer to the discipline as a whole.

Closely tied to clinical informatics is public health informatics (Fig. 1.20), where similar methods are generalized for application to populations of patients rather than to single individuals (see Chap. 18). Thus clinical informatics and public health informatics share many of the same methods and techniques. The closeness of their relationship was amply demonstrated by the explosion in informatics research and applications that occurred in response to the COVID-19 pandemic.19 By mid-2020, several articles had appeared to demonstrate the tight relationship between EHRs and public health informatics for management of the outbreak (Reeves et al. 2020).

Fig. 1.21 Phases in the transfer of research into clinical practice. A synthesis of studies focusing on various phases of this transfer has indicated that it takes an average of 17 years to make innovation part of routine care (Balas and Boren 2000). Pioneering institutions often apply innovations much sooner, sometimes within a few weeks, but nationwide introduction is usually slow. National utilization rates of specific, well-substantiated procedures also suggest a delay of two decades in reaching the majority of eligible patients. For a well-documented study of such delays and their impact in an important area of clinical medicine, see (Krumholz et al. 1998). (Figure courtesy of Dr. Andrew Balas, used with permission)

18 Clinical informatics was approved in 2013 by the American Board of Medical Specialties as a formal subspecialty of medicine (Finnell and Dixon, 2015), with board certification examinations offered for eligible candidates by the American Board of Preventive Medicine (https://www.theabpm.org/become-certified/subspecialties/clinical-informatics/ (Accessed 6/1/19)). AMIA is formulating a similar certification program, AMIA Health Informatics Certification (AHIC) for non-physicians who are working in the clinical informatics area (https://www.amia.org/ahic, Accessed 1/5/2020).

Two other large areas of application overlap in some ways with clinical informatics and public health informatics. The first is imaging informatics, which covers the set of issues developed around both radiology and other image management and image analysis domains such as pathology, dermatology, and molecular visualization (see Chaps. 10 and 22). Finally, there is the burgeoning area of bioinformatics, which at the molecular and cellular levels is offering challenges that draw on many of the same informatics methods as well (see Chaps. 9 and 26).

As is shown in Fig. 1.22, there is a spectrum as one moves from left to right across these BMI application domains. In bioinformatics, workers deal with molecular and cellular processes in the application of informatics methods. At the next level, workers focus on tissues and organs, which tend to be the emphasis of imaging informatics work (also called structural informatics by some investigators). Progressing to clinical informatics, the focus is on individual patients, and finally to public health, where researchers address problems of populations and of society, including prevention. The core science of biomedical informatics has important contributions to make across that entire spectrum, and many informatics methods are broadly applicable across the full range of domains.

Note from Fig. 1.20 that biomedical informatics and bioinformatics are not synonyms, and it is incorrect to refer to the scientific discipline as bioinformatics, which is, rather, an important area of application of BMI methods and concepts. Similarly, the term health informatics, which refers to applied research and practice in clinical and public-health informatics, is also not an appropriate name for the core discipline, since BMI is applicable to basic human biology as well as to health.

Fig. 1.22 Building on the concepts of Fig. 1.20, this diagram demonstrates the breadth of the biomedical informatics field. It shows the relationship between biomedical informatics as a core scientific discipline and its diverse array of application domains, which span biological science, imaging, clinical practice, public health, and others not illustrated (see text). Note that “health informatics” is the term used to refer to applied research and practice in clinical and public health informatics. It is not a synonym for the underlying discipline, which is “biomedical informatics”

19 https://www.amia.org/COVID19 (Accessed 05/03/2020)

We acknowledge that the four major areas of application shown in Fig. 1.20 have “fuzzy” boundaries, and many areas of applied informatics research involve more than one of the categories. For example, biomolecular imaging involves both bioinformatics and imaging informatics concepts. Similarly, personal or consumer health informatics (see Chap. 11) includes elements of both clinical informatics and public-health informatics. Another important area of BMI research activities is pharmacogenomics (see Chap. 27), which is the effort to infer genetic determinants of human drug response. Such work requires the analysis of linked genotypic and phenotypic databases, and therefore lies at the intersection of bioinformatics and clinical informatics. Similarly, Chap. 28 presents the role of informatics in precision medicine, which relies heavily on both bioinformatics and clinical informatics concepts and systems.

Precision medicine is a product of the increasing emphasis on moving both data and concepts from basic science research into clinical science and ultimately into practice. Such efforts are typically characterized as translational science, a topic that has attracted major investments by the US National Institutes of Health (NIH) over the past two decades. Informatics scientists are engaged as collaborators in this translational work, which spans all four major categories of application shown in Fig. 1.20, pursuing work in translational bioinformatics (Chap. 26) and clinical research informatics (Chap. 27).20 Accordingly, informatics was defined as a major component of the Clinical and Translational Science Awards (CTSA) Program,21 supported by the National Center for Advancing Translational Sciences (NCATS) at the NIH. AMIA sponsors an annual weeklong conference, known as the Informatics Summit, that presents new research results and applications in these areas.22 The interactions among bioscience, clinical science, and informatics can be nicely captured by recognizing how informatics fields and translational science relate to one another (Fig. 1.23).

Fig. 1.23 A Venn diagram that depicts the relationships among the three major disciplines: biological research, clinical medicine / public health, and biomedical informatics. Bioinformatics, Health Informatics, and Translational Research lie at the intersections among pairs of these fields as shown. Precision Medicine, which relies on Translational Bioinformatics and Clinical Research Informatics, constitutes the area of common overlap among all three Venn circles. (Adapted with permission from a diagram developed by the Department of Biomedical Informatics at the Vanderbilt Medical Center, Nashville, TN)

20 See also the diagram in (Kulikowski et al. 2012), which shows how these two disciplines span all areas of applied biomedical informatics.

21 https://ncats.nih.gov/ctsa (Accessed 6/2/2019).

In general, BMI researchers derive their 
inspiration from one or two, rather than all, 
of the application areas, identifying funda- 
mental methodologic issues that need to be 
addressed and testing them in system proto- 
types or, for more mature methods, in actual 
systems that are used in clinical or biomedical 
research settings. One important implication 
of this viewpoint is that the core discipline is 
identical, regardless of the area of application 
that a given individual is motivated to address, 
although some BMI methods have greater rel- 
evance to some domains than to others. This 
argues for unified BMI educational programs, 
ones that bring together students with a wide 
variety of application interests. Elective 
courses and internships in areas of specific interest are of course important complements to the core exposures that students should receive (Kulikowski et al. 2012), but, given the need for teamwork and understanding in the field, separating trainees based on the application areas that may interest them would be counterproductive and wasteful.23

22 https://www.amia.org/meetings-and-events (Accessed 6/2/2019)

The scientific contributions of BMI also 
can be appreciated through their potential for 
benefiting the education of health profession- 
als (Shortliffe 2010). For example, in the edu- 
cation of medical students, the various 
cognitive activities of physicians traditionally 
have tended to be considered separately and in 
isolation—they have been largely treated as 
though they are independent and distinct 
modules of performance. One activity fre- 
quently emphasized is formal education 
regarding medical decision making (see 
Chap. 3). The specific content of this area continues to evolve, but the discipline’s dependence on formal methods regarding the use of knowledge and information reveals that it is one aspect of biomedical informatics.

A particular topic in the study of medical 
decision making is diagnosis, which is often 
conceived and taught as though it were a free- 
standing and independent activity. Medical 
students may thus be led to view diagnosis as 
a process that physicians carry out in isolation 
before choosing therapy for a patient or pro- 
ceeding to other modular tasks. A number of 
studies have shown that this model is oversim- 
plified and that such a decomposition of cog- 
nitive tasks may be quite misleading (Elstein 
et al. 1978a; Patel and Groen 1986). Physicians 
seem to deal with several tasks at the same time. Although a diagnosis may be one of the first things physicians think about when they see a new patient, patient assessment (diagnosis, management, analysis of treatment results, monitoring of disease progression, etc.) is a process that never really terminates. A physician must be flexible and open-minded. It is generally appropriate to alter the original diagnosis if it turns out that treatment based on it is unsuccessful or if new information weakens the evidence supporting the diagnosis or suggests a second and concurrent disorder. Chapter 4 discusses these issues in greater detail.

23 Many current biomedical informatics training programs were designed with this perspective in mind. Students with interests in clinical, imaging, public health, and biologic applications are often trained together and are required to learn something about each of the other application areas, even while specializing in one subarea for their own research. Several such programs were described in a series of articles in the Journal of Biomedical Informatics in 2007 (Tarczy-Hornoch et al. 2007) and many more have been added since that time.

When we speak of making a diagnosis, 
choosing a treatment, managing therapy, 
making decisions, monitoring a patient, or 
preventing disease, we are using labels for dif- 
ferent aspects of medical care, an entity that 
has overall unity. The fabric of medical care is 
a continuum in which these elements are 
tightly interwoven. Regardless of whether we 
view computer and information science as a 
profession, a technology, or a science, there is 
no doubt about its importance to biomedi- 
cine. We can assume computers are here to 
stay as fundamental tools to be used in clinical practice, biomedical research, and health science education.


1.4.4 Relationship to Computer 
Science 


During its evolution as an academic entity in 
universities, computer science followed an 
unsettled course as involved faculty attempted 
to identify key topics in the field and to find 
the discipline’s organizational place. Many 
computer science programs were located in 
departments of electrical engineering, because 
major concerns of their researchers were com- 
puter architecture and design and the devel- 
opment of practical hardware components. 
At the same time, computer scientists were 
interested in programming languages and 
software, undertakings not particularly char- 
acteristic of engineering. Furthermore, their 
work with algorithm design, computability theory,24 and other theoretical topics seemed more related to mathematics.

Biomedical informatics draws from all of 
these activities—development of hardware, 
software, and computer science theory. 
Biomedical computing generally has not had 
a large enough market to influence the course 
of major hardware developments; i.e., com- 
puters serve general purposes and have not 
been developed specifically for biomedical 
applications. Not since the early 1960s (when 
health-computing experts occasionally talked 
about and, in a few instances, developed spe- 
cial medical terminals) have people assumed 
that biomedical applications would use hard- 
ware other than that designed for general use. 

The question of whether biomedical appli- 
cations would require specialized program- 
ming languages might have been answered 
affirmatively in the 1970s by anyone examin- 
ing the MGH Utility Multi-Programming 
System, known as the MUMPS language 
(Greenes et al. 1970; Bowie and Barnett 1976), 
which was specially developed for use in med- 
ical applications. For several years, MUMPS 
was the most widely used language for medi- 
cal record processing. Under its subsequent 
name, M, it is still in widespread use and has 
been used to develop commercial electronic 
health record systems. New implementations 
have been developed for each generation of 
computers. M, however, like any program- 
ming language, is not equally useful for all 
computing tasks. In addition, the software 
requirements of medicine are better under- 
stood and no longer appear to be unique; 
rather, they are specific to the kind of task. A 
program for scientific computation looks 
pretty much the same whether it is designed 
for chemical engineering or for pharmacoki- 
netic calculations. 

How, then, does BMI differ from biomedi- 
cal computer science? Is the new discipline 


24 Many interesting problems cannot be computed in a 
reasonable time and require heuristics. Computabil- 
ity theory is the foundation for assessing the feasi- 
bility and cost of computation to provide the 
complete and correct results to a formally stated 
problem. 
simply the study of computer science with a
“biomedical flavor”? If you return to the defi-
nition of biomedical informatics that we pro-
vided in Box 1.1, and then refer to
Fig. 1.20, you will begin to see why bio-
medical informatics is more than simply the
biomedical application of computer science.²⁵
The issues that it addresses not only have 
broad relevance to health, medicine, and biol- 
ogy, but the underlying sciences on which 
BMI professionals draw are inherently inter- 
disciplinary as well (and are not limited to 
computer science topics). Thus, for example, 
successful BMI research will often draw on, 
and contribute to, computer science, but it 
may also be closely related to the decision sci- 
ences (probability theory, decision analysis, or 
the psychology of human problem solving), 
cognitive science, information sciences, or the 
management sciences (Fig. 1.24).
Furthermore, a biomedical informatics 
researcher will be tightly linked to some 
underlying problem from the real world of 
health or biomedicine. As Fig. 1.24 illus-
trates, for example, a biomedical informatics
basic researcher or doctoral student will typi- 
cally be motivated by one of the application 
areas, such as those shown at the bottom of 
Fig. 1.22, but a dissertation worthy of a
PhD in the field will usually be identified by a 
generalizable scientific result that also con- 
tributes to one of the component disciplines 
(Fig. 1.20) and on which other scientists
can build in the future. 


25 In fact, the multidisciplinary nature of biomedical 
informatics has led the informatics term to be bor- 
rowed in other disciplines, including computer sci- 
ence organizations, even though the English name 
for the field was first adopted in the biomedical con- 
text. Today we even have generic full departments of 
informatics in the US (e.g., see https://informatics.njit.edu,
Accessed 11/28/2020) and in other parts
of the world as well (e.g., http://www.sussex.ac.uk/informatics/.
Accessed 1/5/2020). In the US,
there are full schools with informatics in their title
(e.g., https://luddy.indiana.edu/index.html.
Accessed 1/5/2020) and even a School of Biomedical
Informatics (https://sbmi.uth.edu/. Accessed
1/2/2020).
Fig. 1.24 Component sciences in biomedical infor-
matics. An informatics application area is motivated by
the needs of its associated biomedical domain, to which
it attempts to contribute solutions to problems. Thus
any applied informatics work draws upon a biomedical
domain for its inspiration, and in turn often leads to the
delineation of basic research challenges in biomedical
informatics that must be tackled if the applied biomedi-
cal domain is ultimately to benefit. At the methodologic
level, biomedical informatics draws on, and contributes
to, a wide variety of component disciplines, of which
computer science is only one. As Figs. 1.20 and 1.22
show explicitly, biomedical informatics is inherently
multidisciplinary, both in its areas of application and in
the component sciences on which it draws


1.4.5 Relationship to Biomedical Engineering


BMI is a relatively young discipline, whereas
biomedical engineering (BME) is older and
well-established. Many engineering and medi-
cal schools have formal academic programs in
BME, often with departmental status and
full-time faculty. Only in the last two or three
decades has this begun to be true of biomedi-
cal informatics academic units. How does bio-
medical informatics relate to biomedical
engineering, especially in an era when engi-
neering and computer science are increasingly
intertwined?

Biomedical engineering departments
emerged in the late 1960s, when technology
began to play an increasingly prominent role
in medical practice.²⁶ The emphasis in such
departments has tended to be research on,
and development of, instrumentation (e.g., as
discussed in Chaps. 21 and 22, advanced
monitoring systems, specialized transducers
for clinical or laboratory use, and imaging
methods and enhancement techniques for use
in radiology), with an orientation toward the


26 By the late 1960s the first BME departments were 
formed in the US at the University of Virginia, Case 
Western Reserve University, Johns Hopkins Univer-
sity, and Duke University (see https://navigate.aimbe.org/why-bioengineering/history/,
Accessed 6/2/2019). Duke’s undergraduate degree program in
BME was the first to be accredited by the Engineer-
ing Council for Professional Development (Septem-
ber 1972).
development of medical devices, prostheses,
and specialized research tools. There is also a 
major emphasis on tissue engineering and 
related wet-bench research efforts. In recent 
years, computing techniques have been used 
both in the design and construction of medi- 
cal devices and in the medical devices them- 
selves. For example, the “smart” devices 
increasingly found in most medical specialties 
are all dependent on computational technol- 
ogy. Intensive care monitors that generate 
blood pressure records while calculating mean 
values and hourly summaries are examples of 
such “intelligent” devices. 

The overlap between biomedical engineer- 
ing and BMI suggests that it would be unwise 
for us to draw compulsively strict boundaries 
between the two fields. There are ample 
opportunities for interaction, and there are 
chapters in this book that clearly overlap with 
biomedical engineering topics (e.g., Chap.
21 on patient-monitoring systems and
Chap. 22 on radiology systems). Even where
they meet, however, the fields have differences 
in emphasis that can help you to understand 
their different evolutionary histories. In bio- 
medical engineering, the emphasis is on medi- 
cal devices and underlying methods; in BMI, 
the emphasis is on biomedical information 
and knowledge and on their management 
with the use of computers. In both fields, the 
computer is secondary, although both use 
computing technology. The emphasis in this 
book is on the informatics end of the spec- 
trum of biomedical computer science, so we 
shall not spend much time examining biomed- 
ical engineering topics. 


1.5 Integrating Biomedical Informatics and Clinical Practice


It should be clear from the material in this 
chapter that biomedical informatics is a 
remarkably broad and complex topic. We 
have argued that information management is 
intrinsic to both life-science research and clin- 
ical practice and that, in biomedical settings 
over a half century, the use of computers to
aid in information management has grown 
from a futuristic notion to an everyday occur- 
rence. In fact, the EHR and other information 
technology tools may now be the only kind of 
equipment that is used by every single health 
care professional, regardless of specialty or 
professional title. In this chapter and through- 
out the book, we emphasize the myriad ways 
in which computers are used in biomedicine 
to ease the burdens of information manage- 
ment and the means by which new technology 
is changing the delivery of health care. The 
degree to which such changes are positively 
realized, and their rate of occurrence, are 
being determined in part by external forces 
that influence the costs of developing and 
implementing biomedical applications and 
the ability of scientists, clinicians, patients, 
and the health care system to accrue the 
potential benefits. 

We can summarize several global forces 
that are affecting biomedical computing and 
that will continue to influence the extent to 
which computers are assimilated into clinical 
practice: (1) new developments in communi- 
cations plus computer hardware and software; 
(2) a further increase in the number of indi- 
viduals who have been trained in both medi- 
cine, or another health profession, and in 
BMI; and (3) ongoing changes in health care 
financing designed to control the rate of 
growth of health-related expenditures. 

We touched on the first of these factors in
Sect. 1.4.2, when we described the histori-
cal development of biomedical computing
and the trend from mainframe computers, to 
microcomputers and PCs, and to the mobile 
devices of today. The future view outlined in
Sect. 1.1 similarly builds on the influence
that the Internet has provided throughout 
society during the last decade. Hardware 
improvements have made powerful computers 
inexpensive and thus available to hospitals, to 
departments within hospitals, and even to 
individual physicians. The broad selection of 
computers of all sizes, prices, and capabilities 
makes computer applications both attractive 
and accessible. Technological advances in 
information storage devices,²⁷ including the
movement of files to the “cloud”, are facilitat- 
ing the inexpensive storage of large amounts 
of data, thus improving the feasibility of data-
intensive applications, such as drawing infer-
ences from human genome datasets (see
Chaps. 9, 26, and 28) and the all-digital
radiology department (Chap. 22).
Standardization of hardware and advances in 
network technology are making it easier to 
share data and to integrate related 
information-management functions within a 
hospital or other health care organization, 
although inadequacies in standards for encod-
ing and sharing data continue to be challeng-
ing (Chaps. 7, 14, 15, and 16).

The second factor is the frustratingly slow 
increase in the number of professionals who 
are being trained to understand the biomedical 
issues as well as the technical and engineering 
ones. Computer scientists who understand bio- 
medicine are better able to design systems 
responsive to actual needs and sensitive to 
workflow and the clinical culture. Health pro- 
fessionals who receive formal training in BMI 
are likely to build systems using well-estab- 
lished techniques while avoiding the past mis- 
takes of other developers. As more professionals 
are trained in the special aspects of both fields, 
and as the programs they develop are intro- 
duced, health care professionals are more likely 
to have useful and usable systems available 
when they turn to the computer for help with 
information management tasks. 

The third factor affecting the integration 
of computing technologies into health care 
settings is our evolving health care system and 
the increasing pressure to control medical 
spending. The escalating tendency to apply 
technology to all patient-care tasks is a fre- 
quently cited phenomenon in modern medical 
practice. Mere physical findings no longer are 


27 Technological progress in this area is occurring at a 
dizzying rate. Consider, for example, the announce- 
ment that scientists are advancing the notion of 
using DNA for data storage and can store as much 
as 704 terabytes of information in a gram of DNA.
http://www.engadget.com/2012/08/19/harvard-stores-704tb-in-a-gram-of-dna;
https://homes.cs.washington.edu/~bornholt/dnastorage-asplos16/
(Accessed 5/30/19).


considered adequate for making diagnoses 
and planning treatments. In fact, medical stu- 
dents who are taught by more experienced 
physicians to find subtle diagnostic signs by 
examining various parts of the body nonethe- 
less often choose to bypass or deemphasize 
physical examinations in favor of ordering 
one test after another. Sometimes, they do so 
without paying sufficient attention to the 
ensuing cost. Some new technologies replace 
less expensive, but technologically inferior, 
tests. In such cases, the use of the more expen- 
sive approach is generally justified. 
Occasionally, computer-related technologies 
have allowed us to perform tasks that previ- 
ously were not possible. For example, the 
scans produced with computed tomography 
or magnetic resonance imaging (see Chaps.
10 and 22) have allowed physicians to visual- 
ize cross-sectional slices of the body, and 
medical instruments in intensive care units 
perform continuous monitoring of patients’ 
body functions that previously could be 
checked only episodically (see Chap. 21).
The development of expensive new tech- 
nologies, and the belief that more technology 
is better, have helped to fuel rapidly escalating 
health care costs. In the 1970s and 1980s, such 
rising costs led to the introduction of man- 
aged care and capitation—changes in financ- 
ing and delivery that were designed to curb 
spending. Today we are seeing a trend toward 
value-based reimbursement, which is predi- 
cated on the notion that payment for care of 
patients should be based on the demonstrated 
value received (as defined by high quality at 
low cost) rather than simply the existence of 
an encounter or procedure. Integrated com- 
puter systems can provide the means to cap- 
ture data to help assess such value, while they 
also support detailed cost accounting, the 
analysis of the relationship of costs of care to 
the benefits of that care, evaluation of the 
quality of care provided, and identification of 
areas of inefficiency. Systems that improve the 
quality of care while reducing the cost of pro- 
viding that care clearly will be favored. The 
effect of cost containment pressures on tech- 
nologies that increase the cost of care while 
improving the quality is less clear. Medical
technologies, including computers, will be 
embraced only if they improve the delivery of
clinical care while either reducing costs or pro- 
viding benefits that clearly exceed their costs. 

Designers of medical systems must address 
satisfactorily many logistical and engineering 
questions before innovative solutions are inte- 
grated optimally into medical practice. For 
example, are the machines conveniently 
located? Should mobile devices further replace 
tethered workstations? Can users complete 
their tasks without excessive delays? Is the sys- 
tem reliable enough to avoid loss of data? Can 
users interact easily and intuitively with the 
computer? Does it facilitate rather than dis- 
rupt workflow? Are patient data secure and 
appropriately protected from prying eyes? In 
addition, cost-control pressures produce a 
growing reluctance to embrace expensive tech- 
nologies that add to the high cost of health 
care. The net effect of these opposing trends is 
in large part determining the degree to which 
specific systems are embraced and effectively 
implemented in the health care environment. 

In summary, rapid advances in communi- 
cations, computer hardware, and software, 
coupled with an increasing computer literacy 
of health care professionals and researchers, 
favor the implementation of effective com- 
puter applications in clinical practice, public 
health, and life sciences research. Furthermore, 
in the increasingly competitive health care 
industry, providers have a greater need for the 
information management capabilities sup- 
plied by computer systems. The challenge is to 
demonstrate in persuasive and rigorous ways 
the financial and clinical advantages of these 
systems (see Chap. 13).


Suggested Readings

Blois, M. S. (1984b). Information and medicine: 
The nature of medical descriptions. Berkeley: 
University of California Press. In this classic 
volume, the author analyzes the structure of 
medical knowledge in terms of a hierarchical 
model of information. He explores the ideas 
of high- and low-level sciences and suggests 
that the nature of medical descriptions 
accounts for difficulties in applying comput- 
ing technology to medicine. A brief summary 
of key elements in this book is included as Box 
1.2 in this chapter. 



Coiera, E. (2015). Guide to health informatics (3rd 
ed.). Boca Raton, FL: CRC Press. This intro- 
ductory text is a readable summary of clinical 
and public health informatics, aimed at mak- 
ing the domain accessible and understandable 
to the non-specialist. 

Collen, M. F., & Ball, M. J. (Eds.). (2015). A his- 
tory of medical informatics in the United States 
(2nd ed.). London: Springer. This comprehen- 
sive book traces the history of the field of 
medical informatics, and identifies the origins 
of the discipline’s name (which first appeared 
in the English-language literature in 1974). 
The original (1995) edition was being updated 
by Dr. Collen when he passed away shortly 
after his 100th birthday. Dr. Ball organized an 
effort to complete the 2nd edition, enlisting 
participation by many leaders in the field. 

Elstein, A. S., Shulman, L. S., & Sprafka, S. A. 
(1978b). Medical problem solving: An analysis 
of clinical reasoning. Cambridge, MA: Harvard 
University Press. This classic collection of 
papers describes detailed studies that have illu- 
minated several aspects of the ways in which 
expert and novice physicians solve medical 
problems. The seminal work described remains 
highly relevant to today’s work on problem 
solving and clinical decision support systems. 

Friedman, C. P., Altman, R. B., Kohane, I. S., 
McCormick, K. A., Miller, P. L., Ozbolt, 
J. G., Shortliffe, E. H., Stormo, G. D., 
Szczepaniak, M. C., Tuck, D., & Williamson, 
J. (2004). Training the next generation of 
informaticians: The impact of BISTI and bio-
informatics. Journal of the American Medical
Informatics Association, 11, 167-172. This
important analysis addresses the changing 
nature of biomedical informatics due to the 
revolution in bioinformatics and computa- 
tional biology. Implications for training, as 
well as organization of academic groups and 
curriculum development, are discussed. 

Hoyt, R. E., & Hersh, W. R. (2018). Health infor-
matics: Practical guide (7th ed.). Raleigh: Lulu.com.
This introductory volume provides a
broad view of informatics and is aimed espe- 
cially at health professionals in management 
roles or IT professionals who are entering the 
clinical world. 

Institute of Medicine²⁸. (1991 [revised 1997]). The
computer-based patient record: An essential 



E. H. Shortliffe and M. F. Chiang 


technology for health care. Washington, DC: 
National Academy Press. National Research 
Council (1997). For The Record: Protecting 
Electronic Health Information. Washington, 
DC: National Academy Press. National 
Research Council (2000). Networking Health: 
Prescriptions for the Internet. Washington, 
DC: National Academy Press. This set of three 
reports from branches of the US National 
Academies of Science has had a major influ- 
ence on health information technology educa- 
tion and policy over the last 25 years. 


Institute of Medicine²⁸. (2000). To err is human:
Building a safer health system. Washington,
DC: National Academy Press. Institute of
Medicine (2001). Crossing the Quality Chasm:
A New Health System for the 21st Century.
Washington, DC: National Academy Press. 
Institute of Medicine (2004). Patient Safety: 
Achieving a New Standard for Care. 
Washington, DC: National Academy Press. 
This series of three reports from the Institute 
of Medicine has outlined the crucial link 
between heightened use of information tech- 
nology and the enhancement of quality and 
reduction in errors in clinical practice. Major 
programs in patient safety have resulted from 
these reports, and they have provided motiva- 
tion for a heightened interest in health care 
information technology among policy makers, 
provider organizations, and even patients. 


Kalet, I. J. (2013). Principles of biomedical infor-
matics (2nd ed.). New York: Academic. This
volume provides a technical introduction to 
the core methods in BMI, dealing with stor- 
age, retrieval, display, and use of biomedical 
data for biological problem solving and medi- 
cal decision making. Application examples are 
drawn from bioinformatics, clinical informat- 
ics, and public health informatics. 


National Academy of Medicine. (2018). Procuring
interoperability: Achieving high-quality, con-
nected, and person-centered care. Washington,
DC: National Academy Press. National 
Academy of Medicine (2019). Artificial 
Intelligence in Health Care: The Hope, the 
Hype, the Promise, the Peril. Washington, 
DC: National Academy Press. This series of 
two reports from the National Academy of 
Medicine outlines emerging issues in biomedi-
cal informatics: interoperability (which is dis-
cussed in greater detail in Chapter 8), and
artificial intelligence (which is discussed in
many chapters throughout this volume).


National Academy of Medicine. (2019). Taking
action against clinician burnout: A systems
approach to professional well-being. Washington, 
DC: National Academy Press. This consensus 
study from the National Academy of Medicine 
discusses the problem of clinician burnout in 
the United States, including areas where health 
care information technology may contribute to or
reduce these problems.


Shortliffe, E. (1993b). Doctors, patients, and com-
puters: Will information technology dehuman-
ize health care delivery? Proceedings of the
American Philosophical Society, 137(3), 390-
398. In this paper, the author examines the fre-
quently expressed concern that the introduction
of computing technology into health care set- 
tings will disrupt the development of rapport 
between clinicians and patients and thereby 
dehumanize the therapeutic process. He argues, 
rather, that computers may eventually have pre- 
cisely the opposite effect on the relationship 
between clinicians and their patients. 


Questions for Discussion


1. How do you interpret the phrase 
“logical behavior”? Do computers 
behave logically? Do people behave 
logically? Explain your answers. 

2. What do you think it means to say that 
a computer program is “effective”? 
Make a list of a dozen computer appli- 
cations with which you are familiar. List 
the applications in decreasing order of 
effectiveness, as you have explained this 
concept. Then, for each application, 
indicate your estimate of how well 
human beings perform the same tasks 
(this will require that you determine 
what it means for a human being to be 
effective). Do you discern any pattern? 
If so, how do you interpret it? 

3. Discuss three society-wide factors that 
will determine the extent to which 
computers are assimilated into clinical 
practice. 

4. Reread the future vision presented in
Sect. 1.1. Describe the characteris-
tics of an integrated environment for
managing clinical information. Discuss
two ways (either positive or negative)
in which such a system could change
clinical practice.

5. Do you believe that improving the tech-
nical quality of health care entails the
risk of dehumanization? If so, is it
worth the risk? Explain your reasoning.

6. Consider Fig. 1.20, which shows
that bioinformatics, imaging informat-
ics, clinical informatics, and public
health informatics are all application
domains of the biomedical informatics
discipline because they share the same
core methods and theories:

(a) Briefly describe two examples of 
core biomedical informatics 
methods or theories that can be 
applied both to bioinformatics 
and clinical informatics. 

(b) Imagine that you describe
Fig. 1.20 to a mathematics fac-
ulty member, who responds that
“in that case, I’d also argue that
statistics, computer science, and 
physics are all application domains 
of math because they share the 
same core mathematical methods 
and theories.” In your opinion, is 
this a legitimate argument? In 
what ways is this situation similar 
to, and different from, the case of 
biomedical informatics? 

(c) Why is biomedical informatics not 
simply computer science applied to 
biomedicine, or to the practice of 
medicine, using computers? 

(d) How would you describe the 
relevance of psychology and 
cognitive science to the field of 
biomedical informatics? (Hint: See 
Fig. 1.24)

7. In 2000, a major report by the Institute
of Medicine²⁸ entitled “To Err is
Human: Building a Safer Health
System” (see Suggested Readings) stated
that up to 98,000 patient deaths were
being caused by preventable medical
errors in American hospitals each year.

(a) It has been suggested that effective 
electronic health record (EHR) 
systems should mitigate this
problem. What are three specific 
ways in which they could be 
reducing the number of adverse 
events in hospitals? 

(b) Are there ways in which computer- 
based systems could increase the 
incidence of medical errors? 
Explain. 

(c) Describe a practical experiment that 
could be used to examine the impact 
of an EHR system on patient safety. 
In other words, propose a study 
design that would address whether 
the computer-based system increases 
or decreases the incidence of pre- 
ventable adverse events in hospi- 
tals — and by how much. 

(d) What are the limitations of the 
experimental design you proposed 
in (c)? 


8. It has been argued that the ability to 
capture “nuance” in the description of 
what a clinician has seen when 
examining or interviewing a patient 
may not be as crucial as some people 
think. The desire to be able to express 
one’s thoughts in an unfettered way 
(free text) is often used to argue against 
the use of structured data-entry 
methods using a controlled vocabulary 
and picking descriptors from lists 
when recording information in an 
EHR. 

(a) What is your own view of this 
argument? Do you believe that it is 
important to the quality and/or 
efficiency of care for clinicians to 
be able to record their observations, 
at least part of the time, using free 
text/natural language? 

(b) Many clinicians have been 
unwilling to use an EHR system 
requiring structured data entry 


28 The Institute of Medicine (IOM), part of the former
National Academy of Sciences (NAS), was reorga-
nized in 2015 to become the National Academy of
Medicine (NAM). The NAS is now known as the
National Academies of Sciences, Engineering, and
Medicine (NASEM).
because of the increased time
required for documentation at the 
point of care and constraints on 
what can be expressed. What are 
two strategies that could be used 
to address this problem (other 
than “designing a better user 
interface for the system”)? 
Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• What are clinical data?
• How are clinical data used?
• What are the advantages and disadvan-
  tages of traditional paper medical records
  vs. electronic health records?
• What is the role of the computer in data
  storage, retrieval, and interpretation?
• What distinguishes a database from a
  knowledge base?
• How are data collection and hypothesis
  generation intimately linked in clinical
  diagnosis?
• What are the meanings of the terms
  prevalence, predictive value, sensitivity,
  and specificity?
• How are the terms related?
• What are the alternatives for entry of
  data into a clinical database?


2.1 What Are Clinical Data? 


From earliest times, the ideas of ill health and its 
treatment have been wedded to those of the 
observation and interpretation of data. Whether 
we consider the disease descriptions and guide- 
lines for management in early Greek literature 
or the modern physician’s use of complex labo- 
ratory and X-ray studies, it is clear that gather- 
ing data and interpreting their meaning are 
central to the health care process. With the move 
toward the use of clinical and genomic informa- 
tion in assessing individual patients (their risks, 
prognosis, and likely responses to therapy), the 
sheer amounts of data that may be used in 
patient care have become huge. A textbook on 
biomedical informatics will accordingly refer 
time and again to issues in data collection, stor- 
age, and use. This chapter lays the foundation 
for this recurring set of issues that is pertinent to 
all aspects of the use of information, knowl- 
edge, and computers in biomedicine, both in the 
clinical world and in applications related to pub- 
lic health, biology and human genetics. 

If data are central to all health care, it is 
because they are crucial to the process of deci-
sion making (as described in detail in Chaps.
3 and 4 and again in Chap. 26). In fact, sim-
ple reflection will reveal that all health care
activities involve gathering, analyzing, or using 
data. Data provide the basis for categorizing 
the problems a patient may be having or for 
identifying subgroups within a population of 
patients. They also help a physician to decide 
what additional information is needed and 
what actions should be taken to gain a greater 
understanding of a patient’s problem or most 
effectively to treat the problem that has been 
diagnosed. 

It is overly simplistic to view data as the 
columns of numbers or the monitored wave- 
forms that are a product of our technological 
health care environment. Although laboratory 
test results and other numeric data are often 
invaluable, a variety of more subtle types of 
data may be just as important to the delivery 
of optimal care: the awkward glance by a 
patient who seems to be avoiding a question 
during the medical interview, information 
about the details of a patient’s symptoms or 
about his family or economic setting, or the 
subjective sense of disease severity that an 
experienced clinician will often have within a 
few moments of entering a patient’s room. No 
clinician disputes the importance of such 
observations in decision making during patient 
assessment and management, yet the precise 
role of these data and the corresponding deci- 
sion criteria are so poorly understood that it is 
difficult to record them in ways that convey 
their full meaning, even from one clinician to 
another. Despite these limitations, clinicians 
need to share descriptive information with 
others. When they cannot interact directly 
with one another, they often turn to the chart 
or electronic health record for communication 
purposes. 

We consider a clinical datum to be any sin- 
gle observation of a patient—e.g., a tempera- 
ture reading, a red blood cell count, a past 
history of rubella, or a blood pressure reading. 
As the blood pressure example shows, it is a 
matter of perspective whether a single observa- 
tion is in fact more than one datum. A blood 
pressure of 120/80 might well be recorded as a 
single element in a setting where knowledge 
that a patient’s blood pressure is normal is all 
that matters. If the difference between diastolic 
(while the heart cavities are beginning to fill)
and systolic (while they are contracting) blood
pressures is important for decision making or 
for analysis, however, the blood pressure read- 
ing is best viewed as two pieces of information 
(systolic pressure = 120 mmHg, diastolic pres- 
sure = 80 mmHg). Human beings can glance at 
a written blood pressure value and easily make 
the transition between its unitary view as a sin- 
gle data point and the decomposed informa- 
tion about systolic and diastolic pressures. 
Such dual views can be much more difficult for 
computers, however, unless they are specifically 
allowed for in the design of the method for 
data storage and analysis. The idea of a data 
model for computer-stored medical data 
accordingly becomes an important issue in the 
design of medical data systems. 
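
The dual view of a blood pressure reading can be sketched in a few lines of Python. This is only an illustration of the data-model idea, not a description of any particular record system; the class name `BloodPressure` and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BloodPressure:
    """One blood pressure reading stored as two components (mmHg)."""
    systolic: int
    diastolic: int

    @classmethod
    def parse(cls, text: str) -> "BloodPressure":
        # Accept the unitary written form, e.g. "120/80".
        systolic, diastolic = text.strip().split("/")
        return cls(int(systolic), int(diastolic))

    def unitary(self) -> str:
        # Recover the single-element view a clinician would write.
        return f"{self.systolic}/{self.diastolic}"

    def pulse_pressure(self) -> int:
        # The systolic-diastolic difference needs the decomposed view.
        return self.systolic - self.diastolic


bp = BloodPressure.parse("120/80")
print(bp.unitary(), bp.systolic, bp.diastolic, bp.pulse_pressure())
```

Storing the two components and deriving the written form on demand is one possible design; a system that stored only the string "120/80" would have to re-parse it for every analysis.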

Clinical data may involve several different 
observations made concurrently, the observa- 
tion of the same patient parameter made at 
several points in time, or both. Thus, a single 
datum generally can be viewed as defined by 
five elements: 

1. The patient in question 

2. The parameter being observed (e.g., liver 
size, urine sugar value, history of rheumat- 
ic fever, heart size on chest X-ray film) 

3. The value of the parameter in question (e.g., 
weight is 70 kg, temperature is 98.6 °F, pro- 
fession is steel worker) 

4. The time of the observation (e.g., 2:30 
A.M. on 14FEB2019¹)

5. The method by which the observation was 
made (e.g., patient report, thermometer, 
urine dipstick, laboratory instrument). 
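
The five elements listed above map naturally onto a small record type. The following is a minimal sketch under stated assumptions: the class name `ClinicalDatum`, the field names, and the identifier "patient-001" are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Union


@dataclass(frozen=True)
class ClinicalDatum:
    patient_id: str            # 1. the patient in question (hypothetical identifier)
    parameter: str             # 2. the parameter being observed
    value: Union[float, str]   # 3. the value of the parameter
    observed_at: datetime      # 4. the time of the observation
    method: str                # 5. the method by which the observation was made


temperature = ClinicalDatum(
    patient_id="patient-001",
    parameter="temperature_F",
    value=98.6,
    observed_at=datetime(2019, 2, 14, 2, 30),  # 2:30 A.M. on 14FEB2019
    method="thermometer",
)
print(temperature)
```

A value such as "steel worker" fits the same structure because the value field accepts text as well as numbers.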


Time can particularly complicate the assess- 
ment and computer-based management of 
data. In some settings, the date of the obser- 
vation is adequate—e.g., in outpatient clinics 
or private offices where a patient generally is 
seen infrequently and the data collected need 
to be identified in time with no greater accuracy
than a calendar date. In others, minute-to-minute
variations may be important, e.g., the frequent
blood sugar readings obtained for a patient in
diabetic ketoacidosis (acid production due to
poorly controlled blood sugar levels) or the
continuous measurements of mean arterial blood
pressure for a patient in cardiogenic shock
(dangerously low blood pressure due to failure
of the heart muscle).

1 Note that it was the tendency to record such dates
in computers as “14FEB12” that led to the
end-of-century complexities that were called the
Year 2K problem. It was shortsighted to think that
it was adequate to encode the year of an event with
only two digits.
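
The two-digit-year problem is easy to reproduce. The sketch below uses Python's standard `datetime.strptime`; it assumes English month abbreviations (the default C locale). With a two-digit year the parser has to guess the century, and Python follows the POSIX convention for that guess.

```python
from datetime import datetime

# Four-digit year: unambiguous.
full = datetime.strptime("14FEB2019", "%d%b%Y")

# Two-digit year: the century is guessed. Python maps 69-99 to 1969-1999
# and 00-68 to 2000-2068, so "12" becomes 2012 even if the event recorded
# (for example, a birth) actually took place in 1912.
short = datetime.strptime("14FEB12", "%d%b%y")
old = datetime.strptime("14FEB85", "%d%b%y")

print(full.date(), short.date(), old.date())
```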

It may also be important to keep a record 
of the circumstances under which a data point 
was obtained. For example, was the blood 
pressure taken in the arm or leg? Was the 
patient lying or standing? Was the pressure 
obtained just after exercise? During sleep? 
What kind of recording device was used? Was 
the observer reliable? Such additional infor- 
mation, sometimes called contexts, methods, 
or modifiers, can be of crucial importance in 
the proper interpretation of data. Two patients 
with the same basic problem or symptom 
often have markedly different explanations for 
their problem, revealed by careful assessment 
of the modifiers of that problem. 

A related issue is the uncertainty in the val- 
ues of data. It is rare that an observation— 
even one by a skilled clinician—can be 
accepted with absolute certainty. Consider the 
following examples: 
• An adult patient reports a childhood illness
with fevers and a red rash in addition to joint
swelling. Could he or she have had scarlet
fever? The patient does not know what his or
her pediatrician called the disease nor whether
anyone thought that he or she had scarlet fever.

• A physician listens to the heart of an asthmatic
child and thinks that she hears a heart murmur,
but is not certain because of the patient’s loud
wheezing.

• A radiologist looking at a shadow on a
chest X-ray film is not sure whether it represents
overlapping blood vessels or a lung tumor.

• A confused patient is able to respond to
simple questions about his or her illness,
but under the circumstances the physician
is uncertain how much of the history being
reported is reliable.


As described in Chaps. 3 and 4, there are a
variety of possible responses to incomplete data,
to the uncertainty in them, and to the uncertainty
in their interpretation. One technique is to collect
additional data that will either confirm or
eliminate the concern raised by the initial 
observation. This solution is not always appro- 
priate, however, because the costs of data col- 
lection must be considered. The additional 
observation might be expensive, risky for the 
patient, or wasteful of time during which treat- 
ment could have been instituted. The idea of 
trade-offs in data collection thus becomes 
extremely important in guiding health care 
decision making. 


2.1.1 What Are the Types of Clinical Data?


The examples in the previous section suggest 
that there is a broad range of data types in the 
practice of medicine and the allied health sci- 
ences. They range from narrative, textual data 
to numerical measurements, genetic informa- 
tion, recorded signals, drawings, and photo- 
graphs or other images. 

Narrative data account for a large compo- 
nent of the information that is gathered in the 
care of patients. For example, the patient’s 
description of his or her present illness, includ- 
ing responses to focused questions from the 
physician, generally is gathered verbally and is 
recorded as text in the medical record. The 
same is true of the patient’s social and family 
history, the general review of systems that is 
part of most evaluations of new patients, and 
the clinician’s report of physical examination 
findings. Such narrative data were traditionally 
handwritten by clinicians and then placed in 
the patient’s medical record (Fig. 2.1a).
Increasingly, however, the narrative summaries 
were dictated and then transcribed by typists 
who produced printed summaries or electronic 
copies for inclusion in paper or electronic med- 
ical records. Now, physicians and staff largely 
enter narrative text directly into electronic 
health records (EHRs), usually through key- 
board, mouse-driven, or voice-driven interfaces 
(Fig. 2.1b). Electronic narrative data often
include not only patient histories and physical 
examinations, but also other narrative descriptions
such as reports of specialty consultations,
surgical procedures, pathologic examinations 
of tissues, and hospitalization summaries when 
a patient is discharged. 

Some narrative data are loosely coded with 
shorthand conventions known to health per- 
sonnel, particularly data collected during the 
physical examination, in which recorded obser- 
vations reflect the stereotypic examination pro- 
cess taught to all practitioners. It is common, 
for example, to find the notation “PERRLA” 
under the eye examination in a patient’s medi- 
cal record. This encoded form indicates that 
the patient’s “Pupils are Equal (in size), Round, 
and Reactive to Light and Accommodation 
(the process of focusing on near objects).” 

Note that there are significant problems 
associated with the use of such abbreviations. 
Many are not standard and can have different 
meanings depending on the context in which 
they are used. For example, “MI” can mean 
“mitral insufficiency” (leakage in one of the 
heart’s valves) or “myocardial infarction” (the 
medical term for what is commonly called a 
heart attack). Many hospitals try to establish a 
set of “acceptable” abbreviations with mean- 
ings, but the enforcement of such standardiza- 
tion is often unsuccessful. Other hospitals 
approach this challenge by not permitting use of 
abbreviations in the medical record, and instead 
require use of full-length narrative descriptions. 
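
The ambiguity problem can be made concrete with a small lookup table. The sketch below is hypothetical: the table `ABBREVIATIONS` and the function `expand` are invented for illustration, and the entries are taken only from the examples in the text. It shows one way a system might refuse to expand an abbreviation that has more than one meaning.

```python
# Hypothetical table of abbreviations; entries come from the examples above.
ABBREVIATIONS = {
    "PERRLA": ["pupils equal, round, and reactive to light and accommodation"],
    "MI": ["mitral insufficiency", "myocardial infarction"],
}


def expand(abbreviation: str) -> str:
    """Return the single meaning of an abbreviation, or raise if unsafe."""
    meanings = ABBREVIATIONS.get(abbreviation.upper())
    if meanings is None:
        raise KeyError(f"{abbreviation!r} is not an approved abbreviation")
    if len(meanings) > 1:
        # Context would be needed to choose; refuse rather than guess.
        raise ValueError(f"{abbreviation!r} is ambiguous: {meanings}")
    return meanings[0]


print(expand("PERRLA"))
```

Calling `expand("MI")` raises a `ValueError`, which mirrors the choice some hospitals make to require the full-length description instead.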

Standard narrative expressions have often 
become loose standards of communication 
among medical personnel. Examples include 
“mild dyspnea (shortness of breath) on exer- 
tion,” “pain relieved by antacids or milk,” and 
“failure to thrive.” Such standardized expres- 
sions are attempts to use conventional text 
notation as a form of summarization for oth- 
erwise heterogeneous conditions that together 
characterize a simple concept about a patient. 

Many data used in medicine take on discrete
numeric values. These include such parameters
as laboratory tests, vital signs (such as temperature
and pulse rate), and certain measurements
taken during the physical examination. When
such numerical data are interpreted, however,
the issue of precision becomes important. Can
physicians distinguish reliably between a 9-cm
and a 10-cm liver span when they examine a
patient’s abdomen? Does it make sense to report
a serum sodium level to two-decimal-place accuracy?
Is a 1-kg fluctuation in weight from 1 week
to the next significant? Was the patient weighed
on the same scale both times (i.e., could the
different values reflect variation between measurement
instruments rather than changes in the
patient)?

Fig. 2.1 Much of the information gathered during a
physician-patient encounter is written in the medical
record. This was traditionally done using (a) paper notes,
and now increasingly using (b) electronic health records

In some fields of medicine, analog data in
the form of continuous signals are particularly
important (see Chap. 23). Perhaps the best-known
example is an electrocardiogram (ECG), a tracing
of the electrical activity from a patient’s heart.
When such data are stored in medical records, a
graphical tracing frequently is included, with a
written interpretation of its meaning. There are
clear challenges in determining how such data are
best managed in computer-based storage systems.

Patient presents for followup following recent pneumonia. He is feeling much improved although still complains of mild fatigue and occasional cough productive of clear to cloudy white sputum. He lost about five pounds, but has gained about half of that back again

Patient admits to being poorly compliant with his hypertension regimen over the past several months. He was concerned that the medication was making him tired and cut back his dosage to every other day. He has not been monitoring his blood pressure at home. He denies palpitations, anginal symptoms, or peripheral edema

Patient indicates that he has been taking his thyroid medication regularly. Review of his prescriptions suggested he should have run out but he indicates that he had some extra tablets on hand. He is clinically euthyroid. Denies problems with constipation, dry skin, or unusual cold intolerance.

Current prescriptions
CARDIZEM CD CPCR 240 MG/24HR OR, 1 CAPSULE DAILY, D: 60, R: 5
SYNTHROID TABS 0.1 MG OR, 1 TABLET DAILY, D: 60, R: 5
PEPCID TABS 20 MG OR, 1 TABLET AT BEDTIME PRN, D: 30, R: 3

Review of patient's allergies indicates
Penicillins Hives

OBJECTIVE
BP 150/98 | Pulse 88 | Temp 99.1 | Resp 16 | Wt 182 lbs (82.55 kg)

Alert, cooperative, and well hydrated. TMs clear. Nose clear rhinorrhea. Oropharynx displays moderate erythema. Neck supple without adenopathy. Once a few scattered expiratory rhonchi which partially clear with coughing. Heart regular rate and rhythm without extra signs or murmurs. Abdomen soft, nontender, no organomegaly

Fig. 2.1 (continued)

Visual images—acquired from machines 
or sketched by the physician—are another 
important category of data. Radiologic imag- 
es or photographs of skin lesions are obvious 
examples. It has traditionally been common 
for physicians to draw simple pictures to rep- 
resent abnormalities that they have observed; 
such drawings may serve as a basis for com- 
parison when they or another physician next 
see the patient. For example, a sketch is a con- 
cise way of conveying the location and size of 
a nodule in the prostate gland (Fig. 2.2). In
electronic health record systems, these hand 
drawings are increasingly being replaced in 
the medical record by text-based descriptions 
or photographs (Sanders et al. 2013). 

As should be clear from these examples, the 
idea of data is inextricably bound to the idea of 
data recording. Physicians and other health care 
personnel are taught from the outset that it is 
crucial that they do not trust their memory 
when caring for patients. They must record their 
observations, as well as the actions they have
taken and the rationales for those actions, for
later communication to themselves and other 
people. A glance at a medical record will quick- 
ly reveal the wide variety of data-recording 
techniques that have evolved. The range goes 
from narrative text to commonly understood 
shorthand notation to cryptic symbols that only 
specialists can understand; for example, few 
physicians without specialized training know 
how to interpret the data-recording conventions 
of an ophthalmologist (Fig. 2.3). The notations
may be highly structured records with
brief text or numerical information, machine- 
generated tracings of analog signals, photo- 
graphic images (of the patient or of radiologic 
or other studies), or drawings. This range of 
data-recording conventions presents significant 
challenges to the person implementing electron- 
ic health record systems. 


2.1.2 Who Collects the Data? 


Health data on patients and populations are 
gathered by a variety of health professionals. 
Although conventional ideas of the healthcare 
team evoke images of coworkers treating ill 
patients, the team has much broader responsi- 
bilities than treatment per se; data collection 
and recording are a central part of its task.

Fig. 2.2 A physician’s hand-drawn sketch of a prostate
nodule. Drawings may convey precise information
more easily and compactly than a textual description,
but are less common in electronic health records
compared to paper charts

Fig. 2.3 An ophthalmologist’s report of an eye
examination. Most physicians trained in other specialties
would have difficulty deciphering the symbols that
the ophthalmologist has used. (Image courtesy of Nita
Valikodath, MD, with permission)

Physicians are key players in the process of
data collection and interpretation. They converse
with a patient to gather narrative descriptive
data on the chief complaint, past illnesses,
family and social information, and the system
review. They examine the patient, collecting
pertinent data and recording them during or at
the end of the visit. In addition, they generally
decide what additional data to collect by ordering
laboratory or radiologic studies and by
observing the patient’s response to therapeutic
interventions (yet another form of data that
contributes to patient assessment).

In both outpatient and hospital settings,
nurses play a central role in making observations
and recording them for future reference.
The data that they gather contribute to nursing 
care plans as well as to the assessment of 
patients by physicians and by other health care 
staff. Thus, nurses’ training includes instruc- 
tion in careful and accurate observation, his- 
tory taking, and examination of the patient. 
Because nurses typically spend more time with 
patients than physicians do, especially in the 
hospital setting, nurses often build relation- 
ships with patients that uncover information 
and insights that contribute to proper diagno- 
sis, to understanding of pertinent psychosocial 
issues, or to proper planning of therapy or
discharge management (Fig. 2.4). The role of
information systems in contributing to patient
care tasks such as care planning by nurses is the
subject of Chap. 19.

Fig. 2.4 Nurses often develop close relationships
with patients. These relationships may allow the nurse to
make observations that are missed by other staff. This
ability is just one of the ways in which nurses play a key
role in data collection and recording. (Photograph
courtesy of Susan Ostmo, with permission)

Various other health care workers contrib- 
ute to the data-collection process. Office staff 
and admissions personnel gather demographic 
and financial information. Physical or respira- 
tory therapists record the results of their treat- 
ments and often make suggestions for further 
management. Laboratory personnel perform 
tests on biological samples, such as blood or 
urine, and record the results for later use by 
physicians and nurses. Radiology technicians 
perform X-ray examinations; radiologists inter- 
pret the resulting data and report their findings 
to the patients’ physicians. Pharmacists may 
interview patients about their medications or 
about drug allergies and then monitor the 
patients’ use of prescription drugs. Increasingly, 
health professionals such as physician assis- 
tants, nurse practitioners, nurse anesthetists, 
nurse midwives, psychologists, chiropractors, 
and optometrists are assuming patient care
responsibilities. As these examples suggest,
many different individuals employed in health 
care settings gather, record, and make use of 
patient data in their work. 

Finally, there are the technological devices 
that generate data—laboratory instruments, 
imaging machines, monitoring equipment in 
intensive care units, and measurement devices 
that take a single reading (such as thermome- 
ters, ECG machines, sphygmomanometers for 
taking blood pressure, and spirometers for 
testing lung function). Sometimes such a device 
produces a paper report suitable for inclusion 
in a traditional medical record. Sometimes the 
device indicates a result on a gauge or traces a 
result that must be read by an operator and 
then recorded in the patient’s chart. Sometimes 
a trained specialist must interpret the output. 
Increasingly, however, the devices feed their 
results directly into computer equipment so 
that the data can be analyzed or formatted for 
electronic storage in the electronic health 
record (see Chap. 16), thereby allowing
access to information through computer
workstations, hand-held tablets, or even mobile
devices.


2.2 Uses of Health Data 


Health data are recorded for a variety of pur- 
poses. Clinical data may be needed to support 
the proper care of the patient from whom they 
were obtained, but they also may contribute to 
the good of society through the aggregation 
and analysis of data regarding populations of 
individuals (supporting clinical research or 
public health assessments; see Chaps. 20 and
28). Traditional data-recording techniques and 
a paper record may have worked reasonably 
well when care was given by a single physician 
over the life of a patient. However, given the 
increased complexity of modern health care, 
the broadly trained team of individuals who 
are involved in a patient’s care, the need for 
multiple providers to access a patient’s data and 
to communicate effectively with one another 
through the chart, and the need for aggregating 
clinical data from multiple individuals to sup- 
port population health, the electronic health 
record has created new possibilities for improving
the health care delivery process that were
not feasible a generation ago. We will discuss
these topics in more detail later in this chapter
and in Chaps. 16 and 20.


2.2.1 Create the Basis for the Historical Record


Any student of science learns the importance of
collecting and recording data meticulously when
carrying out an experiment. Just as a scientific
laboratory notebook provides a record of precisely
what an investigator has done, the experimental
data observed, and the rationale for
intermediate decision points, medical records
are intended to provide a detailed compilation
of information about individual patients:

• What is the patient’s history (development
of a current illness; other diseases that coexist
or have resolved; pertinent family, social,
and demographic information)?

• What symptoms has the patient reported?
When did they begin, what has seemed to
aggravate them, and what has provided
relief?

• What physical signs have been noted on
examination?

• How have signs and symptoms changed
over time?

• What laboratory results have been, or are
now, available?

• What radiologic and other special studies
have been performed?

• What medications are being taken and are
there any allergies?

• What other interventions have been undertaken?

• What is the reasoning behind the management
decisions?


Each new patient problem and its manage- 
ment can be viewed as a therapeutic experi- 
ment, inherently confounded by uncertainty, 
with the goal of answering three questions 
when the experiment is over: 

1. What was the nature of the disease or symptom?
2. What was the treatment decision?
3. What was the outcome of that treatment?


As is true for all experiments, one purpose is to 
learn from experience through careful observa- 
tion and recording of data. The lessons learned 
in a given encounter may be highly individual- 
ized (e.g., the physician may learn how a spe- 
cific patient tends to respond to pain or how 
family interactions tend to affect the patient’s 
response to disease). On the other hand, the 
value of some experiments may be derived only 
by pooling of data from many patients who 
have similar problems and through the analysis 
of the results of various treatment options to 
determine efficacy. 
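
Pooling can be illustrated with a toy calculation. In the sketch below every encounter is reduced to the three answers named above (disease, treatment, outcome). The records are made up purely for illustration, the labels such as "treatment_A" are placeholders, and a raw proportion like this says nothing about efficacy without proper study design and statistical analysis.

```python
from collections import Counter

# Made-up records for illustration only: (disease, treatment, outcome).
encounters = [
    ("condition_X", "treatment_A", "improved"),
    ("condition_X", "treatment_A", "improved"),
    ("condition_X", "treatment_A", "unchanged"),
    ("condition_X", "treatment_B", "improved"),
    ("condition_X", "treatment_B", "unchanged"),
    ("condition_X", "treatment_B", "unchanged"),
]


def improvement_rates(records):
    """Fraction of pooled encounters per treatment whose outcome was 'improved'."""
    totals, improved = Counter(), Counter()
    for _disease, treatment, outcome in records:
        totals[treatment] += 1
        if outcome == "improved":
            improved[treatment] += 1
    return {t: improved[t] / totals[t] for t in totals}


rates = improvement_rates(encounters)
print(rates)
```

No single record in the list supports any comparison; the rates exist only once the records are aggregated.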

Although laboratory research has contrib- 
uted dramatically to our knowledge of human 
disease and treatment, it is careful observation 
and recording by skilled health care personnel 
that has always been fundamental to the effec- 
tive generation of new knowledge about patient 
care. We learn from the aggregation of infor- 
mation from large numbers of patients; thus, 
the historical record for individual patients is 
of inestimable importance to clinical research. 


2.2.2 Support Communication Among Providers


A central function of structured data collec- 
tion and recording in health care settings is to 
assist personnel in providing coordinated care 
to a patient over time. Most patients who have 
significant medical conditions are seen over 
months or years on several occasions for one or 
more problems that require ongoing evaluation 
and treatment. Given the increasing numbers 
of elderly patients in many cultures and health 
care settings, the care given to a patient is less 
oriented to diagnosis and treatment of a single 
disease episode and increasingly focused on 
management of one or more chronic disor- 
ders—possibly over many years. 

It was once common for patients to receive 
essentially all their care from a single provider: 
the family doctor who tended both children and 
adults, often seeing the patient over many or all 
the years of that person’s life. We tend to picture
such physicians as having especially close rela- 
tionships with their patients—knowing the fam- 
ily and sharing in many of the patient’s life 
events, especially in smaller communities. Such
doctors nonetheless kept records of all encounters
so that they could refer to data about past
illnesses and treatments as a guide to evaluating 
future care issues. 

In the world of modern medicine, the 
emergence of subspecialization and the 
increasing provision of care by teams of health 
professionals have placed new emphasis on the 
central role of the medical record. Over the 
past several decades, shared access to a paper 
chart (Fig. 2.5) has largely been replaced by
clinicians accessing electronic records, sometimes
conferring as they look at the same computer
screen (Fig. 2.6). Now the record not
only contains observations by a physician for 
reference on the next visit but also serves as a 
communication mechanism among physicians 
and other medical personnel, such as physical 
or respiratory therapists, nursing staff, radiol- 
ogy technicians, social workers, or discharge 
planners. In many outpatient settings, patients 
receive care over time from a variety of physi- 
cians—colleagues covering for the primary 
physician, or specialists to whom the patient 
has been referred, or a managed care organiza- 
tion’s case manager. It is not uncommon to 
hear complaints from patients who remember 
the days when it was possible to receive essen- 
tially all their care from a single physician 
whom they had come to trust and who knew 
them well. Physicians are sensitive to this issue
and therefore recognize the importance of the
medical record in ensuring quality and continuity
of care through adequate recording of
the details and logic of past interventions and
ongoing treatment plans. This idea is of particular
importance in a health care system in
which chronic diseases rather than care for
trauma or acute infections increasingly dominate
the basis for interactions between patients
and their doctors.

Fig. 2.5 One role of the medical record: a communication
mechanism among health professionals who
work together to plan patient care. (Photograph
courtesy of Janice Anne Rohn)

Fig. 2.6 Today similar communication sessions occur
around a computer screen rather than a paper chart (see
Fig. 2.5). (Photograph courtesy of Susan Ostmo, with
permission)


2.2.3 Anticipate Future Health Problems


Providing high-quality health care involves 
more than responding to patients’ acute or 
chronic health problems. It also requires edu- 
cating patients about the ways in which their 
environment and lifestyles can contribute to, or 
reduce the risk of, future development of dis- 
ease. Similarly, data gathered routinely in the 
ongoing care of a patient may suggest that he 
or she is at high risk of developing a specific 
problem even though he or she may feel well 
and be without symptoms at present. Clinical
data therefore are important in screening for
risk factors, following patients’ risk profiles 
over time, and providing a basis for specific 
patient education or preventive interventions, 
such as diet, medication, or exercise. Perhaps 
the most common examples of such ongoing 
risk assessment in our society are routine mon- 
itoring for excess weight, high blood pressure, 
and elevated serum cholesterol levels. In these 
cases, abnormal data may be predictive of later 
symptomatic disease; optimal care requires 
early intervention before the complications 
have an opportunity to develop fully. 


2.2.4 Record Standard Preventive Measures


The medical record also serves as a source of 
data on interventions that have been performed 
to prevent common or serious disorders. Some- 
times the interventions involve counseling or 
educational programs (for example, regarding 
smoking cessation, measures for stopping drug 
abuse, safe sex practices, or dietary changes). 
Other important preventive interventions 
include immunizations: the vaccinations that 
begin in early childhood and continue through- 
out life, including special treatments adminis- 
tered when a person will be at particularly high 
risk (e.g., injections to protect people from cer- 
tain highly communicable diseases, administered 
before travel to areas where such diseases are 
endemic). When a patient comes to his local hos- 
pital emergency room with a laceration, the phy- 
sicians routinely check for an indication of when 
he most recently had a tetanus immunization. 
When easily accessible in the record (or from the 
patient), such data can prevent unnecessary 
treatments (in this case, a repeat injection) that 
may be associated with risk or significant cost. 


2.2.5 Identify Deviations from Expected Trends


Data often are useful in medical care only 
when viewed as part of a continuum over 
time. An example is the routine monitoring of
children for normal growth and development
by pediatricians (Fig. 2.7). Single data
points regarding height and weight may have 
limited use by themselves; it is the trend in 
such data points observed over months or 
years that may provide the first clue to a med- 
ical problem. It is accordingly common for 
such parameters to be recorded on special 
charts or forms that make the trends easy to 
discern at a glance. Women who want to have 
a child often keep similar records of body 
temperature. By measuring temperature daily 
and recording the values on special charts, 
women can identify the slight increase in tem- 
perature that accompanies ovulation and thus 
may discern the days of maximum fertility. 
Many physicians will ask a patient to keep 
such graphical records so that they can later 
discuss the data with the patient and include 
the scanned or photographed graph in the 
electronic record for ongoing reference. Such 
graphs are increasingly captured and dis- 
played for viewing by clinicians as a feature of 
a patient’s medical record. 
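
The point that a trend carries information no single reading does can be sketched with the daily temperature example. The function name `first_sustained_rise`, the parameters, and the thresholds below are illustrative assumptions, not clinical guidance, and the readings are made up.

```python
from statistics import mean
from typing import Optional, Sequence


def first_sustained_rise(readings: Sequence[float], baseline_days: int = 6,
                         rise: float = 0.2, sustain_days: int = 3) -> Optional[int]:
    """Index of the first reading that starts a run of `sustain_days` values,
    each at least `rise` above the mean of the preceding `baseline_days`."""
    for i in range(baseline_days, len(readings) - sustain_days + 1):
        baseline = mean(readings[i - baseline_days:i])
        if all(r >= baseline + rise for r in readings[i:i + sustain_days]):
            return i
    return None  # no sustained deviation from the earlier trend


# Made-up daily temperatures in degrees Celsius, for illustration only.
daily = [36.4, 36.5, 36.4, 36.3, 36.5, 36.4, 36.4, 36.8, 36.9, 36.8]
print(first_sustained_rise(daily))
```

Each value is compared with the patient's own recent baseline rather than with a fixed cutoff, which is the sense in which the trend, not the data point, is the finding.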


2.2.6 Provide a Legal Record 


Another use of health data, once they are 
charted and analyzed, is as the foundation for 
a legal record to which the courts can refer if 
necessary. The medical record is a legal docu- 
ment; the responsible individual must certify 
or sign most of the clinical information that is 
recorded. In addition, the chart generally 
should describe and justify both the presumed 
diagnosis for a patient and the choice of man- 
agement. 

We emphasized earlier the importance of 
recording data; in fact, data do not exist in a 
generally useful form unless they are record- 
ed. The legal system stresses this point as well. 
Providers’ unsubstantiated memories of what 
they observed or why they took some action 
are of little value in the courtroom. The medi- 
cal record is the foundation for determining 
whether proper care was delivered. Thus, a 
well-maintained record is a source of protection
for both patients and their physicians.

Fig. 2.7 A pediatric growth chart. Single data points
would not be useful; it is the changes in values over time
that indicate whether development is progressing normally.
(Source: Developed by the National Center for Health
Statistics in collaboration with the National Center for
Chronic Disease Prevention and Health Promotion (2000).
http://www.cdc.gov/growthcharts)


2.2.7 Support Clinical Research


Although experience caring for individual 
patients provides physicians with special skills 
and enhanced judgment over time, it is only 
by formally analyzing data collected from 
large numbers of patients that researchers can 
develop and validate new clinical knowledge 
of general applicability. Thus, another use of 
clinical data is to support research through 
the aggregation and statistical or other analy- 
sis of observations gathered from populations 
of patients (see Chap. 1). 

A randomized clinical trial (RCT) (see also 
Chaps. 15 and 29) is a common method by 
which specific clinical questions are addressed 
experimentally. RCTs typically involve the ran- 
dom assignment of matched groups of patients 
to alternate treatments when there is uncertain- 
ty about how best to manage the patients’ prob- 
lem. The variables that might affect a patient’s 
course (e.g., age, gender, weight, coexisting 
medical problems) are measured and recorded. 
As the study progresses, data are collected 
meticulously to provide a record of how each 
patient fared under treatment and precisely 
how the treatment was administered. By pool- 
ing such data, sometimes after years of experi- 
mentation (depending on the time course of the 
disease under consideration), researchers may 
be able to demonstrate a statistical difference 
among the study groups depending on precise 
characteristics present when patients entered 
the study or on the details of how patients were 
managed. Such results then help investigators 
to define the standard of care for future patients 
with the same or similar problems. 
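
The random-assignment step described above can be illustrated with a minimal sketch. The patient identifiers, arm names, and the `assign_arms` helper are hypothetical, and the simple shuffle shown here is only one possible approach; actual trials commonly rely on stratified or block randomization managed by dedicated trial software.

```python
import random

def assign_arms(patient_ids, arms=("treatment_A", "treatment_B"), seed=None):
    """Randomly assign each patient to one arm, keeping arm sizes balanced."""
    rng = random.Random(seed)      # a seeded generator makes the assignment reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled patients round-robin so arm sizes differ by at most one.
    return {pid: arms[i % len(arms)] for i, pid in enumerate(shuffled)}

# Hypothetical patient identifiers, for illustration only
patients = ["P001", "P002", "P003", "P004", "P005", "P006"]
assignment = assign_arms(patients, seed=42)
```

Because the order is shuffled before patients are dealt into arms, the investigators cannot choose which class of patients ends up in which group.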

Medical knowledge also can be derived 
from the analysis of large patient data sets or 
registries, even when the patients were not specifically 
enrolled in an RCT; such analyses are often referred to as 
retrospective studies. Much of the research in 
the field of epidemiology involves analysis of 
population-based data of this type. Our knowl- 
edge of the risks associated with cigarette smok- 
ing, for example, is based on irrefutable statistics 
derived from large populations of individuals 
with and without lung cancer, other pulmonary 
problems, and heart disease. 


2.3 Rationale for the Transition 
from Paper to Electronic 
Documentation 


The preceding description of medical data 
and their uses emphasizes the positive aspects 
of information storage and retrieval in the 
record. During the past several decades, the 
United States and many other countries have 
gradually transitioned from traditional paper 
records to electronic health records. The ratio- 
nale for this transition has largely been to cre- 
ate the potential for enhancing the record’s 
effectiveness for its intended uses, as summa- 
rized in the previous section. 


2.3.1 Pragmatic and Logistical 
Issues 


Recall, first, that data cannot effectively serve 
the delivery of health care unless they are 
recorded. Their optimal use depends on positive 
responses to the following questions: 

- Can I find the data I need when I need them? 
  - Can I find the medical record in which 
    they are recorded? 
  - Can I find the data within the record? 
  - Can I find what I need quickly? 
- Can I read and interpret the data once I 
find them? 
- Can I update the data reliably with new 
observations in a form consistent with the 
requirements for future access by me or 
other people? 


The traditional paper record created situations 
in which people too often answered such 
questions in the negative. For example: 

- The patient’s paper chart was too often 
unavailable when the health care professional 
needed it. It could be in use by 
someone else at another location; it might 
have been misplaced despite the record-tracking 
system of the hospital, clinic, or 
office (Fig. 2.8); or it might have been 
taken by someone unintentionally and be 
buried on a desk. 

Fig. 2.8 Storage room for paper-based medical 
records. These paper repositories have largely been 
replaced as EHRs have become more standard. (Photograph 
courtesy of Janice Anne Rohn) 


- It could be difficult to find the information 
required in either the paper or electronic 
record. The data might have been known 
previously but never recorded due to an 
oversight by a physician or other health pro- 
fessional. Poor organization or sheer size of 
either the paper or electronic record may 
lead the user to spend an inordinate time 
searching for the data, especially for patients 
who have long and complicated histories. 

- Paper records were notoriously difficult to 
read. It was not uncommon to hear one physician 
asking another as they peered together 
into a chart: “What is that word?” “Is that a 
two or a five?” “Whose signature is that?” 
Illegible and sloppy entries were too often a 
major obstruction to effective use of the 
paper chart (Fig. 2.9). 

- When a paper chart was unavailable, the 
health care professional still had to provide 
patient care. Thus, providers would often 
make do without past data, basing their 
decisions instead on what the patient could 
tell them and on what their examination 
revealed. They then wrote a note for inclusion 
in the chart, when the chart was located! 
In a large institution with thousands of 
medical records, it is not surprising that 
such loose notes often failed to make it to 
the patient’s chart or were filed out of 
sequence so that the actual chronology of 
management was disrupted in the record. 

Fig. 2.9 Written entries were standard in paper records, 
yet handwritten notes could be illegible. Notes that cannot 
be interpreted by other people due to illegibility may cause 
delays in treatment or inappropriate care, an issue that is 
largely eliminated when EHRs are used. (Image courtesy of 
Emily Cole, MD, with permission) 


- When patients who had chronic or frequent 
diseases were seen over months or 
years, their paper records grew so large 
that the charts had to be broken up into 
multiple volumes. When a hospital clinic 
or emergency room ordered the patient’s 
chart, only the most recent volume typically 
was provided. Old but pertinent data 
might have been in early volumes that 
were stored offsite or were otherwise 
unavailable. Alternatively, an early volume 
could be mistaken for the most recent 
volume, misleading its users and resulting 
in documents being inserted out of 
sequence. 


Chapter 16 describes approaches that electronic 
health record systems have taken 
toward addressing these practical problems in 
the use of the paper record. It is for this reason 
that almost all hospitals, health systems, 
and individual practitioners have implemented 
EHRs, further encouraged in the US by 
Federal incentive programs that helped to 
cover the costs of EHR acquisition and maintenance 
(see Chaps. 1 and 31). That said, 
one challenge is that electronic health records 
in the US have been criticized for being composed 
of bloated, lengthy documentation that 
is often focused on billing and compliance 
over clinical care (Chaps. 16 and 31). 


2.3.2 Redundancy and Inefficiency 


To be able to find data quickly in the medical 
record, health professionals developed a vari- 
ety of techniques in paper documentation that 
provided redundant recording to match alter- 
nate modes of access. For example, the result 
of a radiologic study typically was entered on a 
standard radiology reporting form, which was 
filed in the portion of the chart labeled “X-ray.” 
For complicated procedures, the same data 
often were summarized in brief notes by radi- 
ologists in the narrative part of the chart, 
which they entered at the time of studies 
because they knew that the formal report 
would not make it back to the chart for 1 or 
2 days. In addition, the study results often were 
mentioned in notes written by the patient’s 
admitting and consulting physicians and by 
the nursing staff. Although there may have 
been good reasons for recording such informa- 
tion multiple times in different ways and in dif- 
ferent locations within the paper chart, the 
combined bulk of these notes accelerated the 
physical growth of the document and, accordingly, 
complicated the chart’s logistical management. 
Furthermore, it became increasingly 
difficult to locate specific patient data as the 
chart succumbed to “obesity”. The predictable 
result was that someone would write yet anoth- 
er redundant entry, summarizing information 
that it took hours to track down — and creating 
potential sources for transcription error. 

A similar inefficiency occurred because of a 
tension between opposing goals in the design 
of reporting forms used by many laboratories. 
Most health personnel preferred a consistent, 
familiar form, often with color-coding, because 
it helped them to find information more quickly 
(Fig. 2.10). For example, a physician might 
know that a urinalysis report form is printed on 
yellow paper and records the bacteria count 
halfway down the middle column of the form. 
This knowledge allowed the physician to work 
backward quickly in the laboratory section of 
the chart to find the most recent urinalysis sheet 
and to check at a glance the bacterial count. 
The problem was that such forms typically stored 
only sparse information. It was clearly subopti- 
mal if a rapidly growing physical chart was 
filled with sheets of paper that reported only a 
single data element. 


[Figure: photograph of a laboratory report listing complete blood count results 
(white cell count, red cell count, hemoglobin, hematocrit, MCV, MCHC, RDW, 
platelet count, MPV, and NRBC) against reference ranges, in columns by collection date.] 

Fig. 2.10 Laboratory reporting forms present medical data in a consistent, familiar format (in this case a 
complete blood count (CBC)). (Photograph courtesy of Jimy Chen, with permission) 


2.3.3 Influence on Clinical Research 


Anyone involved in a clinical research project 
based on retrospective data review from paper 
records can attest to the tediousness of flipping 
through myriad medical records. For all the reasons 
described in Chap. 1, it is arduous to sit 
with stacks of patient records, extracting data 
and formatting them for structured statistical 
analysis, and the process is vulnerable to tran- 
scription errors. Observers often wonder how 
much medical knowledge is sitting untapped in 
old paper medical records because there is no 
easy way to analyze experience across large pop- 
ulations of patients from the past without first 
extracting pertinent data from those charts. 

Let’s contrast such retrospective review with 
paper and electronic medical records. Suppose, 
for example, that physicians on a medical con- 
sultation service notice that patients receiving a 
certain common oral medication for diabetes 
(call it drug X) seem to be more likely to have 
significant postoperative hypotension (low 
blood pressure) than do surgical patients receiv- 
ing other medications for diabetes. The doctors 
have based this hypothesis—that drug X influ- 
ences postoperative blood pressure—on only a 
few recent observations, however, so they decide 
to look into existing hospital records to see 
whether this correlation has occurred with suf- 
ficient frequency to warrant a formal investiga- 
tion. One efficient way to follow up on their 
theory from existing medical data would be to 
examine the hospital records of all patients who 
have diabetes and also have been admitted for 
surgery. The task would then be to examine 
those records (difficult and arduous with paper 
charts as will be discussed shortly, but subject to 
automated analysis in the case of EHRs) and to 
note for all patients (1) whether they were taking 
drug X when admitted and (2) whether they 
had postoperative hypotension. If the statistics 
showed that patients receiving drug X were 
more likely to have low blood pressure after sur- 
gery than were similar diabetic patients receiving 
alternate treatments, a controlled trial (prospec- 
tive observation and data gathering) might well 
be appropriate. 
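
With an EHR, the comparison just described reduces to tallying two flags per patient. The following is a minimal sketch under stated assumptions: the record layout, the field names `on_drug_x` and `postop_hypotension`, the `hypotension_rates` helper, and the patient counts are all invented for illustration and do not come from any real data set.

```python
def hypotension_rates(records):
    """Return (rate on drug X, rate on other drugs, relative risk).

    Assumes both groups are non-empty and that at least one patient in the
    comparison group was hypotensive; otherwise a division by zero occurs.
    """
    counts = {True: [0, 0], False: [0, 0]}   # on_drug_x -> [hypotensive, total]
    for rec in records:
        group = counts[rec["on_drug_x"]]
        group[0] += rec["postop_hypotension"]  # True counts as 1, False as 0
        group[1] += 1
    rate_x = counts[True][0] / counts[True][1]
    rate_other = counts[False][0] / counts[False][1]
    return rate_x, rate_other, rate_x / rate_other

# Made-up surgical patients with diabetes, for illustration only
records = (
    [{"on_drug_x": True, "postop_hypotension": True}] * 6
    + [{"on_drug_x": True, "postop_hypotension": False}] * 14
    + [{"on_drug_x": False, "postop_hypotension": True}] * 3
    + [{"on_drug_x": False, "postop_hypotension": False}] * 27
)
rate_x, rate_other, relative_risk = hypotension_rates(records)
```

A higher rate in the drug X group would be only a signal; as the text notes, it might justify a controlled trial rather than settle the question.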

Note the distinction between retrospective 
chart review to investigate a question that was 
not a subject of study at the time the data were 
collected and prospective studies in which the 
clinical hypothesis is known in advance and the 
research protocol is designed specifically to col- 
lect future data that are relevant to the question 
under consideration (see also Chaps. 15 and 
29). Subjects are assigned randomly to different 
study groups to help prevent researchers—who 
are bound to be biased, having developed the 
hypothesis—from unintentionally skewing the 
results by assigning a specific class of patients 
all to one group. For the same reason, to the 
extent possible, the studies are double blind; i.e., 
neither the researchers nor the subjects know 
which treatment is being administered. Such 
blinding is of course impractical when it is 
obvious to patients or physicians what therapy 
is being given (such as surgical procedures ver- 
sus drug therapy). Prospective, randomized, 
double-blind studies are considered the best 
method for determining optimal management 
of disease, but it is often impractical to carry 
out such studies, and then methods such as ret- 
rospective chart review may be used. 

Returning to our example, consider the 
problems in paper chart review that the research- 
ers used to encounter in addressing the postop- 
erative hypotension question retrospectively. 
First, they would have to identify the charts of 
interest: the subset of medical records dealing 
with surgical patients who are also diabetic. In a 
hospital record room filled with thousands of 
charts, the task of chart selection was often 
overwhelming. Medical records departments 
generally did keep indexes of diagnostic and 
procedure codes cross-referenced to specific 
patients (see Sect. 2.5.1). Thus, it sometimes 
was possible to use such an index to find all 
charts in which the discharge diagnoses includ- 
ed diabetes and the procedure codes included 
major surgical procedures. The researcher might 
then have compiled a list of patient identification 
numbers and had the individual charts 
pulled from the file room for review. 
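
The index lookup described above amounts to intersecting the sets of patients listed under each code. A minimal sketch follows, assuming a hypothetical index keyed by descriptive labels rather than real diagnostic or procedure codes; the `code_index` contents and the `charts_to_pull` helper are illustrative only.

```python
# Hypothetical index: code label -> set of patient identification numbers.
code_index = {
    "DX:diabetes": {"1001", "1002", "1005", "1009"},
    "PX:major_surgery": {"1002", "1003", "1009", "1010"},
    "DX:asthma": {"1004", "1005"},
}

def charts_to_pull(index, *codes):
    """Return sorted patient IDs whose records carry every one of the given codes."""
    if not codes:
        return []
    # A code missing from the index contributes an empty set, so nothing matches.
    matching = set.intersection(*(index.get(code, set()) for code in codes))
    return sorted(matching)

pull_list = charts_to_pull(code_index, "DX:diabetes", "PX:major_surgery")
```

The resulting list identifies only which charts to request; every one of them still had to be read by hand.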

The researchers’ next task was to examine 
each paper chart serially to find out what treat- 
ment each patient was receiving for diabetes at 
the time of the surgery and to determine 
whether the patient had postoperative hypotension. 
Finding such information tended to be 
extremely time-consuming. Where should the 
researcher look for it? The admission drug 
orders might have shown what the patient 
received for diabetes control, but it would also 
have been wise to check the medication sheets 
to see whether the therapy was also adminis- 
tered (as well as ordered) and the admission 
history to see whether a routine treatment for 
diabetes, taken right up until the patient entered 
the hospital, was not administered during the 
inpatient stay. Information about hypotensive 
episodes might be similarly difficult to locate. 
The researchers might start with nursing notes 
from the recovery room or with the anesthesi- 
ologist’s datasheets from the operating room, 
but the patient might not have been hypoten- 
sive until after leaving the recovery room and 
returning to the ward. So the nursing notes 
from the ward would need to be checked too, as 
well as vital signs sheets, physicians’ progress 
notes, and the discharge summary. 

It should be clear from this example that 
retrospective paper chart review was a labori- 
ous and tedious process and that people per- 
forming it were prone to make transcription 
errors and to overlook key data. EHRs offer 
an enormous opportunity (Chap. 16) to 
facilitate the chart review and clinical research 
process. They have obviated the need to 
retrieve hard copy charts; instead, researchers 
are increasingly using computer-based data 
retrieval and analysis techniques to do most 
of the work (finding relevant patients, locat- 
ing pertinent data, and formatting the infor- 
mation for statistical analyses). Researchers 
can use similar techniques to harness computer 
assistance with data management in prospective 
clinical trials (Chap. 29). 


2.3.4 The Passive Nature of Paper 
Records 


The traditional manual system has another 
limitation that would have been meaningless 
until the emergence of the computer age. A 
manual archival system is inherently passive; 
the charts sit waiting for something to be done 
with them. They are insensitive to the characteristics 
of the data recorded within their pages, 
such as legibility, accuracy, or implications for 
patient management. They cannot take an 
active role in responding appropriately to those 
implications. 

EHR systems have changed our perspec- 
tive on what health professionals can expect 
from the medical chart. Automated record sys- 
tems introduce new opportunities for dynamic 
responses to the data that are recorded in them. 
As described in many of the chapters to follow, 
computational techniques for data storage, 
retrieval, and analysis make it feasible to devel- 
op record systems that (1) monitor their con- 
tents and generate warnings or advice for 
providers based on single observations or on 
logical combinations of data; (2) provide auto- 
mated quality control, including the flagging of 
potentially erroneous data; or (3) provide feed- 
back on patient-specific or population-based 
deviations from desirable standards. 
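
As a rough sketch of what such an active record might do when an entry is saved, the fragment below checks one set of observations for an implausible value, a single-observation warning, and a warning based on a combination of data. The field names, the `review_entry` helper, and every numeric threshold are assumptions chosen for illustration; they are not clinical guidance and are not drawn from any particular EHR product.

```python
def review_entry(entry):
    """Return the messages an active record system might raise for one entry."""
    messages = []
    temp = entry.get("temperature_c")
    systolic = entry.get("systolic_mmhg")
    heart_rate = entry.get("heart_rate_bpm")

    # Quality control: a value outside a plausible range is probably an entry error.
    if temp is not None and not 25.0 <= temp <= 45.0:
        messages.append("check: implausible temperature")
        temp = None                     # do not reason further with a suspect value

    # Warning based on a single observation (illustrative threshold).
    if temp is not None and temp >= 38.5:
        messages.append("warning: fever")

    # Warning based on a logical combination of data (illustrative thresholds).
    if systolic is not None and heart_rate is not None:
        if systolic < 90 and heart_rate > 110:
            messages.append("warning: low blood pressure with rapid heart rate")
    return messages

fever_messages = review_entry({"temperature_c": 39.2})
typo_messages = review_entry({"temperature_c": 370.0})
```

A paper chart would accept all of these values silently; the point of the sketch is only that the recorded data themselves can trigger a response.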


2.4 New Kinds of Data 
and the Resulting Challenges 


The revolution in human genetics that emerged 
with the Human Genome Project in the 1990s 
already has had a profound effect on the diag- 
nosis, prognosis, and treatment of disease 
(Vamathevan and Birney 2017). The vast 
amounts of data that are generated in biomedical 
research (see Chaps. 11 and 28), and 
that can be pooled from patient datasets to 
support clinical research (Chap. 29) and 
public health (Chap. 20), have created new 
opportunities as well as challenges. Researchers 
are finding that the amount of data they 
must manage and assess has become so large 
that they often lack either the 
capabilities or the expertise to handle the analytics 
that are required. This problem, sometimes 
dubbed the “big data” problem, has attracted 
the attention of government agencies as well.² 
Some suggest that the genetic material itself 
will become our next-generation method for 
storing large amounts of data (Erlich and 
Zielinski 2017). Data analytics, and the management 
of large amounts of genomic/proteomic 
or clinical/public-health data, have 
accordingly become major research topics and 
key opportunities for new methodology development 
by biomedical informatics and data 
scientists (Adler-Milstein and Jha 2013; 
Brennan et al. 2018; Bycroft et al. 2018). 


2 Big Data Senior Steering Group. The Federal Big 
Data Research and Development Strategic Plan. 
Available at: https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/NSTC/bigdatardstrategicplan-nitrd_final-051916.pdf 
(Accessed 6/28/2019). 

The issues that arise are practical as well as 
scientifically interesting. For example, develop- 
ers of EHRs have begun to grapple with ques- 
tions regarding how they might store an 
individual’s personal genome within the elec- 
tronic health record. New standards will be 
required, and tactical questions need answer- 
ing regarding, for example, whether to store an 
entire genome or only those components (e.g., 
genetic markers) that are already reasonably 
well understood (Masys et al. 2012; Haendel 
et al. 2018). In cancer, for example, where 
mutations in cell lines can occur, an individual 
may actually have many genomes represented 
among his or her cells. These issues will 
undoubtedly influence the evolution of data 
systems and EHRs, as well as the growth of 
precision medicine (see Chap. 30), in the 
years ahead (Relling and Evans 2015). 


2.5 The Structure of Clinical Data 


Scientific disciplines generally develop a precise 
terminology or notation that is standardized 
and accepted by all workers in the field. Con- 
sider, for example, the universal language of 
chemistry embodied in chemical formulae, the 
precise definitions and mathematical equations 
used by physicists, the predicate calculus used 
by logicians, or the conventions for describing 
circuits used by electrical engineers. Medicine is 
remarkable for its failure to develop a widely 
accepted standardized vocabulary and nomen- 
clature, and many observers believe that a true 
“scientific” basis for the field will be impossible 
until this problem is addressed (see Chap. 8). 
Other people argue that common references to 
the “art” of medicine reflect an important distinction 
between medicine and the “hard” sciences; 
these people question whether it is 
possible to introduce too much standardization 
into a field that prides itself on humanism. 

The debate has been accentuated by the 
introduction of computers for data manage- 
ment, because such machines tend to demand 
conformity to data standards and definitions. 
Otherwise, issues of data retrieval and analysis 
are confounded by discrepancies between the 
meanings intended by the observers or record- 
ers and those intended by the individuals 
retrieving information or doing data analysis. 
What is an “upper respiratory infection”? Does 
it include infections of the trachea or of the 
main stem bronchi? How large does the heart 
have to be before we can refer to “cardiomega- 
ly”? How should we deal with the plethora of 
disease names based on eponyms (e.g., Alzheim- 
er’s disease, Hodgkin’s disease) that are not 
descriptive of the illness and may not be famil- 
iar to all practitioners? What do we mean by an 
“acute abdomen”? Are the boundaries of the 
abdomen well agreed on? What are the time 
constraints that correspond to “acuteness” of 
abdominal pain? Is an “ache” a pain? What 
about “occasional” cramping? 

Imprecision and the lack of a standardized 
vocabulary are particularly problematic when 
we wish to aggregate data recorded by multiple 
health professionals or to analyze trends over 
time. Without a controlled, predefined vocabu- 
lary, data interpretation is inherently compli- 
cated, and the automatic summarization of 
data may be impossible. For example, one phy- 
sician might note that a patient has “shortness 
of breath.” Later, another physician might note 
that she has “dyspnea.” Unless these terms are 
designated as synonyms, an automated pro- 
gram will fail to indicate that the patient had 
the same problem on both occasions. 
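
The synonym problem can be made concrete with a small sketch. The lookup table below is a hypothetical stand-in for a controlled vocabulary; only the first two terms come from the example above, and the `normalize` and `same_problem` helpers are illustrative names.

```python
# Hypothetical synonym table: each recorded term maps to one preferred concept label.
SYNONYMS = {
    "shortness of breath": "dyspnea",
    "dyspnea": "dyspnea",
    "breathlessness": "dyspnea",   # an additional synonym assumed for illustration
}

def normalize(term):
    """Map a recorded term to its preferred concept; unknown terms pass through."""
    key = " ".join(term.lower().split())   # ignore case and extra whitespace
    return SYNONYMS.get(key, key)

def same_problem(note_a, note_b):
    """True when two recorded terms resolve to the same concept."""
    return normalize(note_a) == normalize(note_b)

matched = same_problem("Shortness of breath", "dyspnea")
```

Without the table, a plain string comparison of the two physicians' notes would report two unrelated findings.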

Regardless of arguments regarding the 
“artistic” elements in medicine, the need for 
health personnel to communicate effectively is 
clear both in acute care settings and when 
patients are seen over long periods. Both high- 
quality care and scientific progress depend on 
some standardization in terminology. Other- 
wise, differences in intended meaning or in 
defining criteria will lead to miscommunication, 
improper interpretation, and potentially negative 
consequences for the patients involved. 

Given the lack of formal definitions for 
many medical terms, it is remarkable that med- 
ical workers communicate as well as they do. 
Only occasionally is the care for a patient clear- 
ly compromised by miscommunication. If 
EHRs are to become dynamic and responsive 
manipulators of patient data, however, their 
encoded logic must be able to presume a spe- 
cific meaning for the terms and data elements 
entered by the observers. This point is discussed 
in greater detail in Chap. 8, which deals in 
part with the multiple efforts to develop health- 
care computing standards, including a shared, 
controlled terminology for biomedicine. 


2.5.1 Coding Systems 


We are used to seeing figures regarding the 
growing incidences of certain types of tumors, 
deaths from influenza during the winter 
months, and similar health statistics that we 
tend to take for granted. How are such data 
accumulated? Their role in health planning and 
health care financing is clear, but electronic 
health records provide the infrastructure for 
aggregating individual patient data to learn 
more about the health status of the populations 
in various communities (see Chap. 20). 
Because of the needs to know about health 
trends for populations and to recognize epidem- 
ics in their early stages, there are various health- 
reporting requirements for hospitals (as well as 
other public organizations) and practitioners. 
For example, cases of gonorrhea, syphilis, and 
tuberculosis generally must be reported to local 
public-health organizations, which code the 
data to allow trend analyses over time. The Cen- 
ters for Disease Control and Prevention in 
Atlanta (CDC) then pool regional data and 
report national as well as local trends in disease 
incidence, bacterial-resistance patterns, etc. 
Another kind of reporting involves the cod- 
ing of all discharge diagnoses for hospitalized 
patients, plus coding of certain procedures (e.g., 
type of surgery) that were performed during the 
hospital stay. Such codes are reported to state 
and federal health-planning and analysis agen- 
cies and also are used internally at the institu- 
tion for case-mix analysis (determining the 
relative frequencies of various disorders in the 
hospitalized population and the average length 
of stay for each disease category), for quality 
improvement, and for research. For such data 
to be useful, the codes must be well defined as 
well as uniformly applied and accepted. 
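
Case-mix analysis as defined above can be sketched in a few lines once discharges are coded. The disease categories, lengths of stay, and the `case_mix` helper below are made up for illustration; a real analysis would group by the coded discharge diagnoses reported by the institution.

```python
from collections import defaultdict

def case_mix(discharges):
    """Summarize discharges as {category: (relative frequency, mean length of stay)}."""
    stays = defaultdict(list)
    for category, length_of_stay_days in discharges:
        stays[category].append(length_of_stay_days)
    total = sum(len(v) for v in stays.values())
    return {
        category: (len(v) / total, sum(v) / len(v))
        for category, v in stays.items()
    }

# Made-up discharge records: (disease category, length of stay in days)
discharges = [("asthma", 2), ("asthma", 4), ("pneumonia", 5), ("diabetes", 3)]
summary = case_mix(discharges)
```

The grouping step is only meaningful if every coder assigns the same category to the same condition, which is why the codes must be well defined and uniformly applied.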

The World Health Organization publishes 
a diagnostic coding scheme called the International 
Classification of Diseases (ICD). The 
10th revision of this standard (ICD-10) is currently 
in use in much of the world (see Chap. 8). Its clinical 
modification, ICD-10-CM,³ is used by all nonmilitary 
hospitals in the United States for discharge coding, and must be 
reported on the bills submitted to most insurance 
companies (Fig. 2.11). Pathologists 
have developed another widely used diagnostic 
coding scheme; originally known as System- 
atized Nomenclature of Pathology (SNOP), it 
was expanded to the Systematized Nomencla- 
ture of Medicine (SNOMED) and then merged 
with the Read Clinical Terms from Great Brit- 
ain to become SNOMED-CT (Stearns et al. 
2001; Lee et al. 2014). In recent years, support 
for SNOMED-CT was assumed by the International 
Health Terminology Standards Development 
Organization, based in Copenhagen, now 
renamed SNOMED International and relocated 
to London.⁴ Another coding scheme, developed 
by the American Medical Association, is 
the Current Procedural Terminology (CPT) 
(Hirsch et al. 2015). It is similarly widely used in 
producing bills for services rendered to patients. 
More details on such schemes are provided in 
Chap. 8. What warrants emphasis here, however, 
is the motivation for the codes’ development: 
health care personnel need standardized 
terms that can support pooling of data for anal- 
ysis and can provide criteria for determining 
charges for individual patients. 

The historical roots of a coding system 
reveal themselves as limitations or idiosyncra- 
sies when the system is applied in more general 
clinical settings. For example, ICD-10-CM was 
derived from a classification scheme developed 
for epidemiologic reporting. Consequently, it 
has over 60 separate codes for describing tuber- 


3 > http://www.icd10data.com/ (Accessed 
11/1/2019). 
4 > http://snomed.org/ (Accessed 5/6/2019). 
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J45 Asthma 
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Includes: allergic (predominantly) asthma, allergic bronchitis NOS, allergic rhinitis with asthma, atopic asthma, 
extrinsic allergic asthma, hay fever with asthma, idiosyncratic asthma, intrinsic nonallergic asthma, nonallergic 


asthma 


Use additional code to identify: exposure to environmental tobacco smoke (Z77.22), exposure to tobacco smoke 
in the perinatal period (P96.81), history of tobacco use (Z87.891), occupational exposure to environmental 
tobacco smoke (Z57.31), tobacco dependence (F17.-), tobacco use (Z72.0) 


Excludes: detergent asthma (J69.8), eosinophilic asthma (J82), lung diseases due to external agents (J60-J70), 
miner's asthma (J60), wheezing NOS (R06.2), wood asthma (J67.8), asthma with chronic obstructive pulmonary 
disease (J44.9), chronic asthmatic (obstructive) bronchitis (J44.9), chronic obstructive asthma (J44.9) 


J45.2 Mild intermittent asthma 
J45.20 Mild intermittent asthma, uncomplicated 
Mild intermittent asthma NOS 


J45.21 Mild intermittent asthma with (acute) exacerbation 
J45.22 Mild intermittent asthma with status asthmaticus 


J45.3 Mild persistent asthma 
J45.30 Mild persistent asthma, uncomplicated 
Mild persistent asthma NOS 


J45.31 Mild persistent asthma with (acute) exacerbation 
J45.32 Mild persistent asthma with status asthmaticus 
J45.4 Moderate persistent asthma 
J45.40 Moderate persistent asthma, uncomplicated 
Moderate persistent asthma NOS 
J45.41 Moderate persistent asthma with (acute) exacerbation 


J45.42 Moderate persistent asthma with status asthmaticus 


J45.5 Severe persistent asthma 
J45.50 Severe persistent asthma, uncomplicated 
Severe persistent asthma NOS 


J45.51 Severe persistent asthma with (acute) exacerbation 
J45.52 Severe persistent asthma with status asthmaticus 


J45.9 Other and unspecified asthma 
J45.90 Unspecified asthma 
Asthmatic bronchitis NOS 
Childhood asthma NOS 
Late onset asthma 


J45.901 Unspecified asthma with (acute) exacerbation 


J45.902 Unspecified asthma with status asthmaticus 


J45.909 Unspecified asthma, uncomplicated 
Asthma NOS 
J45.99 Other asthma 
J45.990 Exercise induced bronchospasm 
J45.991 Cough variant asthma 
J45.998 Other asthma 


Fig. 2.11 The subset of disease categories for asthma
taken from ICD-10-CM. (Source: Centers for Medicare
and Medicaid Services, US Department of Health and
Human Services, https://www.cms.gov/Medicare/Coding/ICD10/2018-ICD-10-CM-and-GEMs.html,
accessed June 28, 2019)


culosis infections. SNOMED versions have
long permitted coding of pathologic findings in
exquisite detail but only in later years began to
introduce codes for expressing the dimensions
of a patient’s functional status. In a particular
clinical setting, none of the common coding
schemes is likely to be completely satisfactory.
In some cases, the granularity of the code will
be too coarse; on the one hand, a hematologist
(person who studies blood diseases) may want 
to distinguish among a variety of hemoglobin- 
opathies (disorders of the structure and func- 
tion of hemoglobin) lumped under a single 
code in ICD-10-CM. On the other hand, 
another practitioner may prefer to aggregate 
many individual codes—e.g., those for active 
tuberculosis—into a single category to simplify 
the coding and retrieval of data. 
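
The aggregation described here can be sketched in a few lines of Python. This is a minimal illustration, assuming that a coarser category is identified simply by a shared code prefix. The codes are those listed in Fig. 2.11; the function name `rollup` and the sample list of encounter codes are hypothetical.

```python
from collections import Counter

# Asthma codes as listed in Fig. 2.11 (ICD-10-CM); this encounter list is invented.
encounter_codes = ["J45.20", "J45.21", "J45.31", "J45.901", "J45.909", "J45.21"]


def rollup(code: str, level: int) -> str:
    """Truncate an ICD-10-CM code to a coarser category.

    Assumption: level counts the characters kept, ignoring the dot,
    so level=3 gives 'J45' and level=4 gives 'J45.2'.
    """
    compact = code.replace(".", "")[:level]
    # Re-insert the dot after the third character when anything follows it.
    return compact if len(compact) <= 3 else compact[:3] + "." + compact[3:]


# Fine-grained counts keep every distinction that was recorded.
fine = Counter(encounter_codes)
# Coarser counts aggregate by subcategory (e.g., J45.2 = mild intermittent asthma).
by_subcategory = Counter(rollup(c, 4) for c in encounter_codes)
# Coarsest view: every asthma encounter falls under the single category J45.
by_category = Counter(rollup(c, 3) for c in encounter_codes)
```

Aggregation of this kind can always be done after the fact. The reverse is not true: if distinct conditions were recorded under one code, no later processing can recover the distinctions the hematologist wanted.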

Such schemes cannot be effective unless 
health care providers accept them. There is an 
inherent tension between the need for a coding 
system that is general enough to cover many dif- 
ferent patients and the need for precise and 
unique terms that accurately apply to a spe- 
cific patient and do not unduly constrain physi- 
cians’ attempts to describe what they observe. 
Yet if physicians view the EHR as a blank sheet 
of paper on which any unstructured informa- 
tion can be written, the data they record will 
be unsuitable for dynamic processing, clinical 
research, and health planning. The challenge is 
to learn how to meet all these needs. Researchers 
at many institutions worked for over two decades 
to develop the Unified Medical Language System 
(UMLS), a common structure that ties together 
the various vocabularies that have been created. 
At the same time, the developers of specific ter- 
minologies are continually working to refine 
and expand their independent coding schemes 
(Humphreys et al. 1998) (see Chap. 8). 


2.5.2 The Data-to-Knowledge 
Spectrum 


A central focus in biomedical informatics is the 
information base that constitutes the “substance 
of medicine.” Workers in the field have tried to 
clarify the distinctions among three terms fre- 
quently used to describe the content of com- 
puter-based systems: data, information, and 
knowledge (Blum 1986; Bernstam et al. 2010). 
These terms are often used interchangeably. In 
this volume, we shall refer to a datum as a single 
observational point that characterizes a rela- 
tionship. It generally can be regarded as the 
value of a specific parameter for a particular 
object (e.g., a patient) at a given point in time. 
The term information refers to analyzed data 
that have been suitably curated and organized so 
that they have meaning. Data do not constitute 
information until they have been organized in 
some way, e.g., for analysis or display. Knowledge, 
then, is derived through the formal or informal 
analysis (or interpretation) of information that 
was in turn derived from data. Thus, knowledge 
includes the results of formal studies and also 
common sense facts, assumptions, heuristics 
(strategic rules of thumb), and models—any of 


which may reflect the experience or biases of 
people who interpret the primary data and the 
resulting information. 

The observation that patient Brown has a 
blood pressure of 180/110 is a datum, as is the 
report that the patient has had a myocardial 
infarction (heart attack). When researchers 
pool such data, creating information, subse- 
quent analysis may determine that patients 
with high blood pressure are more likely to 
have heart attacks than are patients with nor- 
mal or low blood pressure. This analysis of 
organized data (information) has produced a 
piece of knowledge about the world. A physi- 
cian’s belief that prescribing dietary restriction 
of salt is unlikely to be effective in controlling 
high blood pressure in patients of low econom- 
ic standing (because the latter are less likely to 
be able to afford special low-salt foods) is an 
additional personal piece of knowledge—a heu- 
ristic that guides physicians in their decision 
making. Note that the appropriate interpreta- 
tion of these definitions depends on the con- 
text. Knowledge at one level of abstraction 
may be considered data at higher levels. A 
blood pressure of 180/110 mmHg is a raw piece 
of data; the statement that the patient has 
hypertension is an interpretation of several 
such data and thus represents a higher level of 
information. As input to a diagnostic decision 
aid, however, the presence or absence of hyper- 
tension may be requested, in which case the 
presence of hypertension is treated as a data 
item. 
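
The shift from datum to interpretation and back to datum can be made concrete in code. The following Python sketch is illustrative only: the class `Datum`, the function `has_hypertension`, the 140/90 mmHg limits, and the rule of at least two elevated readings are assumptions chosen for the example, not clinical criteria taken from the text.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Datum:
    """One observation: the value of a parameter for a patient at a point in time."""
    patient: str
    parameter: str
    value: object  # e.g., (systolic, diastolic) in mmHg, or True/False
    time: datetime


def has_hypertension(readings, systolic_limit=140, diastolic_limit=90, min_elevated=2):
    """Abstract several raw readings into one higher-level finding.

    Assumption: the limits and the 'at least two elevated readings' rule
    are placeholders chosen for illustration only.
    """
    elevated = [
        r for r in readings
        if r.parameter == "blood_pressure"
        and (r.value[0] >= systolic_limit or r.value[1] >= diastolic_limit)
    ]
    return len(elevated) >= min_elevated


readings = [
    Datum("Brown", "blood_pressure", (180, 110), datetime(2019, 6, 1, 9, 0)),
    Datum("Brown", "blood_pressure", (172, 104), datetime(2019, 6, 8, 9, 0)),
]

# The interpretation is stored as a datum in its own right, ready to be
# passed to a diagnostic decision aid that asks about hypertension.
hypertension = Datum("Brown", "hypertension_present",
                     has_hypertension(readings), datetime(2019, 6, 8, 9, 5))
```

The same `Datum` structure holds both the raw readings and the derived finding, which mirrors the point that what counts as data depends on the level of abstraction at which it is used.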

A database is a collection of individual 
observations without any summarizing analy- 
sis. An EHR system is thus primarily viewed 
as a database—the place where patient data 
are stored. When properly collated and pooled 
with other data, these elements in the EHR 
provide information about the patient. A 
knowledge base, on the other hand, is a collec- 
tion of facts, heuristics, and models that can 
be used for problem solving and analysis of 
organized data (information). If the knowl- 
edge base provides sufficient structure, includ- 
ing semantic links among knowledge items, 
the computer itself may be able to apply that 
knowledge as an aid to case-based problem 
solving. Many decision-support systems have 
been called knowledge-based systems, reflecting 
this distinction between knowledge bases 
and databases (see Chap. 26). 


2.6 Strategies of Clinical Data 
Selection and Use 


It is illusory to conceive of a “complete clinical 
data set.” All medical databases, and medical 
records, are necessarily incomplete because 
they reflect the selective collection and record- 
ing of data by the health care personnel respon- 
sible for the patient. There can be marked 
interpersonal differences in both style and 
problem solving that account for variations in 
the way practitioners collect and record data 
for the same patient under the same circum- 
stances. Such variations do not necessarily 
reflect good practices, however, and much of 
medical education is directed at helping physi- 
cians and other health professionals to learn 
what observations to make, how to make them 
(generally an issue of technique), how to inter- 
pret them, and how to decide whether they 
warrant formal recording. 

An example of this phenomenon is the dif- 
ference between the first medical history, 
physical examination, and summarizing report 
developed by a medical student and the similar 
process undertaken by a seasoned clinician 
examining the same patient. Medical students 
tend to work from comprehensive mental out- 
lines of questions to ask, physical tests to per- 
form, and additional data to collect. Because 
they have not developed skills of selectivity, the 
process of taking a medical history and per- 
forming a physical examination may take more 
than 1 h, after which students develop extensive 
reports of what they observed and how they 
have interpreted their observations. It clearly 
would be impractical, inefficient, and inappro- 
priate for physicians in practice to spend this 
amount of time assessing every new patient. 
Thus, part of the challenge for the neophyte is to 
learn how to ask only the questions that are nec- 
essary, to perform only the examination compo- 
nents that are required, and to record only those 
data that will be pertinent in justifying the ongo- 
ing diagnostic approach and in guiding the 
future management of the patient. 

What do we mean by selectivity in data col- 
lection and recording? It is precisely this pro- 
cess that often is viewed as a central part of the 
“art” of medicine, an element that accounts for 
individual styles and the sometimes marked 
distinctions among clinicians. As is discussed 
with numerous clinical examples in Chaps. 3 
and 4, the idea of selectivity implies an ongoing 
decision-making process that guides data col- 
lection and interpretation. Attempts to under- 
stand how expert clinicians internalize this 
process, and to formalize the ideas so that they 
can better be taught and explained, are central 
in biomedical informatics research. Improved 
guidelines for such decision making, derived 
from research activities in biomedical infor- 
matics, not only are enhancing the teaching 
and practice of medicine (Shortliffe 2010) but 
also are providing insights that suggest meth- 
ods for developing computer-based decision- 
support tools. 


2.6.1 The Hypothetico-Deductive 
Approach 


Studies of clinical decision makers have shown 
that strategies for data collection and interpre- 
tation may be imbedded in an iterative process 
known as the hypothetico-deductive approach 
(Elstein et al. 1978; Kassirer and Gorry 1978). 
As medical students learn this process, their 
data collection becomes more focused and 
efficient, and their medical records become 
more compact. The central idea is one of 
sequential, staged data collection, followed by 
data interpretation and the generation of 
hypotheses, leading to hypothesis-directed 
selection of the next most appropriate data to 
be collected. As data are collected at each 
stage, they are added to the growing database 
of observations and are used to reformulate or 
refine the active hypotheses. This process is 
iterated until one hypothesis reaches a thresh- 
old level of certainty (e.g., it is proved to be 
true, or at least the uncertainty is reduced to a 
satisfactory level). At that point, a manage- 
ment, disposition, or therapeutic decision can 
be made. 
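
The iterative cycle just described can be sketched as a short loop. The Python example below is one possible formalization under stated assumptions, not the method of the cited studies: it represents certainty as a probability, revises it with a simple Bayesian update that treats answers as independent, and selects the next question by how well it separates the two leading hypotheses. The function name, the hypotheses, the questions, and every number are invented for illustration.

```python
def hypothetico_deductive(priors, likelihoods, answer_for, threshold=0.9):
    """Iterate: pick a question, observe the answer, revise the hypotheses.

    priors:      {hypothesis: initial probability} (at least two hypotheses)
    likelihoods: {question: {hypothesis: P(answer is yes | hypothesis)}}
    answer_for:  callable(question) -> bool, standing in for the patient
    """
    beliefs = dict(priors)
    remaining = list(likelihoods)
    asked = []
    while remaining and max(beliefs.values()) < threshold:
        # Hypothesis-directed selection: ask the question whose 'yes' rates
        # differ most between the two currently leading hypotheses.
        ranked = sorted(beliefs, key=beliefs.get, reverse=True)
        top, runner_up = ranked[0], ranked[1]
        question = max(
            remaining,
            key=lambda q: abs(likelihoods[q][top] - likelihoods[q][runner_up]),
        )
        remaining.remove(question)
        asked.append(question)
        yes = answer_for(question)
        # Reformulate the active hypotheses in light of the new datum.
        for h in beliefs:
            p_yes = likelihoods[question][h]
            beliefs[h] *= p_yes if yes else (1.0 - p_yes)
        total = sum(beliefs.values())
        beliefs = {h: p / total for h, p in beliefs.items()}
    return max(beliefs, key=beliefs.get), beliefs, asked


# All values below are invented for the sake of the example.
priors = {"pneumonia": 0.4, "asthma": 0.4, "heart_failure": 0.2}
likelihoods = {
    "fever": {"pneumonia": 0.8, "asthma": 0.1, "heart_failure": 0.1},
    "wheezing": {"pneumonia": 0.2, "asthma": 0.9, "heart_failure": 0.3},
    "leg_swelling": {"pneumonia": 0.05, "asthma": 0.05, "heart_failure": 0.8},
}
patient = {"fever": True, "wheezing": False, "leg_swelling": False}

diagnosis, beliefs, asked = hypothetico_deductive(priors, likelihoods, patient.get)
```

With these invented numbers the loop stops after two questions, because the leading hypothesis has already passed the threshold; the third question is never asked. That early stop is the selectivity the text describes.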

The diagram in Fig. 2.12 clarifies this 
process. As is shown, data collection begins 
when the patient presents to the physician 
with some issue (a symptom or disease, or 
perhaps the need for routine care). The physician 
generally responds with a few questions 
that allow one to focus rapidly on the nature 
of the problem. In the written report, the data 
collected with these initial questions typically 
are recorded as the patient identification, 
chief complaint, and initial portion of the history 
of the present illness. Studies have shown 
that an experienced physician will have an initial 
set of hypotheses (theories) in mind after 
hearing the patient’s response to the first six 
or seven questions (Elstein et al. 1978). These 
hypotheses then serve as the basis for selecting 
additional questions. As shown in Fig. 2.12, 
answers to these additional questions allow 
the physician to refine hypotheses about the 
source of the patient’s problem. Physicians 
refer to the set of active hypotheses as the differential 
diagnosis for a patient; the differential 
diagnosis comprises the set of possible 
diagnoses among which the physician must 
distinguish to determine how best to administer 
treatment. 

[Figure: flowchart of the hypothetico-deductive approach; legible box labels include "Select most likely diagnosis", "Treat patient accordingly", and "Radiologic studies"] 

Fig. 2.12 A schematic view of the hypothetico-deductive 
approach. The process of medical data collection 
and treatment is intimately tied to an ongoing 
process of hypothesis generation and refinement. See text 
for full discussion. ID patient identification, CC chief 
complaint, HPI history of present illness, PMH past 
medical history, FH family history, Social social history, 
ROS review of systems, PE physical examination 

Note that the question selection process is 
inherently heuristic; i.e., it is personalized and 
efficient, but it is not guaranteed to collect 
every piece of information that might be pertinent. 
Human beings use heuristics all the 
time in their decision making because it often 
is impractical or impossible to use an exhaus- 
tive problem-solving approach. A common 
example of heuristic problem solving is the 
playing of a complex game such as chess. 
Because it would require an enormous amount 
of time to define all the possible moves and 
countermoves that could ensue from a given 
board position, expert chess players develop 
personal heuristics for assessing the game at 
any point and then selecting a strategy for 
how best to proceed. Differences among such 
heuristics account in part for variations in 
observed expertise. 

Physicians have developed safety mea- 
sures, however, to help them to avoid missing 
important issues that they might not discov- 
er when collecting data in a hypothesis- 
directed fashion when taking the history of 
a patient’s present illness (Pauker et al. 
1976). These measures tend to be focused in 
four general categories of questions that fol- 
low the collection of information about the 
chief complaint: past medical history, family 
history, social history, and a brief review of 
systems in which the physician asks some 
general questions about the state of health 
of each of the major organ systems in the 
body. Occasionally, the physician discovers 
entirely new problems or finds important 
information that modifies the hypothesis list 
or modulates the treatment options available 
(e.g., if the patient reports a serious past 
drug reaction or allergy). 

When physicians have finished asking ques- 
tions, the refined hypothesis list (which may 
already be narrowed to a single diagnosis) then 
serves as the basis for a focused physical exam- 
ination. By this time, physicians may well have 
expectations of what they will find on examina- 
tion or may have specific tests in mind that will 
help them to distinguish among still active 
hypotheses about diseases based on the ques- 
tions that they have asked. Once again, as in 
the question-asking process, focused hypothe- 
sis-directed examination is augmented with 
general tests that occasionally turn up new 
abnormalities and generate hypotheses that the 
physician did not expect on the basis of the 
medical history alone. In addition, unexplained 
findings on examination may raise issues that 
require additional history taking. Thus, the 
asking of questions generally is partially inte- 
grated with the examination process. 

When physicians have completed the physi- 
cal examination, their refined hypothesis list 
may be narrowed sufficiently for them to under- 
take specific treatment. Additional data gather- 
ing may still be necessary, however. Such testing 
is once again guided by the current hypotheses. 
The options available include laboratory tests 
(of blood, urine, other body fluids, or biopsy 
specimens), radiologic studies (X-ray examina- 
tions, nuclear-imaging scans, computed tomog- 
raphy (CT) studies, magnetic resonance scans, 
sonograms, or any of a number of other imag- 
ing modalities), and other specialized tests 
(electrocardiograms (ECGs), electroencephalo- 
grams, nerve conduction studies, and many oth- 
ers), as well as returning to the patient to ask 
further questions or perform additional physi- 
cal examination. As the results of such studies 
become available, physicians constantly revise 
and refine their hypothesis list. 

Ultimately, physicians are sufficiently certain 
about the source of a patient’s problem to be 
able to develop a specific management plan. 
Treatments are administered, and the patient 
is observed. Note that data collected to measure 
response to treatment may themselves be used to 
synthesize information that affects the hypothe- 
ses about a patient’s illness. If patients do not 
respond to treatment, it may mean that their dis- 
ease is resistant to that therapy and that their 
physicians should try an alternate approach, or 
it may mean that the initial diagnosis was incor- 
rect and that physicians should consider alter- 
nate explanations for the patient’s problem. 

The patient may remain in a cycle of treat- 
ment and observation for a long time, as 
shown in Fig. 2.12. This long cycle reflects 
the nature of chronic-disease management— 
an aspect of medical care that is accounting 
for an increasing proportion of the health 
care community’s work (and an increasing 
proportion of health care cost). Alternatively, 
the patient may recover and no longer need 
therapy, or he or she may die. Although the 
process outlined in Fig. 2.12 is oversimplified 
in many regards, it is generally applicable 
to the process of data collection, diagnosis, 
and treatment in most areas of medicine. 

Note that the hypothesis-directed process 
of data collection, diagnosis, and treatment is 
inherently knowledge-based. It is dependent 
not only on a significant fact base that permits 
proper interpretation of data and selection of 
appropriate follow-up questions and tests but 
also on the effective use of heuristic tech- 
niques that characterize individual expertise. 

Another important issue, addressed in 
Chap. 3, is the need for physicians to balance 
financial costs and health risks of data collec- 
tion against the perceived benefits to be gained 
when those data become available. It costs 
nothing but time to examine the patient at the 
bedside or to ask an additional question, but if 
the data being considered require, for example, 
X-ray exposure, coronary angiography, or a 
CT scan of the head (all of which have associ- 
ated risks and costs), then it may be preferable 
to proceed with treatment in the absence of full 
information. Differences in the assessment of 
cost-benefit trade-offs in data collection, and 
variations among individuals in their willing- 
ness to make decisions under uncertainty, often 
account for differences of opinion among col- 
laborating physicians. 


2.6.2 The Relationship Between 
Data and Hypotheses 


We wrote rather glibly in Sect. 2.6.1 about 
the “generation of hypotheses from data”; now 
we need to ask: What precisely is the nature of 
that process? As is discussed in Chap. 4, 
researchers with a psychological orientation 
have spent much time trying to understand 
how expert problem solvers evoke hypotheses 
(Elstein et al. 1978; Arocha et al. 2005), and the 
traditional probabilistic decision sciences have 
much to say about that process as well. We pro- 
vide only a brief introduction to these ideas 
here; they are discussed in greater detail in 
Chaps. 3 and 4. 

When an observation evokes a hypothesis 
(e.g., when a clinical finding makes a specific 
diagnosis come to mind), the observation pre- 
sumably has some close association with the 
hypothesis. What might be the characteristics of 
that association? Perhaps the finding is almost 
always observed when the hypothesis turns out 
to be true. Is that enough to explain hypothesis 
generation? A simple example will show that 
such a simple relationship is not enough to 
explain the evocation process. Consider the 
hypothesis that a patient is pregnant and the 
observation that the patient is biologically 
female. Clearly, all pregnant patients are female. 
When a new patient is observed to be female, 
however, the possibility that the patient is preg- 
nant is not immediately evoked. Thus, female 
gender is a highly sensitive indicator of pregnan- 
cy (there is a 100% certainty that a pregnant 
patient is female), but it is not a good predictor 
of pregnancy (most females are not pregnant). 
The idea of sensitivity—the likelihood that a 
given datum will be observed in a patient with a 
given disease or condition—is an important one, 
but it will not alone account for the process of 
hypothesis generation in medical diagnosis. 

Perhaps the clinical manifestation seldom 
occurs unless the hypothesis turns out to be true; 
is that enough to explain hypothesis generation? 
This idea seems to be a little closer to the mark. 
Suppose a given datum is never seen unless a 
patient has a specific disease. For example, a Pap 
smear (a smear of cells swabbed from the cer- 
vix, at the opening to the uterus, treated with 


Papanicolaou’s stain, and then examined under 
the microscope) with grossly abnormal cells 
(called class IV findings) is never seen unless the 
woman has cancer of the cervix or uterus. Such 
tests are called pathognomonic. Not only do they 
evoke a specific diagnosis but they also immedi- 
ately prove it to be true. Unfortunately, there are 
few pathognomonic tests in medicine and they 
are often of relatively low sensitivity (that is, 
although having a particular test result makes 
the diagnosis, few patients with the condition 
may actually have that finding). 

More commonly, a feature is seen in one dis- 
ease or disease category more frequently than it 
is in others, but the association is not absolute. 
For example, there are few disease entities other 
than infections that elevate a patient’s white 
blood cell count. Certainly it is true, for example, 
that leukemia can raise the white blood cell 
count, as can the use of certain medications, but 
most patients who do not have infections will 
have normal white blood cell counts. An elevat- 
ed white count therefore does not prove that a 
patient has an infection, but it does tend to 
evoke or support the hypothesis that an infec- 
tion is present. The word used to describe this 
relationship is specificity. An observation is 
highly specific for a disease if it is generally not 
seen in patients who do not have that disease. A 
pathognomonic observation is 100% specific for 
a given disease. When an observation is highly 
specific for a disease, it tends to evoke that dis- 
ease during the diagnostic or data-gathering 
process. 
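
Sensitivity and specificity as defined above can be computed directly from counts. The Python sketch below revisits the pregnancy example with invented counts for 1,000 patients; the function and variable names are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of patients WITH the condition in whom the finding is present."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of patients WITHOUT the condition in whom the finding is absent."""
    return true_neg / (true_neg + false_pos)


# Invented counts for 1,000 patients.
# Finding: "patient is female". Condition: "patient is pregnant".
pregnant_female, pregnant_male = 30, 0              # true positives, false negatives
not_pregnant_female, not_pregnant_male = 470, 500   # false positives, true negatives

sens = sensitivity(pregnant_female, pregnant_male)
spec = specificity(not_pregnant_male, not_pregnant_female)

# Share of patients with the finding who actually have the condition.
fraction_of_females_pregnant = pregnant_female / (pregnant_female + not_pregnant_female)
```

With these counts the finding is perfectly sensitive, only about half of the non-pregnant patients lack it, and just 6% of the patients who show it are pregnant. A finding can therefore be fully sensitive and still do little to evoke the hypothesis.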

By now, you may have realized that there is 
a substantial difference between a physician 
viewing test results that evoke a disease 
hypothesis and that physician being willing to 
act on the disease hypothesis. Yet even experi- 
enced physicians sometimes fail to recognize 
that, although they have made an observation 
that is highly specific for a given disease, it 
may still be more likely that the patient has 
other diseases (and does not have the suspect- 
ed one) unless (1) the finding is pathognomon- 
ic or (2) the suspected disease is considerably 
more common than are the other diseases that 
can cause the observed abnormality. This mis- 
take is one of the most common errors of 
intuition in the medical decision-making pro- 
cess. To explain the basis for this confusion in 
more detail, we must introduce two additional 
terms: prevalence and predictive value. 

The prevalence of a disease is simply the 
percentage of a population of interest that has 
the disease at any given time. A particular dis- 
ease may have a prevalence of only 5% in the 
general population (1 person in 20 will have the 
disease) but have a higher prevalence in a spe- 
cially selected subpopulation. For example, 
black-lung disease has a low prevalence in the 
general population but has a much higher prev- 
alence among coal miners, who develop black 
lung from inhaling coal dust. The task of diag- 
nosis therefore involves updating the probabil- 
ity that a patient has a disease from the baseline 
rate (the prevalence in the population from 
which the patient was selected) to a post-test 
probability that reflects the test results. For 
example, the probability that any given person 
in the United States has lung cancer is low (i.e., 
the prevalence of the disease is low), but the 
chance increases if his or her chest X-ray exam- 
ination shows a possible tumor. If the patient 
were a member of the population composed of 
cigarette smokers in the United States, howev- 
er, the prevalence of lung cancer would be 
higher. In this case, the identical chest X-ray 
report would result in an even higher updated 
probability of lung cancer than it would had 
the patient been selected from the population 
of all people in the United States. 

The predictive value (PV) of a test is simply 
the post-test (updated) probability that a dis- 
ease is present based on the results of a test. If 
an observation supports the presence of a dis- 
ease, the PV will be greater than the prevalence 
(also called the pretest risk). If the observation 
tends to argue against the presence of a disease, 
the PV will be lower than the prevalence. For 
any test and disease, then, there is one PV if the 
test result is positive and another PV if the test 
result is negative. These values are typically 
abbreviated PV+ (the PV of a positive test) and 
PV- (the PV of a negative test). 

The process of hypothesis generation in 
medical diagnosis thus involves both the evo- 
cation of hypotheses and the assignment of a 
likelihood (probability) to the presence of a 
specific disease or disease category. The PV of 
a positive test depends on the test’s sensitivity 
and specificity, as well as the prevalence of the 
disease. The formula that describes the rela- 
tionship precisely is: 


$$
\mathrm{PV}^{+} = \frac{(\text{sensitivity})(\text{prevalence})}{(\text{sensitivity})(\text{prevalence}) + (1 - \text{specificity})(1 - \text{prevalence})}
$$


There is a similar formula for defining PV- in 
terms of sensitivity, specificity, and prevalence. 
Both formulae can be derived from simple prob- 
ability theory. Note that positive tests with high 
sensitivity and specificity may still lead to a low 
post-test probability of the disease (PV+) if the 
prevalence of that disease is low. You should 
substitute values in the PV+ formula to convince 
yourself that this assertion is true. It is this rela- 
tionship that tends to be poorly understood by 
practitioners and that often is viewed as counter- 
intuitive (which shows that your intuition can 
misguide you!). Note also (by substitution into 
the formula) that test sensitivity and disease 
prevalence can be ignored only when a test is 
pathognomonic (i.e., when its specificity is 100%, 
which mandates that PV+ be 100%). The PV+ 
formula is one of many forms of Bayes’ theorem, 


a rule for combining probabilistic data that is 
generally attributed to the work of Reverend 
Thomas Bayes in the 1700s. Bayes’ theorem is 
discussed in greater detail in Chap. 3. 
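
The substitution exercise suggested above is easy to carry out in code. The Python sketch below implements the PV+ formula as stated, together with the analogous post-test probability of disease after a negative result, which follows the definition of PV used in this section (some texts instead report the complement of that quantity as the negative predictive value). The function names and the sample values of 95% sensitivity and specificity are assumptions made for illustration.

```python
def pv_positive(sensitivity, specificity, prevalence):
    """Post-test probability of disease given a positive result (PV+)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def disease_probability_after_negative(sensitivity, specificity, prevalence):
    """Post-test probability of disease given a negative result."""
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    return false_neg / (false_neg + true_neg)


# Same hypothetical test (95% sensitive, 95% specific) at three prevalences.
rare = pv_positive(0.95, 0.95, 0.001)      # about 0.02
uncommon = pv_positive(0.95, 0.95, 0.05)   # about 0.5
common = pv_positive(0.95, 0.95, 0.5)      # about 0.95

# A pathognomonic finding (specificity of 100%) gives PV+ of 1
# whatever the sensitivity and prevalence.
pathognomonic = pv_positive(0.30, 1.0, 0.001)
```

For the rare disease, a positive result from this fairly accurate test still leaves the probability of disease near 2%, because false positives among the many unaffected people far outnumber true positives among the few affected ones.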


2.6.3 Methods for Selecting 
Questions and Comparing 
Tests 


We have described the process of hypothesis- 
directed sequential data collection and have 
asked how an observation might evoke or 
refine the physician’s hypotheses about what 
abnormalities account for the patient’s illness. 
The complementary question is: Given a set of 
current hypotheses, how does the physician 
decide what additional data should be collected? 
This question also has been analyzed at 
length (Elstein et al. 1978; Arocha et al. 2005) 
and is pertinent for computer programs that 
gather data efficiently to assist clinicians with 
diagnosis or with therapeutic decision making 
(see Chap. 26). Because understanding issues 
of test selection and data interpretation is cru- 
cial to understanding medical data and their 
uses, we devote Chap. 3 to these and related 
issues of medical decision making. In Sect. 
3.6, for example, we discuss the use of decision- 
analytic techniques in deciding whether to treat 
a patient on the basis of available information 
or to perform additional diagnostic tests. 


2.7 The Computer and Collection 
of Medical Data 


Although this chapter has not directly discussed 
computer systems, the role of the computer in 
medical data storage, retrieval, and interpreta- 
tion should be clear. Much of the rest of this 
book deals with specific applications in which 
the computer’s primary role is data manage- 
ment. One question is pertinent to all such appli- 
cations: What are the best approaches for getting 
data into the computer in the first place? 

The need for data entry by physicians has 
posed a problem for medical-computing sys- 
tems since the earliest days of the field. Awkward 
or nonintuitive interactions at computing devic- 
es—particularly ones requiring keyboard typing 
or confusing movement through multiple dis- 
play screens by the physician—have perhaps 
done more to frustrate clinicians than have any 
other factor. 

A variety of approaches have been used to 
try to finesse this problem. One is to design sys- 
tems such that clerical staff can do essentially all 
the data entry and much of the data retrieval as 
well. Many clinical research systems (see 
Chap. 29) have taken this approach. Physicians 
may be asked to fill out structured paper data- 
sheets, or such sheets may be filled out by data 
abstractors who review patient charts, but the 
actual entry of data into the database is done by 
paid transcriptionists. Other physicians have 
adopted “scribes” (staff whose role is to follow 
physicians in examination rooms and to enter 


data into the electronic health record) to reduce 
the data entry burden on physicians while they 
interact with patients. 

In some applications, data are entered auto- 
matically into the computer by the device that 
measures or collects them. For example, moni- 
tors in intensive care or coronary care units, 
pulmonary function or ECG machines, and 
measurement equipment in the clinical chemis- 
try laboratory can interface directly with a 
computer in which a database is stored. Certain 
data can be entered directly by patients; there 
are systems, for example, that take the patient’s 
history by presenting on a computer screen or 
tablet multiple-choice questions that follow a 
branching logic. The patient’s responses to the 
questions are used to generate electronic or 
hard copy reports for physicians and also may 
be stored directly in a computer database for 
subsequent use in other settings. 
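
A branching questionnaire of the kind described can be represented as a small graph in which each answer names the next question. The Python sketch below is a minimal illustration; the questions, the identifiers, and the `take_history` function are hypothetical and do not describe any particular history-taking system.

```python
# Hypothetical question graph: each answer names the next question, or None to stop.
QUESTIONS = {
    "cough": {
        "text": "Have you had a cough in the past week?",
        "choices": {"yes": "sputum", "no": "smoking"},
    },
    "sputum": {
        "text": "Are you coughing up sputum?",
        "choices": {"yes": "smoking", "no": "smoking"},
    },
    "smoking": {
        "text": "Do you currently smoke cigarettes?",
        "choices": {"yes": None, "no": None},
    },
}


def take_history(questions, start, choose):
    """Walk the question graph, letting each answer select the next question.

    choose(text, options) stands in for the screen or tablet interaction
    and must return one of the offered options.
    """
    responses = []
    current = start
    while current is not None:
        node = questions[current]
        answer = choose(node["text"], list(node["choices"]))
        if answer not in node["choices"]:
            raise ValueError(f"unexpected answer {answer!r} to {current!r}")
        responses.append((current, answer))
        current = node["choices"][answer]  # the branching step
    return responses


# Scripted patient used in place of real input: no cough, current smoker.
scripted = {"cough": "no", "smoking": "yes"}
text_to_id = {node["text"]: qid for qid, node in QUESTIONS.items()}
report = take_history(QUESTIONS, "cough",
                      lambda text, options: scripted[text_to_id[text]])
```

Because the scripted patient reports no cough, the sputum question is skipped. The resulting list of question and answer pairs is the structured record that could be printed for the physician or stored in a database.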

When physicians or other health personnel 
do use the machine themselves, specialized devices 
often allow rapid and intuitive operator-machine 
interaction. Most of these devices use a 
variant of the “point-and-select” approach—e.g., 
touch-sensitive computer screens, mouse-point- 
ing devices, and increasingly the clinician’s finger 
on a mobile tablet or smart phone (see Chaps. 
5 and 6). When conventional computer worksta- 
tions are used, specialized keypads can be help- 
ful. Designers frequently permit logical selection 
of items from menus displayed on the screen so 
that the user does not need to learn a set of spe- 
cialized commands to enter or review data. There 
were clear improvements when handheld tablets 
using pen-based or finger-based mechanisms for 
data entry were introduced. With ubiquitous 
wireless data services, such devices are allowing 
clinicians to maintain normal mobility (in and 
out of examining rooms or inpatient rooms) 
while accessing and entering data that are perti- 
nent to a patient’s care. 

These issues arise in essentially all applica- 
tion areas, and, because they can be crucial to 
the successful implementation and use of a 
system, they warrant particular attention in 
system design. As more physicians are com- 
fortable with computers in daily life, they will 
likely find the use of computers in their prac- 
tice less of a hindrance. We encourage you to 
consider human-computer interaction, and 
the cognitive issues that arise in dealing with 
computer systems (see Chap. 4), as you 
learn about the application areas and the spe- 
cific systems described in later chapters. 

matics. This includes work applying findings 
from national genome sequencing initiatives 
that bridge research with clinical care. See also 
Chap. 28. 


Q Questions for Discussion 


1. You check your pulse and discover that 
your heart rate is 100 beats per minute. Is 
this rate normal or abnormal? What 
additional information would you use in 
making this judgment? How does the 
context in which data are collected influ- 
ence the interpretation of those data? 

2. Given the imprecision of many medical 
terms, why do you think that serious 
instances of miscommunication among 
health care professionals are not more 
common? Why is greater standardization 
of terminology necessary if computers 
rather than humans are to manipulate 
patient data? 

3. Based on the discussion of coding schemes
for representing clinical information,
discuss three challenges you foresee in
attempting to construct a standardized
terminology to be used in hospitals,
physicians’ offices, and research institutions.

4. How would medical practice change if
nonphysicians were to collect and enter
all medical data into EHRs? What
problems or unintended consequences would
you anticipate?

5. Consider what you know about the typical
daily schedule of a busy clinician. What
are the advantages of wireless devices,
connected to the Internet, as tools for such
clinicians? Can you think of disadvantages
as well? Be sure to consider the safety
and protection of information as well as
workflow and clinical needs.

6. To decide whether a patient has a
significant urinary tract infection, physicians
commonly use a calculation of the num- 
ber of bacterial organisms in a milliliter 
of the patient’s urine. Physicians gener- 
ally assume that a patient has a urinary 
tract infection if there are at least 10,000 
bacteria per milliliter. Although labora- 
tories can provide such quantification 
with reasonable accuracy, it is obviously 
unrealistic for the physician explicitly to 
count large numbers of bacteria by 
examining a milliliter of urine under the 
microscope. As a result, one article offers
the following guideline to physicians: 
“When interpreting ... microscopy of ... 
stained centrifuged urine, a threshold of 
one organism per field yields a 95% sen- 
sitivity and five organisms per field a 
95% specificity for bacteriuria [bacteria 
in the urine] at a level of at least 10,000 
organisms per ml.” (Senior Medical 
Review 1987, p. 4) 


(a) Describe an experiment that would 
have allowed the researchers to 
determine the sensitivity and speci- 
ficity of the microscopy. 

(b) How would you expect specificity to 
change as the number of bacteria 
per microscopic field increases from 
one to five? 

(c) How would you expect sensitivity to
change as the number of bacteria
per microscopic field increases from
one to five?

(d) Why does it take more organisms 
per microscopic field to obtain a 
specificity of 95% than it does to 
achieve a sensitivity of 95%? 

Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• How is the concept of probability useful
for understanding test results and for
making medical decisions that involve
uncertainty?

• How can we characterize the ability of a
test to discriminate between disease and
health?

• What information do we need to interpret
test results accurately?

• What is expected-value decision making?
How can this methodology help us to
understand particular medical problems?

• What are utilities, and how can we use
them to represent patients’ preferences?

• What is a sensitivity analysis? How can
we use it to examine the robustness of a
decision and to identify the important
variables in a decision?

• What are influence diagrams? How do
they differ from decision trees?


3.1 The Nature of Clinical 
Decisions: Uncertainty and the 
Process of Diagnosis 


Because clinical data are imperfect and out- 
comes of treatment are uncertain, health profes- 
sionals often are faced with difficult choices. In 
this chapter, we introduce probabilistic medical 
reasoning, an approach that can help health care 
providers to deal with the uncertainty inherent 
in many medical decisions. Medical decisions 
are made by a variety of methods; our approach 
is neither necessary nor appropriate for all deci- 
sions. Throughout the chapter, we provide sim- 
ple clinical examples that illustrate a broad range 
of problems for which probabilistic medical rea- 
soning does provide valuable insight. 

As discussed in > Chap. 2, medical practice 
is medical decision making. In this chapter, we 
look at the process of medical decision mak- 
ing. Together, > Chaps. 2 and 3 lay the ground- 
work for the rest of the book. In the remaining 
chapters, we discuss ways that computers can 
help clinicians with the decision-making process,
and we emphasize the relationship between
information needs and system design
and implementation.

The material in this chapter is presented in 
the context of the decisions made by an indi- 
vidual clinician. The concepts, however, are 
more broadly applicable. Sensitivity and spec- 
ificity are important parameters of laboratory 
systems that flag abnormal test results, of
patient monitoring systems (Chap. 21), and of
information-retrieval systems (Chap. 23).
An understanding of what probability is and 
of how to adjust probabilities after the acqui- 
sition of new information is a foundation for 
our study of clinical decision-support systems 
(> Chap. 24). The importance of probability 
in medical decision making was noted as long 
ago as 1922: 


» [G]ood medicine does not consist in the indis- 
criminate application of laboratory examina- 
tions to a patient, but rather in having so clear 
a comprehension of the probabilities and pos- 
sibilities of a case as to know what tests may be 
expected to give information of value (Peabody 
1922). 


» Example 3.1 


You are the director of a blood bank. All poten- 
tial blood donors are tested to ensure that they 
are not infected with the human immunodeficien- 
cy virus (HIV), the causative agent of acquired 
immunodeficiency syndrome (AIDS). You ask 
whether use of the polymerase chain reaction 
(PCR), a gene-amplification technique that can 
diagnose HIV, would be useful to identify people 
who have HIV. The PCR test is positive 98% of
the time when antibody is present, and negative
99% of the time when antibody is absent.¹


1 The test sensitivity and specificity used in
Example 3.1 are consistent with the reported
values of the sensitivity and specificity of the PCR
test for diagnosis of HIV early in its development
(Owens et al. 1996b); the test now has higher
sensitivity and specificity.


If the test is positive, what is the likelihood that
a donor actually has HIV? If the test is negative,
how sure can you be that the person does not
have HIV? On an intuitive level, these questions
do not seem particularly difficult to answer. The
test appears accurate, and we would expect that, 
if the test is positive, the donated blood speci- 
men is likely to contain the HIV. Thus, we are 
surprised to find that, if only one in 1000 donors 
actually is infected, the test is more often mis- 
taken than it is correct. In fact, of 100 donors 
with a positive test, fewer than 10 would be in- 
fected. There would be ten wrong answers for 
each correct result. How are we to understand 
this result? Before we try to find an answer, let us 
consider a related example. 


» Example 3.2 


Ms. Kamala is a 66-year-old woman with coro- 
nary artery disease (narrowing or blockage of the 
blood vessels that supply the heart tissue). When 
the heart muscle does not receive enough oxygen 
(hypoxia) because blood cannot reach it, the pa- 
tient often experiences chest pain (angina). Ms. 
Kamala has twice undergone coronary artery by- 
pass graft (CABG) surgery, a procedure in which 
new vessels, often taken from the leg, are grafted 
onto the old ones such that blood is shunted 
past the blocked region. Unfortunately, she has 
again begun to have chest pain, which becomes 
progressively more severe, despite medication. If 
the heart muscle is deprived of oxygen, the result 
can be a heart attack (myocardial infarction), in 
which a section of the muscle dies.


Should Ms. Kamala undergo a third operation? 
The medications are not working; without sur- 
gery, she runs a high risk of suffering a heart at- 
tack, which may be fatal. On the other hand, the 
surgery is hazardous. Not only is the surgical 
mortality rate for a third operation higher than 
that for a first or second one but also the chance 
that surgery will relieve the chest pain is lower 
than that for a first operation. All choices in > 
Example 3.2 entail considerable uncertainty. 
Furthermore, the risks are grave; an incorrect 
decision may substantially increase the chance 
that Ms. Kamala will die. The decision will be 
difficult even for experienced clinicians. 

These examples illustrate situations in which 
intuition is either misleading or inadequate. 
Although the test results in > Example 3.1 are 
appropriate for the blood bank, a clinician 
who uncritically reports these results would
erroneously inform many people that they had
HIV, a mistake with profound emotional and
social consequences. In Example 3.2, the
decision-making skill of the clinician will affect a
patient’s quality and length of life. Similar situ- 
ations are commonplace in medicine. Our goal 
in this chapter is to show how the use of prob- 
ability and decision analysis can help to make 
clear the best course of action. 
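
To see where the figures quoted for Example 3.1 come from, it can help to count the expected outcomes in a large group of donors. The following Python sketch does so. The function name and the choice of 100,000 donors are ours, for illustration only; the 98% sensitivity, 99% specificity, and prevalence of one in 1000 are taken from the example.

```python
def screening_counts(n_donors, prevalence, sensitivity, specificity):
    """Expected numbers of true-positive and false-positive results."""
    infected = n_donors * prevalence
    uninfected = n_donors - infected
    true_pos = infected * sensitivity            # infected donors who test positive
    false_pos = uninfected * (1 - specificity)   # uninfected donors who test positive
    return true_pos, false_pos


# Values from Example 3.1; the population size is an arbitrary choice.
true_pos, false_pos = screening_counts(100_000, 0.001, 0.98, 0.99)
fraction_infected = true_pos / (true_pos + false_pos)

print(f"true positives:  {true_pos:.0f}")
print(f"false positives: {false_pos:.0f}")
print(f"infected among positive results: {fraction_infected:.1%}")
print(f"false positives per true positive: {false_pos / true_pos:.1f}")
```

Under these assumptions, roughly 9% of positive results belong to infected donors, and there are about ten false positives for every true positive, in line with the statement that fewer than 10 of 100 donors with a positive test would be infected.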

Decision making is one of the quintessen- 
tial activities of the healthcare professional. 
Some decisions are made on the basis of de- 
ductive reasoning or of physiological princi- 
ples. Many decisions, however, are made on the 
basis of knowledge that has been gained 
through collective experience: the clinician of- 
ten must rely on empirical knowledge of asso- 
ciations between symptoms and disease to 
evaluate a problem. A decision that is based on 
these usually imperfect associations will be, to 
some degree, uncertain. In » Sects. 3.1.1, 3.1.2 
and 3.1.3, we examine decisions made under 
uncertainty and present an overview of the di- 
agnostic process. As Smith (1985, p. 3) said: 
“Medical decisions based on probabilities are 
necessary but also perilous. Even the most as- 
tute physician will occasionally be wrong.” 


3.1.1 Decision Making Under 
Uncertainty 


» Example 3.3 


Ms. Kirk, a 33-year-old woman with a history 
of a previous blood clot (thrombus) in a vein in 
her left leg, presents with the complaint of pain 
and swelling in that leg for the past 5 days. On 
physical examination, the leg is tender and swol- 
len to midcalf—signs that suggest the possibility 
of deep vein thrombosis.² A test (ultrasonography)
is performed, and the flow of blood in the
veins of Ms. Kirk’s leg is evaluated. The blood
flow is abnormal, but the radiologist cannot tell
whether there is a new blood clot.


2 In medicine, a sign is an objective physical finding 
(something observed by the clinician) such as a tem- 
perature of 101.2 °F. A symptom is a subjective 
experience of the patient, such as feeling hot or 
feverish. The distinction may be blurred if the 
patient’s experience also can be observed by the cli- 
nician. 

Should Ms. Kirk be treated for blood clots?
The main diagnostic concern is the recurrence 
of a blood clot in her leg. A clot in the veins of 
the leg can dislodge, flow with the blood, and 
cause a blockage in the vessels of the lungs, a 
potentially fatal event called a pulmonary em- 
bolus. Of patients with a swollen leg, about 
one-half actually have a blood clot; there are 
numerous other causes of a swollen leg. Given 
a swollen leg, therefore, a clinician cannot be 
sure that a clot is the cause. Thus, the physical 
findings leave considerable uncertainty. Fur- 
thermore, in > Example 3.3, the results of the 
available diagnostic test are equivocal. The 
treatment for a blood clot is to administer anti- 
coagulants (drugs that inhibit blood clot for- 
mation), which pose the risk of excessive 
bleeding to the patient. Therefore, clinicians do 
not want to treat the patient unless they are 
confident that a thrombus is present. But how 
much confidence should be required before 
starting treatment? We will learn that it is pos- 
sible to answer this question by calculating the 
benefits and harms of treatment. 

This example illustrates an important con- 
cept: Clinical data are imperfect. The degree 
of imperfection varies, but all clinical data— 
including the results of diagnostic tests, the 
history given by the patient, and the findings 
on physical examination—are uncertain. 


3.1.2 Probability: An Alternative 
Method of Expressing 
Uncertainty 


The language that clinicians use to describe a 
patient’s condition often is ambiguous—a fac- 
tor that further complicates the problem of un- 
certainty in medical decision making. Clinicians 
use words such as “probable” and “highly like- 
ly” to describe their beliefs about the likelihood 
of disease. These words have strikingly differ- 
ent meanings to different individuals. Because 
of the widespread disagreement about the 
meaning of common descriptive terms, there is 
ample opportunity for miscommunication. 
The problem of how to express degrees of 
uncertainty is not unique to medicine. How is 
it handled in other contexts? Horse racing has 
its share of uncertainty. If experienced gamblers
are deciding whether to place bets, they
will find it unsatisfactory to be told that a 
given horse has a “high chance” of winning. 
They will demand to know the odds. 

The odds are simply an alternate way to 
express a probability. The use of probability 
or odds as an expression of uncertainty avoids 
the ambiguities inherent in common descrip- 
tive terms. 
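
The correspondence between probability and odds can be made concrete with a small Python sketch. It uses the standard relationship odds = p / (1 - p); the function names are our own, and exact fractions are used only to keep the arithmetic easy to read.

```python
from fractions import Fraction


def probability_to_odds(p):
    """Odds in favor of an event that has probability p (requires p < 1)."""
    return p / (1 - p)


def odds_to_probability(odds):
    """Probability of an event whose odds in favor are `odds`."""
    return odds / (1 + odds)


# A probability of 1/2 corresponds to odds of 1:1.
print(probability_to_odds(Fraction(1, 2)))   # 1
# Odds of 1:3 correspond to a probability of 1/4.
print(odds_to_probability(Fraction(1, 3)))   # 1/4
```

The two functions are inverses of each other, so either form can be recovered from the other without loss of information.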


3.1.3 Overview of the Diagnostic 
Process 


In » Chap. 2, we described the hypothetico- 
deductive approach, a diagnostic strategy com- 
prising successive iterations of hypothesis 
generation, data collection, and interpretation. 
We discussed how observations may evoke a 
hypothesis and how new information subse- 
quently may increase or decrease our belief in 
that hypothesis. Here, we review this process 
briefly in light of a specific example. For the 
purpose of our discussion, we separate the di- 
agnostic process into three stages. 

The first stage involves making an initial 
judgment about whether a patient is likely to 
have a disease. After an interview and physical 
examination, a clinician intuitively develops a 
belief about the likelihood of disease. This judg- 
ment may be based on previous experience or on 
knowledge of the medical literature. A clini- 
cian’s belief about the likelihood of disease usu- 
ally is implicit; he or she can refine it by making 
an explicit estimation of the probability of dis- 
ease. This estimated probability, made before 
further information is obtained, is the prior 
probability or pretest probability of disease. 


» Example 3.4 


Mr. Smith, a 60-year-old man, complains to 
his clinician that he has pressure-like chest 
pain that occurs when he walks quickly. After 
taking his history and examining him, his cli- 
nician believes there is a high enough chance 
that he has heart disease to warrant ordering 
an exercise stress test. In the stress test, an elec- 
trocardiogram (ECG) is taken while Mr. Smith 
exercises. Because the heart must pump more 
blood per stroke and must beat faster (and thus 
requires more oxygen) during exercise, many
heart conditions are evident only when the
patient is physically stressed. Mr. Smith’s results
show abnormal changes in the ECG during
exercise, a sign of heart disease.


How would the clinician evaluate this patient? 
The clinician would first talk to the patient about 
the quality, duration, and severity of his or her 
pain. Traditionally, the clinician would then de- 
cide what to do next based on his or her intuition 
about the etiology (cause) of the chest pain. Our 
approach is to ask the clinician to make his or 
her initial intuition explicit by estimating the 
pretest probability of disease. The clinician in 
this example, based on what he or she knows 
from talking with the patient, might assess the 
pretest or prior probability of heart disease as 
0.5 (50% chance or 1:1 odds; see Sect. 3.2). We
explore methods used to estimate pretest
probability accurately in Sect. 3.2.

After the pretest probability of disease has 
been estimated, the second stage of the diagnos- 
tic process involves gathering more information, 
often by performing a diagnostic test. The
clinician in Example 3.4 ordered a test to reduce
the uncertainty about the diagnosis of heart
disease. The positive test result supports the
diagnosis of heart disease, and this reduction in
uncertainty is shown in Fig. 3.1a. Although
the clinician in Example 3.4 chose the exercise
stress test, there are many tests available to
diagnose heart disease, and the clinician would like
to know which test he or she should order next.
Some tests reduce uncertainty more than do
others (see Fig. 3.1b), but may cost more. The
more a test reduces uncertainty, the more useful 
it is. In Sect. 3.3, we explore ways to measure
how well a test reduces uncertainty, expanding
the concepts of test sensitivity and specificity
first introduced in Chap. 2.


[Figure 3.1: two panels, each with a horizontal axis labeled "Probability of disease" running from 0.0 to 1.0; arrows labeled "Perform test" lead from the pretest probability to the post-test probability.]

Fig. 3.1 The effect of test results on the probability
of disease. a A positive test result increases the
probability of disease. b Test 2 reduces uncertainty about
presence of disease (increases the probability of disease)
more than test 1 does

Given new information provided by a test, 
the third step is to update the initial probability 
estimate. The clinician in » Example 3.4 must 
ask: “What is the probability of disease given 
the abnormal stress test?” The clinician wants to 
know the posterior probability, or post-test
probability, of disease (see Fig. 3.1a). In Sect.
3.4, we reexamine Bayes’ theorem, introduced
in Chap. 2, and we discuss its use for calculating
the post-test probability of disease. As we
noted, to calculate post-test probability, we 
must know the pretest probability, as well as the 
sensitivity and specificity, of the test. 
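
As a preview of Sect. 3.4, the relationship among these three quantities for a positive test result can be written as shown here. The notation is ours rather than the chapter's: D stands for "disease present" and + for "positive test result," so that p[D] is the pretest probability and p[D|+] is the post-test probability.

```latex
p[D \mid +] \;=\;
\frac{p[D] \times \text{sensitivity}}
     {p[D] \times \text{sensitivity} \;+\; (1 - p[D]) \times (1 - \text{specificity})}
```

With the figures from Example 3.1 (p[D] = 0.001, sensitivity 0.98, specificity 0.99), the expression evaluates to about 0.09, which agrees with the observation that fewer than 10 of 100 donors with a positive test would be infected.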


3.2 Probability Assessment: 
Methods to Assess Pretest 
Probability 


In this section, we explore the methods that 
clinicians can use to make judgments about 
the probability of disease before they order 
tests. Probability is our preferred means of 
expressing uncertainty. In this framework, 
probability (p) expresses a clinician’s opin- 
ion about the likelihood of an event as a 
number between 0 and 1. An event that is 
certain to occur has a probability of 1; an 
event that is certain not to occur has a
probability of 0.⁴

The probability of event A is written p[A]. 
The sum of the probabilities of all possible, 
collectively exhaustive outcomes of a chance 
event must be equal to 1. Thus, in a coin flip, 


p[heads] + p[tails] = 1.0.


3 Note that pretest and post-test probabilities corre- 
spond to the concepts of prevalence and predictive 
value. The latter terms were used in > Chap. 2 
because the discussion was about the use of tests for 
screening populations of patients; in a population, 
the pretest probability of disease is simply that dis- 
ease’s prevalence in that population. 

4 We assume a Bayesian interpretation of probability;
there are other statistical interpretations of proba- 
bility. 

The probability of event A and event B occurring
together is denoted by p[A&B] or by p[A,B].

Events A and B are considered independent 
if the occurrence of one does not influence the 
probability of the occurrence of the other. 
The probability of two independent events A 
and B both occurring is given by the product 
of the individual probabilities: 


p[A,B] = p[A] × p[B].


Thus, the probability of heads on two consecutive
coin tosses is 0.5 × 0.5 = 0.25. (Regardless
of the outcome of the first toss, the probability 
of heads on the second toss is 0.5). 
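
The product rule for independent events can be checked by listing every outcome of two tosses. The short Python sketch below does this for a fair coin; the variable names are ours, and the enumeration is only an illustration of the rule, not a general method.

```python
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of two tosses of a fair coin.
outcomes = list(product(["heads", "tails"], repeat=2))
p_each = Fraction(1, len(outcomes))

p_two_heads = sum(p_each for toss in outcomes if toss == ("heads", "heads"))
p_first_heads = sum(p_each for toss in outcomes if toss[0] == "heads")
p_second_heads = sum(p_each for toss in outcomes if toss[1] == "heads")

# For independent events, the joint probability equals the product
# of the individual probabilities.
print(p_two_heads)                      # 1/4
print(p_first_heads * p_second_heads)   # 1/4
```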

The probability that event A will occur 
given that event B is known to occur is called 
the conditional probability of event A given 
event B, denoted by p[A|B] and read as “the 
probability of A given B.” Thus a post-test 
probability is a conditional probability predi- 
cated on the test or finding. For example, if 
30% of patients who have a swollen leg have a 
blood clot, we say the probability of a blood 
clot given a swollen leg is 0.3, denoted: 


p[blood clot|swollen leg] = 0.3. 


Before the swollen leg is noted, the pretest 
probability is simply the prevalence of blood 
clots in the leg in the population from which 
the patient was selected—a number likely to 
be much smaller than 0.3. 
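
The difference between the two numbers is easiest to see from counts. In the Python sketch below, every count is hypothetical and was chosen only so that the conditional probability matches the 0.3 used in the text; the variable names are ours.

```python
# Hypothetical counts for an assumed population of patients.
n_patients = 1000          # everyone in the population
n_swollen_leg = 20         # patients who have a swollen leg
n_clot_and_swollen = 6     # patients who have both a clot and a swollen leg
n_clot = 10                # patients who have a clot, swollen leg or not

# Pretest probability: prevalence of clots in the whole population.
p_clot = n_clot / n_patients

# Conditional probability: p[blood clot|swollen leg].
p_clot_given_swollen = n_clot_and_swollen / n_swollen_leg

print(p_clot)                 # 0.01
print(p_clot_given_swollen)   # 0.3
```

The conditional probability is computed within the subgroup that has the finding, which is why it can be much larger than the prevalence in the population as a whole.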

Now that we have decided to use probabili- 
ty to express uncertainty, how can we estimate 
probability? We can do so by either subjective 
or objective methods; each approach has ad- 
vantages and limitations. 


3.2.1 Subjective Probability 
Assessment 


Most assessments that clinicians make about 
probability are based on personal experience. 
The clinician may compare the current prob- 
lem to similar problems encountered previous- 
ly and then ask: “What was the frequency of 
disease in similar patients whom I have seen?” 

To make these subjective assessments of 
probability, people rely on several discrete, often 
unconscious mental processes that have been
described and studied by cognitive psychologists
(Tversky and Kahneman 1974). These processes
are termed cognitive heuristics.

More specifically, a cognitive heuristic is a 
mental process by which we learn, recall, or pro- 
cess information; we can think of heuristics as 
rules of thumb. Knowledge of heuristics is im- 
portant because it helps us to understand the 
underpinnings of our intuitive probability as- 
sessment. Both naive and sophisticated decision 
makers (including clinicians and statisticians) 
misuse heuristics and therefore make system- 
atic—often serious—errors when estimating 
probability. So, just as we may underestimate 
distances on a particularly clear day (Tversky 
and Kahneman 1974), we may make mistakes in 
estimating probability in deceptive clinical situa- 
tions. Three heuristics have been identified as 
important in estimation of probability: 

1. Representativeness. One way that people es- 
timate probability is to ask themselves: What 
is the probability that object A belongs to 
class B? For instance, what is the probability 
that this patient who has a swollen leg be- 
longs to the class of patients who have blood 
clots? To answer, we often rely on the repre- 
sentativeness heuristic in which probabilities 
are judged by the degree to which A is repre- 
sentative of, or similar to, B. The clinician 
will judge the probability of the development 
of a blood clot (thrombosis) by the degree to 
which the patient with a swollen leg resem- 
bles the clinician’s mental image of patients 
with a blood clot. If the patient has all the 
classic findings (signs and symptoms) associ- 
ated with a blood clot, the clinician judges 
that the patient is highly likely to have a 
blood clot. Difficulties occur with the use of 
this heuristic when the disease is rare (very 
low prior probability, or prevalence); when 
the clinician’s previous experience with the 
disease is atypical, thus giving an incorrect 
mental representation; when the patient’s 
clinical profile is atypical; and when the 
probability of certain findings depends on 
whether other findings are present. 

2. Availability. Our estimate of the probabil- 
ity of an event is influenced by the ease 
with which we remember similar events. 
Events more easily remembered are judged 
more probable; this rule is the availability
heuristic, and it is often misleading. We
remember dramatic, atypical, or emotion-laden
events more easily and therefore are
likely to overestimate their probability. A 
clinician who had cared for a patient who 
had a swollen leg and who then died from 
a blood clot would vividly remember 
thrombosis as a cause of a swollen leg. The 
clinician would remember other causes of 
swollen legs less easily, and he or she would 
tend to overestimate the probability of a 
blood clot in patients with a swollen leg. 

3. Anchoring and adjustment. Another com- 
mon heuristic used to judge probability is 
anchoring and adjustment. A clinician 
makes an initial probability estimate (the 
anchor) and then adjusts the estimate based 
on further information. For instance, the 
clinician in > Example 3.4 makes an initial 
estimate of the probability of heart disease 
as 0.5. If the clinician then learns that all the
patient’s brothers had died of heart disease,
the clinician should raise the estimate because
the patient’s strong family history of
heart disease increases the probability that
the patient has heart disease, a fact the
clinician could ascertain from the literature. The
usual mistake is to adjust the initial esti- 
mate (the anchor) insufficiently in light of 
the new information. Instead of raising his 
or her estimate of prior probability to, say, 
0.8, the clinician might adjust it to only 0.6. 


Heuristics often introduce error into our judg- 
ments about prior probability. Errors in our 
initial estimates of probabilities will be reflect- 
ed in the posterior probabilities even if we use 
quantitative methods to derive those posterior 
probabilities. An understanding of heuristics 
is thus important for medical decision mak- 
ing. The clinician can avoid some of these dif- 
ficulties by using published research results to 
estimate probabilities. 
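
A small numerical sketch shows how an error in the prior carries through to the posterior. It applies Bayes' theorem for a positive test result to the two prior estimates mentioned under anchoring and adjustment (0.6 and 0.8). The sensitivity and specificity below are hypothetical values chosen for illustration; they are not taken from the text, and the function name is ours.

```python
def post_test_probability(pretest, sensitivity, specificity):
    """Probability of disease after a positive result (Bayes' theorem)."""
    true_pos = pretest * sensitivity
    false_pos = (1 - pretest) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics, assumed for this illustration only.
SENSITIVITY = 0.70
SPECIFICITY = 0.90

# 0.6 is the insufficiently adjusted prior; 0.8 is the better estimate.
for pretest in (0.6, 0.8):
    post = post_test_probability(pretest, SENSITIVITY, SPECIFICITY)
    print(f"pretest {pretest:.1f} -> post-test {post:.2f}")
```

With these assumed values, the two priors lead to post-test probabilities of roughly 0.91 and 0.97. The calculation is exact in both cases, yet the answers differ because the starting estimates differ.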


3.2.2 Objective Probability 
Estimates 


Published research results can serve as a guide 
for more objective estimates of probabilities. We 
can use the prevalence of disease in the population
or in a subgroup of the population, or clinical
prediction rules, to estimate the probability
of disease.

As we discussed in » Chap. 2, the preva- 
lence is the frequency of an event in a popula- 
tion; it is a useful starting point for estimating 
probability. For example, if you wanted to es- 
timate the probability of prostate cancer in a 
50-year-old man, the prevalence of prostate 
cancer in men of that age (5-14%) would be a 
useful anchor point from which you could in- 
crease or decrease the probability depending 
on your findings. Estimates of disease preva- 
lence in a defined population often are avail- 
able in the medical literature. 

Symptoms, such as difficulty with urina- 
tion, or signs, such as a palpable prostate nod- 
ule, can be used to place patients into a clinical 
subgroup in which the probability of disease is 
known. For patients referred to a urologist for 
evaluation of a prostate nodule, the preva- 
lence of cancer is about 50%. This approach 
may be limited by difficulty in placing a pa- 
tient in the correct clinically defined subgroup, 
especially if the criteria for classifying patients 
are ill-defined. A trend has been to develop 
guidelines, known as clinical prediction rules, 
to help clinicians assign patients to well-de- 
fined subgroups in which the probability of 
disease is known. 

Clinical prediction rules are developed from 
systematic study of patients who have a particu- 
lar diagnostic problem; they define how clini- 
cians can use combinations of clinical findings 
to estimate probability. The symptoms or signs 
that make an independent contribution to the 
probability that a patient has a disease are iden- 
tified and assigned numerical weights based on 
statistical analysis of the finding’s contribution. 
The result is a list of symptoms and signs for an 
individual patient, each with a corresponding 
numerical contribution to a total score. The to- 
tal score places a patient in a subgroup with a 
known probability of disease. 


» Example 3.5 


Ms. Troy, a 65-year-old woman who had a heart 
attack 4 months ago, has abnormal heart rhythm 
(arrhythmia), is in poor medical condition, and 
is about to undergo elective surgery.


What is the probability that Ms. Troy will
suffer a cardiac complication? Clinical pre- 
diction rules have been developed to help cli- 
nicians to assess this risk (Palda and Detsky 
1997). Table 3.1 lists clinical findings and
their corresponding diagnostic weights. We
add the diagnostic weights for each of the
patient’s clinical findings to obtain the total
score. The total score places the patient in a
group with a defined probability of cardiac
complications, as shown in Table 3.2. Ms.
Troy receives a score of 20; thus, the clinician 
can estimate that the patient has a 27% 
chance of developing a severe cardiac com- 
plication. 
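
The scoring procedure can be sketched in a few lines of Python. The weights and score bands are those of Tables 3.1 and 3.2; the dictionary keys and function names are shorthand labels of our own. For Ms. Troy, we assume that the relevant findings are a heart attack within the past 6 months (10 points), arrhythmia (5 points), and poor medical condition (5 points).

```python
# Diagnostic weights from Table 3.1 (keys are shorthand labels chosen here).
WEIGHTS = {
    "age_over_70": 5,
    "heart_attack_over_6_months_ago": 5,
    "heart_attack_within_6_months": 10,
    "severe_angina": 20,
    "pulmonary_edema_within_1_week": 10,
    "pulmonary_edema_ever": 5,
    "arrhythmia_on_ecg": 5,
    "more_than_5_pvcs": 5,
    "critical_aortic_stenosis": 20,
    "poor_medical_condition": 5,
    "emergency_surgery": 10,
}


def total_score(findings):
    """Sum the diagnostic weights of the findings that are present."""
    return sum(WEIGHTS[finding] for finding in findings)


def complication_risk_percent(score):
    """Look up the prevalence (%) of cardiac complications in Table 3.2.

    Scores are multiples of 5, so the bands 0-15, 20-30, and >30
    cover every possible total.
    """
    if score <= 15:
        return 5
    if score <= 30:
        return 27
    return 60


ms_troy = [
    "heart_attack_within_6_months",
    "arrhythmia_on_ecg",
    "poor_medical_condition",
]
score = total_score(ms_troy)
print(score, complication_risk_percent(score))   # 20 27
```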

Table 3.1 Diagnostic weights for assessing risk of cardiac
complications from noncardiac surgery

Clinical finding                     Diagnostic weight
Age greater than 70 years            5
Recent documented heart attack
  >6 months previously               5
  <6 months previously               10
Severe angina                        20
Pulmonary edemaᵃ
  Within 1 week                      10
  Ever                               5
Arrhythmia on most recent ECG        5
>5 PVCs                              5
Critical aortic stenosis             20
Poor medical condition               5
Emergency surgery                    10

ECG electrocardiogram, PVCs premature ventricular
contractions on preoperative electrocardiogram
ᵃ Fluid in the lungs due to reduced heart function


Table 3.2 Clinical prediction rule for diagnostic weights
in Table 3.1

Total score     Prevalence (%) of cardiac complicationsᵃ
0-15            5
20-30           27
>30             60

ᵃ Cardiac complications defined as death, heart attack,
or congestive heart failure


Objective estimates of pretest probability
are subject to error because of bias in the
studies on which the estimates are based. For
instance, published prevalence data may not
apply directly to a particular patient. A clini- 
cal illustration is that early studies indicated 
that a patient found to have microscopic evi- 
dence of blood in the urine (microhematuria) 
should undergo extensive tests because a sig- 
nificant proportion of the patients would be 
found to have cancer or other serious diseases. 
The tests involve some risk, discomfort, and 
expense to the patient. Nonetheless, the ap- 
proach of ordering tests for any patient with 
microhematuria was widely practiced for 
some years. A later study, however, suggested 
that the probability of serious disease in as- 
ymptomatic patients with only microscopic 
evidence of blood was only about 2%. In the 
past, many patients may have undergone un- 
necessary tests, at considerable financial and 
personal cost. 

What explains the discrepancy in the esti- 
mates of disease prevalence? The initial studies 
that showed a high prevalence of disease in pa- 
tients with microhematuria were performed on 
patients referred to urologists, who are special- 
ists. The primary care clinician refers patients 
whom he or she suspects have a disease in the 
specialist’s sphere of expertise. Because of this 
initial screening by primary care clinicians, the 
specialists seldom see patients with clinical 
findings that imply a low probability of dis- 
ease. Thus, the prevalence of disease in the pa- 
tient population in a specialist’s practice often 
is much higher than that in a primary care 
practice; studies performed with the former patients
therefore almost always overestimate disease
probabilities. This example demonstrates referral bias.
Referral bias is common because
many published studies are performed on pa- 
tients referred to specialists. Thus, one may 
need to adjust published estimates before one 
uses them to estimate pretest probability in 
other clinical settings. 

We now can use the techniques discussed in
this part of the chapter to illustrate how the clinician
in Example 3.4 might estimate the pretest
probability of heart disease in his or her
patient, Mr. Smith, who has pressure-like chest 
pain. We begin by using the objective data that 
are available. The prevalence of heart disease in 
60-year-old men could be our starting point. In 
this case, however, we can obtain a more re- 
fined estimate by placing the patient in a clini- 
cal subgroup in which the prevalence of disease 
is known. The prevalence in a clinical sub- 
group, such as men with symptoms typical of 
coronary heart disease, will predict the pretest 
probability more accurately than would the 
prevalence of heart disease in a group that is 
heterogeneous with respect to symptoms, such 
as the population at large. We assume that large 
studies have shown the prevalence of coronary 
heart disease in men with typical symptoms of 
angina pectoris to be about 0.9; this prevalence 
is useful as an initial estimate that can be ad- 
justed based on information specific to the pa- 
tient. Although the prevalence of heart disease 
in men with typical symptoms is high, 10% of 
patients with this history do not have heart dis- 
ease. 

The clinician might use subjective meth- 
ods to adjust his or her estimate further based 
on other specific information about the pa- 
tient. For example, the clinician might adjust 
his or her initial estimate of 0.9 upward to 
0.95 or higher based on information about 
family history of heart disease. The clinician 
should be careful, however, to avoid the mis- 
takes that can occur when one uses heuristics 
to make subjective probability estimates. In 
particular, he or she should be aware of the 
tendency to stay too close to the initial esti- 
mate when adjusting for additional informa- 
tion. By combining subjective and objective 
methods for assessing pretest probability, the 
clinician can arrive at a reasonable estimate of 
the pretest probability of heart disease. 
In this section, we summarized subjective
and objective methods to determine the pretest
probability, and we learned how to adjust
the pretest probability after assessing the specific
subpopulation of which the patient is
representative. The next step in the diagnostic
process is to gather further information, usually
in the form of formal diagnostic tests
(laboratory tests, X-ray studies, etc.). To help
you to understand this step more clearly, we
discuss in the next two sections how to measure
the accuracy of tests and how to use
probability to interpret the results of the tests.


3.3 Measurement of the Operating
Characteristics of Diagnostic
Tests


The first challenge in assessing any test is to
determine criteria for deciding whether a result
is normal or abnormal. In this section, we
present the issues that you need to consider
when making such a determination.


3.3.1 Classification of Test Results
as Abnormal


Most biological measurements in a population
of healthy people are continuous variables
that assume different values for different
individuals. The distribution of values often is
approximated by the normal (gaussian, or
bell-shaped) distribution curve (Fig. 3.2).
Thus, 95% of the population will fall within
two standard deviations of the mean. About
2.5% of the population will be more than two
standard deviations from the mean at each
end of the distribution. The distribution of
values for ill individuals may be normally distributed
as well. The two distributions usually
overlap (see Fig. 3.2).

[Figure 3.2: overlapping distributions for the healthy population and the diseased population; horizontal axis: test result; vertical axis: number of individuals; a cutoff separates normal from abnormal results, with the false-positive and false-negative regions marked]

Fig. 3.2 Distribution of test results in healthy and diseased individuals. Varying the cutoff between “normal” and “abnormal” across the continuous range of possible values changes the relative proportions of false positives (FPs) and false negatives (FNs) for the two populations

How is a test result classified as abnormal? 
Most clinical laboratories report an “upper 
limit of normal,” which usually is defined as 
two standard deviations above the mean. 
Thus, a test result greater than two standard 
deviations above the mean is reported as ab- 
normal (or positive); a test result below that 
cutoff is reported as normal (or negative). As 
an example, if the mean cholesterol concen- 
tration in the blood is 180 mg/dl, a clinical 
laboratory might choose as the upper limit of 
normal 220 mg/dl because it is two standard 
deviations above the mean. Note that a cutoff 
that is based on an arbitrary statistical crite- 
rion may not have biological significance. 
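The statistical criterion can be checked numerically. The following is a minimal sketch, assuming that cholesterol values in healthy people are normally distributed with a mean of 180 mg/dl and a standard deviation of 20 mg/dl (the standard deviation is not stated in the text; it is inferred from the upper limit of 220 mg/dl). The variable names are illustrative only.

```python
from statistics import NormalDist

# Assumed distribution of cholesterol in healthy people (mg/dl).
# The SD of 20 is inferred from the example: 220 = 180 + 2 * SD.
healthy = NormalDist(mu=180, sigma=20)

# Upper limit of normal: two standard deviations above the mean.
upper_limit = healthy.mean + 2 * healthy.stdev

# Fraction of healthy people whose result exceeds the cutoff.
fraction_above = 1 - healthy.cdf(upper_limit)

print(upper_limit)               # 220.0
print(round(fraction_above, 3))  # 0.023
```

The exact tail area beyond two standard deviations is about 2.3%; the figure of 2.5% corresponds to 1.96 standard deviations, which is conventionally rounded to two.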

An ideal test would have no values at 
which the distribution of diseased and non- 
diseased people overlap. That is, if the cutoff 
value were set appropriately, the test would be 
normal in all healthy individuals and abnor- 
mal in all individuals with disease. Few tests 
meet this standard. If a test result is defined as 
abnormal by the statistical criterion, 2.5% of
healthy individuals will have an abnormal
test. If there is an overlap in the distribution
of test results in healthy and diseased individuals,
some diseased patients will have a normal
test (see Fig. 3.2). You should be
familiar with the terms used to denote these
groups:

• A true positive (TP) is a positive test result
obtained for a patient in whom the disease
is present (the test result correctly classifies
the patient as having the disease).

• A true negative (TN) is a negative test result
obtained for a patient in whom the
disease is absent (the test result correctly
classifies the patient as not having the disease).

• A false positive (FP) is a positive test result
obtained for a patient in whom the disease
is absent (the test result incorrectly classifies
the patient as having the disease).

• A false negative (FN) is a negative test result
obtained for a patient in whom the
disease is present (the test result incorrectly
classifies the patient as not having the
disease).


Figure 3.2 shows that varying the cutoff
point (moving the vertical line in the figure) 
for an abnormal test will change the relative 
proportions of these groups. As the cutoff is 
moved further up from the mean of the nor- 
mal values, the number of FNs increases and 
the number of FPs decreases. Once we have 
chosen a cutoff point, we can conveniently 
summarize test performance (the ability to
discriminate disease from nondisease) in a
2 × 2 contingency table, as shown in Table 3.3.
The table summarizes the number of patients
in each group: TP, FP, TN, and FN.
Note that the sum of the first column is the 
total number of diseased patients, TP + FN. 
The sum of the second column is the total 
number of nondiseased patients, FP + TN. 
The sum of the first row, TP + FP, is the total 
number of patients with a positive test result. 
Likewise, FN + TN gives the total number of 
patients with a negative test result. 


A perfect test would have no FN or FP results.
Erroneous test results do occur, however,
and you can use a 2 × 2 contingency table
to define the measures of test performance 
that reflect these errors. 
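The effect of the cutoff on the four groups can be sketched with a small, made-up data set. The function name `contingency_counts` and the patient values below are hypothetical and serve only to illustrate how the cells of the 2 × 2 table are filled for a given cutoff.

```python
def contingency_counts(results, cutoff):
    """Count TP, FP, TN, FN for (value, has_disease) pairs.

    A value above the cutoff is called abnormal (positive).
    """
    tp = fp = tn = fn = 0
    for value, has_disease in results:
        positive = value > cutoff
        if positive and has_disease:
            tp += 1
        elif positive and not has_disease:
            fp += 1
        elif not positive and has_disease:
            fn += 1
        else:
            tn += 1
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Hypothetical test values: (result, disease truly present?)
patients = [
    (150, False), (170, False), (190, False), (210, False), (230, False),
    (200, True), (220, True), (240, True), (260, True), (280, True),
]

low = contingency_counts(patients, cutoff=195)
high = contingency_counts(patients, cutoff=225)
print(low)   # {'TP': 5, 'FP': 2, 'TN': 3, 'FN': 0}
print(high)  # {'TP': 3, 'FP': 1, 'TN': 4, 'FN': 2}
```

With these invented values, raising the cutoff from 195 to 225 removes one FP but introduces two FNs, which is the trade-off described for Fig. 3.2.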


3.3.2 Measures of Test Performance 


Measures of test performance are of two
types: measures of agreement between tests
(measures of concordance) and measures of
disagreement (measures of discordance).
Two types of concordant test results occur in
the 2 × 2 table in Table 3.3: TPs and TNs.
The relative frequencies of these results form
the basis of the measures of concordance.
These measures correspond to the ideas of the
sensitivity and specificity of a test, which we
introduced in Chap. 2. We define each measure
in terms of the 2 × 2 table and in terms of
conditional probabilities.

The true-positive rate (TPR), or sensitivi- 
ty, is the likelihood that a diseased patient has 
a positive test. In conditional-probability no- 
tation, sensitivity is expressed as the probabil- 
ity of a positive test given that disease is 
present: 


p[positive test|disease]. 


Another way to think of the TPR is as a ratio.
The likelihood that a diseased patient has a
positive test is given by the ratio of diseased
patients with a positive test to all diseased patients:

\[
\mathrm{TPR} = \frac{\text{number of diseased patients with positive test}}{\text{total number of diseased patients}}
\]


Table 3.3 A 2 × 2 contingency table for test results

Results of test    Disease present    Disease absent    Total
Positive result    TP                 FP                TP+FP
Negative result    FN                 TN                FN+TN
                   TP+FN              FP+TN

TP true positive, TN true negative, FP false positive, FN false negative


We can determine these numbers for our example
from the 2 × 2 table (see Table 3.3).
The number of diseased patients with a positive
test is TP. The total number of diseased
patients is the sum of the first column, TP +
FN. So,

\[
\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}.
\]


The true-negative rate (TNR), or specificity, is
the likelihood that a nondiseased patient has a
negative test result. In terms of conditional 
probability, specificity is the probability of a 
negative test given that disease is absent: 


p[negative test|no disease]. 


Viewed as a ratio, the TNR is the number of
nondiseased patients with a negative test divided
by the total number of nondiseased patients:

\[
\mathrm{TNR} = \frac{\text{number of nondiseased patients with negative test}}{\text{total number of nondiseased patients}}
\]

From the 2 × 2 table (see Table 3.3),

\[
\mathrm{TNR} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}}.
\]


The measures of discordance, the false-positive
rate (FPR) and the false-negative rate
(FNR), are defined similarly. The FNR is the
likelihood that a diseased patient has a negative
test result. As a ratio,

\[
\mathrm{FNR} = \frac{\text{number of diseased patients with negative test}}{\text{total number of diseased patients}} = \frac{\mathrm{FN}}{\mathrm{FN} + \mathrm{TP}}.
\]

The FPR is the likelihood that a nondiseased
patient has a positive test result:

\[
\mathrm{FPR} = \frac{\text{number of nondiseased patients with positive test}}{\text{total number of nondiseased patients}} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}.
\]


Example 3.6


Consider again the problem of screening blood
donors for HIV. One test used to screen blood
donors for HIV antibody is an enzyme-linked
immunoassay (EIA). So that the performance
of the EIA can be measured, the test is performed
on 400 patients; the hypothetical results
are shown in the 2 × 2 table in Table 3.4 (see footnote 5).


Table 3.4 A 2 × 2 contingency table for HIV antibody EIA

EIA test result    Antibody present    Antibody absent    Total
Positive EIA       98                  3                  101
Negative EIA       2                   297                299
                   100                 300

EIA enzyme-linked immunoassay


To determine test performance, we calculate 
the TPR (sensitivity) and TNR (specificity) of 
the EIA antibody test. The TPR, as defined 
previously, is: 


\[
\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} = \frac{98}{98 + 2} = 0.98
\]


Thus, the likelihood that a patient with the
HIV antibody will have a positive EIA test is
0.98. If the test were performed on 100 patients
who truly had the antibody, we would
expect the test to be positive in 98 of the patients.
Conversely, we would expect two of the
patients to receive incorrect, negative results,
for an FNR of 2%. (You should convince
yourself that the sum of TPR and FNR by
definition must be 1: TPR + FNR = 1).
And the TNR is:

\[
\mathrm{TNR} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}} = \frac{297}{297 + 3} = 0.99
\]


5 This example assumes that we have a perfect method
(different from EIA) for determining the presence or
absence of antibody. We discuss the idea of gold-standard
tests in Sect. 3.3.4. We have chosen the
numbers in the example to simplify the calculations.
In practice, the sensitivity and specificity of the HIV
EIAs are greater than 99%.

The likelihood that a patient who has no HIV 
antibody will have a negative test is 0.99. 
Therefore, if the EIA test were performed on 
100 individuals who had not been infected 
with HIV, it would be negative in 99 and in- 
correctly positive in 1. (Convince yourself that 
the sum of TNR and FPR also must be 1: 
TNR + FPR = 1). 
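The four measures can be computed together from the cells of the 2 × 2 table. The sketch below uses the hypothetical EIA counts from Table 3.4; the function name `operating_characteristics` is an illustrative choice, not part of any library.

```python
def operating_characteristics(tp, fp, fn, tn):
    """Return TPR, TNR, FPR, and FNR from the cells of a 2 x 2 table."""
    diseased = tp + fn        # first column total
    nondiseased = fp + tn     # second column total
    return {
        "TPR": tp / diseased,      # sensitivity
        "TNR": tn / nondiseased,   # specificity
        "FPR": fp / nondiseased,
        "FNR": fn / diseased,
    }


# Hypothetical EIA results from Table 3.4.
eia = operating_characteristics(tp=98, fp=3, fn=2, tn=297)
print(eia)  # {'TPR': 0.98, 'TNR': 0.99, 'FPR': 0.01, 'FNR': 0.02}
```

Because TPR and FNR share the same denominator (all diseased patients), and TNR and FPR share the other (all nondiseased patients), each pair necessarily sums to 1.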


3.3.3 Implications of Sensitivity 
and Specificity: How to 
Choose Among Tests 


It may be clear to you already that the calcu- 
lated values of sensitivity and specificity for a 
continuous-valued test depend on the particu- 
lar cutoff value chosen to distinguish normal 
and abnormal results. In Fig. 3.2, note that
increasing the cutoff level (moving it to the 
right) would decrease significantly the number 
of FP tests but also would increase the num- 
ber of FN tests. Thus, the test would have be- 
come more specific but less sensitive. Similarly, 
a lower cutoff value would increase the FPs 
and decrease the FNs, thereby increasing sen- 
sitivity while decreasing specificity. Whenever 
a decision is made about what cutoff to use in 
calling a test abnormal, an inherent philo- 
sophic decision is being made about whether 
it is better to tolerate FNs (missed cases) or 
FPs (nondiseased people inappropriately clas- 
sified as diseased). The choice of cutoff de- 
pends on the disease in question and on the 
purpose of testing. If the disease is serious 
and if lifesaving therapy is available, we should 
try to minimize the number of FN results. On 
the other hand, if the disease is not serious


and the therapy is dangerous, we should set 
the cutoff value to minimize FP results. 

We stress the point that sensitivity and spec- 
ificity are characteristics not of a test per se but 
rather of the test and a criterion for when to 
call that test abnormal. Varying the cutoff in
Fig. 3.2 has no effect on the test itself (the way
it is performed, or the specific values for any 
particular patient); instead, it trades off speci- 
ficity for sensitivity. Thus, the best way to char- 
acterize a test is by the range of values of 
sensitivity and specificity that it can take on 
over a range of possible cutoffs. The typical 
way to show this relationship is to plot the test’s 
sensitivity against 1 minus specificity (i.e., the 
TPR against the FPR), as the cutoff is varied 
and the two test characteristics are traded off 
against each other (Fig. 3.3). The resulting
curve, known as a receiver-operating character- 
istic (ROC) curve, was originally described by 
researchers investigating methods of electro- 
magnetic-signal detection and was later ap- 
plied to the field of psychology (Peterson and 
Birdsall 1953; Swets 1973). Any given point 
along a ROC curve for a test corresponds to 
the test sensitivity and specificity for a given 
threshold of “abnormality.” Similar curves can
be drawn for any test used to associate observed
clinical data with specific diseases or disease
categories.

Fig. 3.3 Receiver operating characteristic (ROC) curves for two hypothetical tests. Test B is more discriminative than test A because its curve is higher (e.g., the false-positive rate (FPR) for test B is lower than the FPR for test A at any value of true-positive rate (TPR)). However, the more discriminative test may not always be preferred in clinical practice (see text)

Suppose a new test were introduced that 
competed with the current way of screening 
for the presence of a disease. For example, 
suppose a new radiologic procedure for as- 
sessing the presence or absence of pneumonia 
became available. This new test could be as- 
sessed for trade-offs in sensitivity and specific- 
ity, and an ROC curve could be drawn. As 
shown in Fig. 3.3, a test has better discriminating
power than a competing test if its ROC
curve lies above that of the other test. In other 
words, test B is more discriminating than test 
A when its specificity is greater than test A’s 
specificity for any level of sensitivity (and 
when its sensitivity is greater than test A’s sen- 
sitivity for any level of specificity). 
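To see how an ROC curve arises from a sweep of cutoffs, consider the following minimal sketch. It assumes, purely for illustration, that test values are normally distributed in both groups with the hypothetical means and standard deviations shown; real tests need not follow these distributions.

```python
from statistics import NormalDist

# Assumed (hypothetical) distributions of a test value.
healthy = NormalDist(mu=100, sigma=10)
diseased = NormalDist(mu=120, sigma=10)


def roc_point(cutoff):
    """Return (FPR, TPR) when values above the cutoff are called abnormal."""
    fpr = 1 - healthy.cdf(cutoff)    # healthy people above the cutoff
    tpr = 1 - diseased.cdf(cutoff)   # diseased people above the cutoff
    return fpr, tpr


# Sweep the cutoff from low to high to trace the curve.
cutoffs = range(80, 145, 5)
roc = [roc_point(c) for c in cutoffs]

for c, (fpr, tpr) in zip(cutoffs, roc):
    print(f"cutoff={c:3d}  FPR={fpr:.3f}  TPR={tpr:.3f}")
```

Each printed row is one point on the curve: as the cutoff rises, both the FPR and the TPR fall, so the test becomes more specific and less sensitive. Moving the two assumed means further apart would push the curve toward the upper left corner, which corresponds to a more discriminating test.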

Understanding ROC curves is important 
in understanding test selection and data inter- 
pretation. Clinicians should not necessarily, 
however, always choose the test with the most 
discriminating ROC curve. Matters of cost, 
risk, discomfort, and delay also are important 
in the choice about what data to collect and 
what tests to perform. When you must choose 
among several available tests, you should se- 
lect the test that has the highest sensitivity and 
specificity, provided that other factors, such as 
cost and risk to the patient, are equal. The 
higher the sensitivity and specificity of a test, 
the more the results of that test will reduce 
uncertainty about probability of disease. 


3.3.4 Design of Studies of Test 
Performance 


In Sect. 3.3.2, we discussed measures of test
performance: a test’s ability to discriminate dis- 
ease from no disease. When we classify a test 
result as TP, TN, FP, or FN, we assume that we 
know with certainty whether a patient is dis- 
eased or healthy. Thus, the validity of any test’s 
results must be measured against a gold stan- 
dard: a test that reveals the patient’s true disease 
state, such as a biopsy of diseased tissue or a 
surgical operation. A gold-standard test is a pro- 
cedure that is used to define unequivocally the 
presence or absence of disease. The test whose 
discrimination is being measured is called the 
index test. The gold-standard test usually is
more expensive, riskier, or more difficult to per- 
form than is the index test (otherwise, the less 
precise test would not be used at all). 

The performance of the index test is mea- 
sured in a small, select group of patients enrolled 
in a study. We are interested, however, in how the 
test performs in the broader group of patients in 
which it will be used in practice. The test may 
perform differently in the two groups, so we 
make the following distinction: the study popula- 
tion comprises those patients (usually a subset 
of the clinically relevant population) in whom 
test discrimination is measured and reported; 
the clinically relevant population comprises those 
patients in whom a test typically is used. 


3.3.5 Bias in the Measurement of 
Test Characteristics 


We mentioned earlier the problem of referral 
bias. Published estimates of disease prevalence 
(derived from a study population) may differ 
from the prevalence in the clinically relevant 
population because diseased patients are more 
likely to be included in studies than are nondis- 
eased patients. Similarly, published values of 
sensitivity and specificity are derived from 
study populations that may differ from the clin- 
ically relevant populations in terms of average 
level of health and disease prevalence. These 
differences may affect test performance, so the 
reported values may not apply to many pa- 
tients in whom a test is used in clinical practice. 


Example 3.7


In the early 1970s, a blood test called the car- 
cinoembryonic antigen (CEA) was touted as 
a screening test for colon cancer. Reports of 
early investigations, performed in selected pa- 
tients, indicated that the test had high sensitiv- 
ity and specificity. Subsequent work, however, 
proved the CEA to be completely valueless as a 
screening blood test for colon cancer. Screening 
tests are used in unselected populations, and 
the differences between the study and clinically 
relevant populations were partly responsible for 
the original miscalculations of the CEA’s TPR 
and TNR (Ransohoff and Feinstein 1978).

The experience with CEA has been repeated
with numerous tests. Early measures of test 
discrimination are overly optimistic, and sub- 
sequent test performance is disappointing. 
Problems arise when the TPR and TNR, as 
measured in the study population, do not ap- 
ply to the clinically relevant population. These 
problems usually are the result of bias in the de- 
sign of the initial studies—notably spectrum 
bias, test referral bias, or test interpretation bias. 
Spectrum bias occurs when the study popu- 
lation includes only individuals who have ad- 
vanced disease (“sickest of the sick”) and 
healthy volunteers, as is often the case when a 
test is first being developed. Advanced disease 
may be easier to detect than early disease. For 
example, cancer is easier to detect when it has 
spread throughout the body (metastasized) 
than when it is localized to, say, a small portion 
of the colon. In contrast to the study popula- 
tion, the clinically relevant population will con- 
tain more cases of early disease that are more 
likely to be missed by the index test (FNs). 
Thus, the study population will have an artifactually
low FNR, which produces an artifactually
high TPR (TPR = 1 - FNR). In addition,
healthy volunteers are less likely than are patients
in the clinically relevant population to
have other diseases that may cause FP results (see footnote 6);
the study population will have an artificially
low FPR, and therefore the specificity will be
overestimated (TNR = 1 - FPR). Inaccuracies
in early estimates of the TPR and TNR of the 
CEA were partly due to spectrum bias. 

6 Volunteers are often healthy, whereas patients in the
clinically relevant population often have several diseases
in addition to the disease for which a test is
designed. These other diseases may cause FP test
results. For example, patients with benign (rather
than malignant) enlargement of their prostate
glands are more likely than are healthy volunteers to
have FP elevations of prostate-specific antigen
(Meigs et al. 1996), a substance in the blood that is
elevated in men who have prostate cancer. Measurement
of prostate-specific antigen is often used to
detect prostate cancer.

Test-referral bias (sometimes referred to as
referral bias) occurs when a positive index test
is a criterion for ordering the gold standard
test. In clinical practice, patients with negative
index tests are less likely to undergo the gold
standard test than are patients with positive
tests. In other words, the study population, 
comprising individuals with positive index-test 
results, has a higher percentage of patients with 
disease than does the clinically relevant popula- 
tion. Therefore, both TN and FN tests will be 
underrepresented in the study population. The 
result is overestimation of the TPR and under- 
estimation of the TNR in the study population. 

Test-interpretation bias develops when the 
interpretation of the index test affects that of the 
gold standard test or vice versa. This bias causes 
an artificial concordance between the tests (the 
results are more likely to be the same) and spuri- 
ously increases measures of concordance—the 
sensitivity and specificity—in the study popula- 
tion. (Remember, the relative frequencies of TPs 
and TNs are the basis for measures of concor- 
dance). To avoid these problems, the person in- 
terpreting the index test should be unaware of 
the results of the gold standard test. 

To counter these three biases, you may 
need to adjust the TPR and TNR when they 
are applied to a new population. All the biases 
result in a TPR that is higher in the study pop- 
ulation than it is in the clinically relevant pop- 
ulation. Thus, if you suspect bias, you should 
adjust the TPR (sensitivity) downward when 
you apply it to a new population. 

Adjustment of the TNR (specificity) de- 
pends on which type of bias is present. 
Spectrum bias and test interpretation bias result
in a TNR that is higher in the study population
than it will be in the clinically relevant
population. Thus, if these biases are present, 
you should adjust the specificity downward 
when you apply it to a new population. Test- 
referral bias, on the other hand, produces a 
measured specificity in the study population 
that is lower than it will be in the clinically rel- 
evant population. If you suspect test referral 
bias, you should adjust the specificity upward 
when you apply it to a new population. 


3.3.6 Meta-Analysis of Diagnostic 
Tests 


Often, there are many studies that evaluate the 
sensitivity and specificity of the same diagnostic 
test. If the studies come to similar conclusions 
about the sensitivity and specificity of the test,
you can have increased confidence in the results 
of the studies. But what if the studies disagree? 
For example, by 1995, over 100 studies had as- 
sessed the sensitivity and specificity of the PCR 
for diagnosis of HIV (Owens et al. 1996a, b); 
these studies estimated the sensitivity of PCR to 
be as low as 10% and to be as high as 100%, and 
they assessed the specificity of PCR to be be- 
tween 40 and 100%. Which results should you 
believe? One approach that you can use is to as- 
sess the quality of the studies and to use the es- 
timates from the highest-quality studies. 

For evaluation of PCR, however, even the 
high-quality studies did not agree. Another ap- 
proach is to perform a meta-analysis: a study 
that combines quantitatively the estimates 
from individual studies to develop a summary 
ROC curve (Moses et al. 1993; Owens et al. 
1996a, b; Hellmich et al. 1999; Leeflang et al. 
2008; Leeflang 2014). Investigators develop a 
summary ROC curve by using estimates from 
many studies, in contrast to the type of ROC 
curve discussed in Sect. 3.3.3, which is developed
from the data in a single study. Summary
ROC curves provide the best available ap- 
proach to synthesizing data from many studies. 

Section 3.3 has dealt with the second
step in the diagnostic process: acquisition of 
further information with diagnostic tests. We 
have learned how to characterize the perfor- 
mance of a test with sensitivity (TPR) and spec- 
ificity (TNR). These measures reveal the 
probability of a test result given the true state of 
the patient. They do not, however, answer the 
clinically relevant question posed in the opening 
example: Given a positive test result, what is the 
probability that this patient has the disease? To 
answer this question, we must learn methods to 
calculate the post-test probability of disease. 


3.4 Post-test Probability: Bayes’ 
Theorem and Predictive Value 


The third stage of the diagnostic process (see 
Fig. 3.1a) is to adjust our probability estimate
to take into account the new information
gained from diagnostic tests by calculating
the post-test probability. 
3.4.1 Bayes’ Theorem


As we noted earlier in this chapter, a clinician can 
use the disease prevalence in the patient popula- 
tion as an initial estimate of the pretest risk of 
disease. Once clinicians begin to accumulate in- 
formation about a patient, however, they revise 
their estimate of the probability of disease. The 
revised estimate (rather than the disease preva- 
lence in the general population) becomes the pre- 
test probability for the test that they perform. 
After they have gathered more information with 
a diagnostic test, they can calculate the post-test 
probability of disease with Bayes’ theorem. 
Bayes’ theorem is a quantitative method 
for calculating post-test probability using the 
pretest probability and the sensitivity and 
specificity of the test. The theorem is derived 
from the definition of conditional probability 
and from the properties of probability (see the 
Appendix to this chapter for the derivation). 
Recall that a conditional probability is the 
probability that event A will occur given that 
event B is known to occur (see Sect. 3.2). In
general, we want to know the probability that 
disease is present (event A), given that the test is 
known to be positive (event B). We denote the 
presence of disease as D, its absence as -D, a
test result as R, and the pretest probability of
disease as p[D]. The probability of disease, given
a test result, is written p[D|R]. Bayes’ theorem is:

\[
p[D \mid R] = \frac{p[D] \times p[R \mid D]}{p[D] \times p[R \mid D] + p[-D] \times p[R \mid -D]}
\]


We can reformulate this general equation in
terms of a positive test, (+), by substituting
p[D|+] for p[D|R], p[+|D] for p[R|D], p[+|-D]
for p[R|-D], and 1 - p[D] for p[-D].
From Sect. 3.3, recall that p[+|D] = TPR
and p[+|-D] = FPR. Substitution provides
Bayes’ theorem for a positive test:

\[
p[D \mid +] = \frac{p[D] \times \mathrm{TPR}}{p[D] \times \mathrm{TPR} + (1 - p[D]) \times \mathrm{FPR}}
\]

We can use a similar derivation to develop
Bayes’ theorem for a negative test:

\[
p[D \mid -] = \frac{p[D] \times \mathrm{FNR}}{p[D] \times \mathrm{FNR} + (1 - p[D]) \times \mathrm{TNR}}
\]
Example 3.8


We are now able to calculate the clinically important
probability in Example 3.4: the post-test
probability of heart disease after a positive
exercise test. At the end of Sect. 3.2.2, we estimated
the pretest probability of heart disease as
0.95, based on the prevalence of heart disease in 
men who have typical symptoms of heart disease 
and on the prevalence in people with a family 
history of heart disease. Assume that the TPR 
and FPR of the exercise stress test are 0.65 and 
0.20, respectively. Substituting in Bayes’ formula 
for a positive test, we obtain the probability of 
heart disease given a positive test result: 


\[
p[D \mid +] = \frac{0.95 \times 0.65}{0.95 \times 0.65 + 0.05 \times 0.20} = 0.98
\]


Thus, the positive test raised the post-test prob- 
ability to 0.98 from the pretest probability of 
0.95. The change in probability is modest be- 
cause the pretest probability was high (0.95) and 
because the FPR also is high (0.20). If we repeat 
the calculation with a pretest probability of 0.75, 
the post-test probability is 0.91. If we assume 
the FPR of the test to be 0.05 instead of 0.20, a 
pretest probability of 0.95 changes to 0.996. 
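The three calculations in this example can be reproduced with a short sketch of Bayes’ theorem for positive and negative tests. The function names are illustrative; the inputs are the pretest probabilities, TPR, and FPR given in the example.

```python
def post_test_probability_positive(pretest, tpr, fpr):
    """Bayes' theorem for a positive test result."""
    return pretest * tpr / (pretest * tpr + (1 - pretest) * fpr)


def post_test_probability_negative(pretest, tpr, fpr):
    """Bayes' theorem for a negative test (FNR = 1 - TPR, TNR = 1 - FPR)."""
    fnr, tnr = 1 - tpr, 1 - fpr
    return pretest * fnr / (pretest * fnr + (1 - pretest) * tnr)


print(round(post_test_probability_positive(0.95, 0.65, 0.20), 2))  # 0.98
print(round(post_test_probability_positive(0.75, 0.65, 0.20), 2))  # 0.91
print(round(post_test_probability_positive(0.95, 0.65, 0.05), 3))  # 0.996
```

The same inputs can be passed to `post_test_probability_negative` to explore how a negative stress test would change the estimate.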


3.4.2 The Odds-Ratio Form of Bayes’ 
Theorem and Likelihood 
Ratios 


Although the formula for Bayes’ theorem is 
straightforward, it is awkward for mental cal- 
culations. We can develop a more convenient 
form of Bayes’ theorem by expressing proba- 
bility as odds and by using a different measure 
of test discrimination. Probability and odds 
are related as follows: 


\[
\text{odds} = \frac{p}{1 - p}, \qquad p = \frac{\text{odds}}{1 + \text{odds}}.
\]


Thus, if the probability of rain today is 0.75, 
the odds are 3:1. Thus, on similar days, we 
should expect rain to occur three times for 
each time it does not occur. 
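The two conversions are easy to express in code. This is a minimal sketch; the function names are illustrative.

```python
def probability_to_odds(p):
    """Convert a probability (0 <= p < 1) to odds in favor."""
    return p / (1 - p)


def odds_to_probability(odds):
    """Convert odds in favor back to a probability."""
    return odds / (1 + odds)


print(probability_to_odds(0.75))  # 3.0
print(odds_to_probability(3))     # 0.75
```

Note that a probability of exactly 1 has no finite odds, so `probability_to_odds(1)` would raise a division error.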

A simple relationship exists between
pretest odds and post-test odds:


post-test odds = pretest odds × likelihood ratio

or

\[
\frac{p[D \mid R]}{p[-D \mid R]} = \frac{p[D]}{p[-D]} \times \frac{p[R \mid D]}{p[R \mid -D]}
\]

This equation is the odds-ratio form of Bayes’
theorem (see footnote 7). It can be derived in a straightforward
fashion from the definitions of Bayes’
theorem and of conditional probability that
we provided earlier. Thus, to obtain the post-test
odds, we simply multiply the pretest odds
by the likelihood ratio (LR) for the test in 
question. 
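The derivation can be sketched in one line, using only the notation already introduced. Writing Bayes’ theorem once for D and once for -D gives two expressions that share the same denominator, p[R] = p[D] × p[R|D] + p[-D] × p[R|-D]; dividing the first by the second cancels that denominator:

```latex
\[
\frac{p[D \mid R]}{p[-D \mid R]}
  = \frac{p[D] \times p[R \mid D] \,/\, p[R]}{p[-D] \times p[R \mid -D] \,/\, p[R]}
  = \frac{p[D]}{p[-D]} \times \frac{p[R \mid D]}{p[R \mid -D]}
\]
```

The first factor on the right is the pretest odds, and the second is the ratio of the probabilities of the test result in diseased and nondiseased people.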

The LR of a test combines the measures 
of test discrimination discussed earlier to give 
one number that characterizes the discrimina- 
tory power of a test, defined as: 


$$\text{LR} = \frac{p[R|D]}{p[R|-D]} = \frac{\text{probability of result in diseased people}}{\text{probability of result in nondiseased people}}$$


The LR indicates the amount that the odds of 
disease change based on the test result. We 
can use the LR to characterize clinical find- 
ings (such as a swollen leg) or a test result. We 
describe the performance of a test that has 
only two possible outcomes (e.g., positive or 
negative) by two LRs: one corresponding to a 
positive test result and the other correspond- 
ing to a negative test. These ratios are abbrevi- 
ated LR+ and LR-, respectively. 


$$\text{LR+} = \frac{\text{probability that test is positive in diseased people}}{\text{probability that test is positive in nondiseased people}} = \frac{TPR}{FPR}$$


7 Some authors refer to this expression as the odds- 
likelihood form of Bayes’ theorem. 


Biomedical Decision Making: Probabilistic Clinical Reasoning 


In a test that discriminates well between dis- 
ease and nondisease, the TPR will be high, the 
FPR will be low, and thus LR+ will be much 
greater than 1. An LR of 1 means that the
probability of a test result is the same in dis- 
eased and nondiseased individuals; the test 
has no value. Similarly, 


$$\text{LR-} = \frac{\text{probability that test is negative in diseased people}}{\text{probability that test is negative in nondiseased people}} = \frac{FNR}{TNR}$$


A desirable test will have a low FNR and a 
high TNR; therefore, the LR- will be much
less than 1. 


» Example 3.9


We can calculate the post-test probability for 
a positive exercise stress test in a 70-year-old
woman whose pretest probability is 0.75. The 
pretest odds are: 


$$\text{odds} = \frac{p}{1-p} = \frac{0.75}{1-0.75} = \frac{0.75}{0.25} = 3, \text{ or } 3{:}1$$


The LR for the stress test is: 


$$\text{LR+} = \frac{TPR}{FPR} = \frac{0.65}{0.20} = 3.25$$

We can calculate the post-test odds of a posi- 
tive test result using the odds-ratio form of 
Bayes’ theorem: 


$$\text{post-test odds} = 3 \times 3.25 = 9.75{:}1$$


We can then convert the odds to a probability: 


$$p = \frac{\text{odds}}{1+\text{odds}} = \frac{9.75}{1+9.75} = 0.91$$


As expected, this result agrees with our earlier 
answer (see the discussion of Example 3.8).
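
The steps of this example can be written out as a short program. This is a minimal sketch in Python; the helper names (probability_to_odds, odds_to_probability, and the two likelihood-ratio functions) are our own and are chosen only for illustration. The likelihood ratios use the definitions given above, with FNR = 1 - TPR and TNR = 1 - FPR.

```python
def probability_to_odds(p):
    """Convert a probability to odds: p / (1 - p)."""
    return p / (1.0 - p)


def odds_to_probability(odds):
    """Convert odds back to a probability: odds / (1 + odds)."""
    return odds / (1.0 + odds)


def likelihood_ratio_positive(tpr, fpr):
    """LR+ = TPR / FPR."""
    return tpr / fpr


def likelihood_ratio_negative(tpr, fpr):
    """LR- = FNR / TNR, where FNR = 1 - TPR and TNR = 1 - FPR."""
    return (1.0 - tpr) / (1.0 - fpr)


# Pretest probability 0.75, exercise stress test with TPR 0.65 and FPR 0.20.
pretest_odds = probability_to_odds(0.75)
lr_positive = likelihood_ratio_positive(tpr=0.65, fpr=0.20)
post_test_odds = pretest_odds * lr_positive
post_test_probability = odds_to_probability(post_test_odds)

print(f"pretest odds   = {pretest_odds:.2f}")
print(f"LR+            = {lr_positive:.2f}")
print(f"post-test odds = {post_test_odds:.2f}")
print(f"post-test p    = {post_test_probability:.2f}")
```

The same functions give the post-test probability after a negative test if likelihood_ratio_negative is used in place of likelihood_ratio_positive.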

The odds-ratio form of Bayes’ theorem 
allows rapid calculation. The LR is a power- 
ful method for characterizing the operating 
characteristics of a test: if you know the 
pretest odds, you can calculate the post-test 
odds in one step. The LR demonstrates that 
a useful test is one that changes the odds of
disease.


3.4.3 Predictive Value of a Test


An alternative approach for estimation of the 
probability of disease in a person who has a 
positive or negative test is to calculate the pre- 
dictive value of the test. The positive predic- 
tive value (PV+) of a test is the likelihood that 
a patient who has a positive test result also has
disease. Thus, PV+ can be calculated directly
from a 2 x 2 contingency table: 


$$\text{PV+} = \frac{\text{number of diseased patients with positive test}}{\text{total number of patients with a positive test}}$$

From the 2 x 2 contingency table in Table 3.3,

$$\text{PV+} = \frac{TP}{TP + FP}$$


The negative predictive value (PV-) is the like- 
lihood that a patient with a negative test does 
not have disease: 


$$\text{PV-} = \frac{\text{number of nondiseased patients with negative test}}{\text{total number of patients with a negative test}}$$

From the 2 x 2 contingency table in Table 3.3,

$$\text{PV-} = \frac{TN}{TN + FN}$$


» Example 3.10


We can calculate the PV of the EIA test from
the 2 x 2 table that we constructed in Example
3.6 (see Table 3.4) as follows:

$$\text{PV+} = \frac{98}{98 + 3} = 0.97$$

$$\text{PV-} = \frac{297}{297 + 2} = 0.99$$


The probability that antibody is present in a pa- 
tient who has a positive index test (EIA) in this 
study is 0.97; about 97 of 100 patients with a 
positive test will have antibody. The likelihood 
that a patient with a negative index test does 
not have antibody is about 0.99.

It is worth reemphasizing the difference between
PV and sensitivity and specificity, given that 
both are calculated from the 2 x 2 table and they 
often are confused. The sensitivity and specific- 
ity give the probability of a particular test result 
in a patient who has a particular disease state. 
The PV gives the probability of true disease state 
once the patient’s test result is known. 

The PV+ calculated from Table 3.4 is 0.97,
so we expect 97 of 100 patients with a positive
index test actually to have antibody. Yet, in
Example 3.1, we found that fewer than one of
ten patients with a positive test were expected to 
have antibody. What explains the discrepancy in 
these examples? The sensitivity and specificity 
(and, therefore, the LRs) in the two examples are 
identical. The discrepancy is due to an extremely 
important and often overlooked characteristic 
of PV: the PV of a test depends on the preva- 
lence of disease in the study population (the 
prevalence can be calculated as TP + FN divid- 
ed by the total number of patients in the 2 x 2 
table). The PV cannot be generalized to a new 
population because the prevalence of disease 
may differ between the two populations. 

The difference in PV of the EIA in Example
3.1 and in Example 3.6 is due to a difference
in the prevalence of disease in the examples.
The prevalence of antibody was given as 0.001 in
Example 3.1 and as 0.25 in Example 3.6.
These examples should remind us that the PV+ 
is not an intrinsic property of a test. Rather, it 
represents the post-test probability of disease 
only when the prevalence is identical to that in 
the 2 x 2 contingency table from which the PV+ 
was calculated. Bayes’ theorem provides a meth- 
od for calculation of the post-test probability of 
disease for any prior probability. For that reason, 
we prefer the use of Bayes’ theorem to calculate 
the post-test probability of disease. 
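
The dependence of PV+ on prevalence can be made concrete with a short calculation. The sketch below is in Python and uses function names of our own choosing. It takes the four cell counts that appear in the Example 3.10 equations (TP = 98, FP = 3, FN = 2, TN = 297), derives sensitivity and specificity from them, and then recomputes PV+ with Bayes' theorem at the two prevalences quoted above.

```python
def predictive_values(tp, fp, fn, tn):
    """PV+ and PV- taken directly from the cells of a 2 x 2 table."""
    pv_positive = tp / (tp + fp)
    pv_negative = tn / (tn + fn)
    return pv_positive, pv_negative


def pv_positive_at_prevalence(prevalence, sensitivity, specificity):
    """PV+ from Bayes' theorem for an arbitrary prevalence."""
    true_positive = prevalence * sensitivity
    false_positive = (1.0 - prevalence) * (1.0 - specificity)
    return true_positive / (true_positive + false_positive)


# Cell counts as they appear in the Example 3.10 equations.
tp, fp, fn, tn = 98, 3, 2, 297
pv_pos, pv_neg = predictive_values(tp, fp, fn, tn)

# Sensitivity and specificity do not depend on prevalence.
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

print(f"PV+ from table = {pv_pos:.2f}, PV- from table = {pv_neg:.2f}")
for prevalence in (0.25, 0.001):
    value = pv_positive_at_prevalence(prevalence, sensitivity, specificity)
    print(f"prevalence={prevalence}: PV+ = {value:.3f}")
```

With the same sensitivity and specificity, PV+ is about 0.97 at a prevalence of 0.25 but falls below 0.1 at a prevalence of 0.001, which is the discrepancy between the two examples.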


3.4.4 Implications of Bayes’ 
Theorem 


In this section, we explore the implications of 
Bayes’ theorem for test interpretation. These 
ideas are extremely important, yet they often 
are misunderstood. 

Figure 3.4 illustrates one of the most essential
concepts in this chapter: The post-test
probability of disease increases as the pretest
probability of disease increases. We produced
Fig. 3.4a by calculating the post-test probability
after a positive test result for all possible
pretest probabilities of disease. We similarly
derived Fig. 3.4b for a negative test result.
The 45-degree line in each figure denotes a
test in which the pretest and post-test probability
are equal (LR = 1), indicating a test that
is useless. The curve in Fig. 3.4a relates pretest
and post-test probabilities in a test with a
sensitivity and specificity of 0.9. Note that, at
low pretest probabilities, the post-test probability
after a positive test result is much higher
than is the pretest probability. At high pretest
probabilities, the post-test probability is only
slightly higher than the pretest probability.

[Figure 3.4, panels a and b. Horizontal axis: pretest probability (0.0 to 1.0). Vertical axis: post-test probability.]

Fig. 3.4 Relationship between pretest probability
and post-test probability of disease. The dashed lines
correspond to a test that has no effect on the probability
of disease. Sensitivity and specificity of the test were
assumed to be 0.90 for the two examples. a The post-test
probability of disease corresponding to a positive test
result (solid curve) was calculated with Bayes’ theorem
for all values of pretest probability. b The post-test
probability of disease corresponding to a negative test result
(solid curve) was calculated with Bayes’ theorem for all
values of pretest probability. (Source: Adapted from Sox
(1987), with permission)

Figure 3.4b shows the relationship between
the pretest and post-test probabilities
after a negative test result. At high pretest 
probabilities, the post-test probability after a 
negative test result is much lower than is the 
pretest probability. A negative test, however, 
has little effect on the post-test probability if 
the pretest probability is low. 

This discussion emphasizes a key idea of 
this chapter: the interpretation of a test result 
depends on the pretest probability of disease. If 
the pretest probability is low, a positive test re- 
sult has a large effect, and a negative test result 
has a small effect. If the pretest probability is 
high, a positive test result has a small effect, and 
a negative test result has a large effect. In other 
words, when the clinician is almost certain of 
the diagnosis before testing (pretest probabil- 
ity nearly 0 or nearly 1), a confirmatory test 
has little effect on the posterior probability 
(see Example 3.8). If the pretest probability is
intermediate or if the result contradicts a strong- 
ly held clinical impression, the test result will 
have a large effect on the post-test probability. 
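
The curves described for Fig. 3.4 can be tabulated directly. The sketch below is in Python with function names of our own; it assumes, as the figure does, a test with sensitivity and specificity both equal to 0.90, and the particular pretest probabilities listed are chosen by us for illustration.

```python
def post_test_after_positive(pretest, sensitivity, specificity):
    """p[D|+] from Bayes' theorem."""
    true_positive = pretest * sensitivity
    false_positive = (1.0 - pretest) * (1.0 - specificity)
    return true_positive / (true_positive + false_positive)


def post_test_after_negative(pretest, sensitivity, specificity):
    """p[D|-] from Bayes' theorem."""
    false_negative = pretest * (1.0 - sensitivity)
    true_negative = (1.0 - pretest) * specificity
    return false_negative / (false_negative + true_negative)


SENSITIVITY = 0.90
SPECIFICITY = 0.90

print("pretest  after +  after -")
for pretest in (0.01, 0.10, 0.50, 0.90, 0.99):
    positive = post_test_after_positive(pretest, SENSITIVITY, SPECIFICITY)
    negative = post_test_after_negative(pretest, SENSITIVITY, SPECIFICITY)
    print(f"{pretest:7.2f}  {positive:7.3f}  {negative:7.3f}")
```

Reading down the columns shows the pattern stated above: a positive result moves a low pretest probability a long way and a high one very little, and a negative result does the reverse.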

Note from Fig. 3.4a that, if the pretest
probability is very low, a positive test result can
raise the post-test probability into only the
intermediate range. Assume that Fig. 3.4a represents
the relationship between the pretest
and post-test probabilities for the exercise 
stress test. If the clinician believes the pretest 
probability of coronary artery disease is 0.1, 
the post-test probability will be about 0.5. 
Although there has been a large change in the 
probability, the post-test probability is in an 
intermediate range, which leaves considerable 
uncertainty about the diagnosis. Thus, if the 
pretest probability is low, it is unlikely that a 
positive test result will raise the probability of 
disease sufficiently for the clinician to make 
that diagnosis with confidence. An exception
to this statement occurs when a test has a very
high specificity (or a large LR+); e.g., HIV antibody
tests have a specificity greater than 0.99,
and therefore a positive test is convincing.
Similarly, if the pretest probability is very high,
it is unlikely that a negative test result will lower
the post-test probability sufficiently to exclude
a diagnosis.

[Figure 3.5, panels a and b. Horizontal axis: pretest probability (0.0 to 1.0). Vertical axis: post-test probability. Curves in panel a are labeled by TNR (0.80, 0.90, 0.98); curves in panel b are labeled by TPR.]

Fig. 3.5 Effects of test sensitivity and specificity on
post-test probability. The curves are similar to those shown
in Fig. 3.4 except that the calculations have been repeated
for several values of the sensitivity (TPR true-positive rate)
and specificity (TNR true-negative rate) of the test. a The
sensitivity of the test was assumed to be 0.90, and the
calculations were repeated for several values of test specificity. b
The specificity of the test was assumed to be 0.90, and the
calculations were repeated for several values of the sensitivity
of the test. In both panels, the top family of curves
corresponds to positive test results, and the bottom family of
curves corresponds to negative test results. (Source:
Adapted from Sox (1987), with permission)

Figure 3.5 illustrates another important
concept: test specificity affects primarily the
interpretation of a positive test; test sensitivity
affects primarily the interpretation of a negative
test. In both parts (a) and (b) of Fig. 3.5,
the top family of curves corresponds to positive
test results and the bottom family to negative
test results. Figure 3.5a shows the post-test
probabilities for tests with varying specificities
(TNR). Note that changes in the specificity
produce large changes in the top family of
curves (positive test results) but have little effect
on the lower family of curves (negative test
results). That is, an increase in the specificity of a
test markedly changes the post-test probability
if the test is positive but has relatively little
effect on the post-test probability if the test is
negative. Thus, if you are trying to rule in a
diagnosis,⁸ you should choose a test with high
specificity or a high LR+. Figure 3.5b shows
the post-test probabilities for tests with varying
sensitivities. Note that changes in sensitivity
produce large changes in the bottom family of
curves (negative test results) but have little
effect on the top family of curves. Thus, if you
are trying to exclude a disease, choose a test
with a high sensitivity or a low LR- (an LR-
much less than 1).
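
The asymmetry described for Fig. 3.5 also shows up in the likelihood ratios themselves. The sketch below is in Python with function names of our own. The specificity values 0.80, 0.90, and 0.98 follow the curve labels in Fig. 3.5a; using the same three values for sensitivity in the second loop is our assumption, made only to keep the comparison symmetric.

```python
def lr_positive(sensitivity, specificity):
    """LR+ = TPR / FPR = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def lr_negative(sensitivity, specificity):
    """LR- = FNR / TNR = (1 - sensitivity) / specificity."""
    return (1.0 - sensitivity) / specificity


VALUES = (0.80, 0.90, 0.98)

print("Sensitivity fixed at 0.90, specificity varied:")
for specificity in VALUES:
    print(f"  spec={specificity:.2f}  "
          f"LR+={lr_positive(0.90, specificity):6.2f}  "
          f"LR-={lr_negative(0.90, specificity):.3f}")

print("Specificity fixed at 0.90, sensitivity varied:")
for sensitivity in VALUES:
    print(f"  sens={sensitivity:.2f}  "
          f"LR+={lr_positive(sensitivity, 0.90):6.2f}  "
          f"LR-={lr_negative(sensitivity, 0.90):.3f}")
```

Raising specificity from 0.80 to 0.98 multiplies LR+ by about ten while LR- barely moves; raising sensitivity over the same range shrinks LR- by about a factor of ten while LR+ barely moves.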


3.4.5 Cautions in the Application of 
Bayes’ Theorem 


Bayes’ theorem provides a powerful method
for calculating post-test probability. You should
be aware, however, of the possible errors you
can make when you use it. Common problems
are inaccurate estimation of pretest probability,
faulty application of test-performance
measures, and violation of the assumptions of
conditional independence and of mutual
exclusivity.

8 In medicine, to rule in a disease is to confirm that the
patient does have the disease; to rule out a disease is
to confirm that the patient does not have the disease.
A doctor who strongly suspects that his or her
patient has a bacterial infection orders a culture to
rule in his or her diagnosis. Another doctor is almost
certain that his or her patient has a simple sore
throat but orders a culture to rule out streptococcal
infection (strep throat). This terminology oversimplifies
a diagnostic process that is probabilistic.
Diagnostic tests rarely, if ever, rule in or rule out a
disease; rather, the tests raise or lower the probability
of disease.

Bayes’ theorem provides a means to adjust 
an estimate of pretest probability to take into 
account new information. The accuracy of 
the calculated post-test probability is limited, 
however, by the accuracy of the estimated pre- 
test probability. Accuracy of estimated prior 
probability is increased by proper use of pub- 
lished prevalence rates, heuristics, and clinical 
prediction rules. In a decision analysis, as we 
shall see, a range of prior probability often is 
sufficient. Nonetheless, if the pretest probabil- 
ity assessment is unreliable, Bayes’ theorem 
will be of little value. 

A second potential mistake that you can 
make when using Bayes’ theorem is to apply 
published values for the test sensitivity and 
specificity, or LRs, without paying attention 
to the possible effects of bias in the studies in 
which the test performance was measured 
(see Sect. 3.3.5). With certain tests, the LRs
may differ depending on the pretest odds in 
part because differences in pretest odds may 
reflect differences in the spectrum of disease 
in the population. 

A third potential problem arises when you 
use Bayes’ theorem to interpret a sequence of 
tests. If a patient undergoes two tests in se- 
quence, you can use the post-test probability 
after the first test result, calculated with 
Bayes’ theorem, as the pretest probability for 
the second test. Then, you use Bayes’ theorem 
a second time to calculate the post-test prob- 
ability after the second test. This approach is 
valid, however, only if the two tests are condi- 
tionally independent. Tests for the same dis- 
ease are conditionally independent when the 
probability of a particular result on the sec- 
ond test does not depend on the result of the 
first test, given (conditioned on) the disease 
state. Expressed in conditional probability
notation for the case in which the disease is
present,

$$\begin{aligned}
& p[\text{second test positive} \mid \text{first test positive and disease present}] \\
&= p[\text{second test positive} \mid \text{first test negative and disease present}] \\
&= p[\text{second test positive} \mid \text{disease present}].
\end{aligned}$$


If the conditional independence assumption 
is satisfied, the post-test odds = pretest odds x
LR1 x LR2. If you apply Bayes’ theorem se- 
quentially in situations in which conditional 
independence is violated, you will obtain inac- 
curate post-test probabilities. 
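
The equivalence between sequential updating and the single product of likelihood ratios can be checked numerically. The sketch below is in Python with helper names of our own. The first test uses the stress-test values from earlier in the chapter (TPR 0.65, FPR 0.20); the second test, with sensitivity and specificity of 0.90, is hypothetical. The calculation simply assumes conditional independence; it does nothing to verify it.

```python
def to_odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def to_probability(odds):
    """Convert odds to a probability."""
    return odds / (1.0 + odds)


def update(pretest, likelihood_ratio):
    """One application of the odds-ratio form of Bayes' theorem."""
    return to_probability(to_odds(pretest) * likelihood_ratio)


pretest = 0.75
lr_first = 0.65 / 0.20    # LR+ of the exercise stress test
lr_second = 0.90 / 0.10   # LR+ of a hypothetical second test

# Route 1: multiply the pretest odds by both LRs in a single step.
one_step = to_probability(to_odds(pretest) * lr_first * lr_second)

# Route 2: use the post-test probability after the first test
# as the pretest probability for the second test.
after_first = update(pretest, lr_first)
two_step = update(after_first, lr_second)

print(f"after first test:  {after_first:.3f}")
print(f"after both tests:  {two_step:.3f} (sequential)")
print(f"after both tests:  {one_step:.3f} (single product)")
```

The two routes agree by construction. If the tests were not conditionally independent, the LR of the second test would depend on the first result, and neither route would give the correct answer with a single fixed LR2.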

The fourth common problem arises when 
you assume that all test abnormalities result 
from one (and only one) disease process. The 
Bayesian approach, as we have described it, 
generally presumes that the diseases under con- 
sideration are mutually exclusive. If they are 
not, Bayesian updating must be applied with 
great care. 

We have shown how to calculate post-test 
probability. In Sect. 3.5, we turn to the problem
of decision making when the outcomes of
a clinician’s actions (e.g., of treatments) are un- 
certain. 


3.5 Expected-Value Decision 
Making 


Medical decision-making problems often can- 
not be solved by reasoning based on patho- 
physiology. For example, clinicians need a 
method for choosing among treatments when 
the outcome of the treatments is uncertain, as 
are the results of a surgical operation. You can
use the ideas developed in the preceding sections
to solve such difficult decision problems.
Here we discuss two methods: the decision 
tree, a method for representing and comparing 
the expected outcomes of each decision alter- 
native; and the threshold probability, a meth- 
od for deciding whether new information can 
change a management decision. These tech- 
niques help you to clarify the decision problem 
and thus to choose the alternative that is most 
likely to help the patient. 


3.5.1 Comparison of Uncertain 
Prospects 


Like those of most biological events, the out- 
come of an individual’s illness is unpredictable. 
How can a clinician determine which course of 
action has the greatest chance of success? 


» Example 3.11 


There are two available therapies for a fatal illness.
The length of a patient’s life after either therapy
is unpredictable, as illustrated by the frequency
distribution shown in Fig. 3.6 and summarized
in Table 3.5. Each therapy is associated
with uncertainty: regardless of which therapy a
patient receives, the patient will die by the end of
the fourth year, but there is no way to know which
year will be the patient’s last. Figure 3.6 shows
that survival until the fourth year is more likely
with therapy B, but the patient might die in the
first year with therapy B or might survive to the
fourth year with therapy A.

Fig. 3.6 Survival after therapy for a fatal disease. Two therapies are available; the results of either are unpredictable

Table 3.5 Distribution of probabilities for
the two therapies in Fig. 3.7

                         Probability of death
Years after therapy      Therapy A     Therapy B
1                        0.20          0.05
2                        0.40          0.15
3                        0.30          0.45
4                        0.10          0.35


Which of the two therapies is preferable?
Example 3.11 demonstrates a significant fact: a 
choice among therapies is a choice among gam- 
bles (i.e., situations in which chance determines 
the outcomes). How do we usually choose 
among gambles? More often than not, we rely 
on hunches or on a sixth sense. How should we 
choose among gambles? We propose a method 
for choosing called expected-value decision 
making: we characterize each gamble by a num- 
ber, and we use that number to compare the 
gambles.⁹ In Example 3.11, therapy A and
therapy B are both gambles with respect to du- 
ration of life after therapy. We want to assign a 
measure (or number) to each therapy that sum- 
marizes the outcomes such that we can decide 
which therapy is preferable. 

The ideal criterion for choosing a gamble
should be a number that reflects preferences
(in medicine, often the patient’s preferences)
for the outcomes of the gamble. Utility is the
name given to a measure of preference that
has a desirable property for decision making:
the gamble with the highest utility should be
preferred. We shall discuss utility briefly
(Sect. 3.5.4), but you can pursue this topic and
the details of decision analysis in other textbooks
(see Suggested Readings at the end of
this chapter).¹⁰ We use the average duration of
life after therapy (survival) as a criterion for
choosing among therapies; remember that
this model is oversimplified, used here for
discussion only. Later, we consider other factors,
such as the quality of life.

9 Expected-value decision making had been used in
many fields before it was first applied to medicine.

Because we cannot be sure of the duration 
of survival for any given patient, we charac- 
terize a therapy by the mean survival (average 
length of life) that would be observed in a 
large number of patients after they were given 
the therapy. The first step we take in calculat- 
ing the mean survival for a therapy is to divide 
the population receiving the therapy into 
groups of patients who have similar survival 
rates. Then, we multiply the survival time in 
each group¹¹ by the fraction of the total population
in that group. Finally, we sum these
products over all possible survival values. 

We can perform this calculation for the 
therapies in Example 3.11. Mean survival
for therapy A = (0.2 x 1.0) + (0.4 x 2.0) + (0.3 
x 3.0) + (0.1 x 4.0) = 2.3 years. Mean survival 
for therapy B = (0.05 x 1.0) + (0.15 x 2.0) + 
(0.45 x 3.0) + (0.35 x 4.0) = 3.1 years. 

Survival after a therapy is under the control 
of chance. Therapy A is a gamble character- 
ized by an average survival equal to 2.3 years. 
Therapy B is a gamble characterized by an av- 
erage survival of 3.1 years. If length of life is 
our criterion for choosing, we should select 
therapy B. 
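
The mean-survival calculation can be expressed as a short program. This is a minimal sketch in Python; the names expected_value, therapy_a, and therapy_b are ours. The probabilities are those of Table 3.5, and death is assumed to occur at the end of the year, as in the text.

```python
def expected_value(distribution):
    """Sum of outcome value times probability over all outcomes."""
    return sum(value * probability
               for value, probability in distribution.items())


# Years of survival mapped to probability (Table 3.5).
therapy_a = {1: 0.20, 2: 0.40, 3: 0.30, 4: 0.10}
therapy_b = {1: 0.05, 2: 0.15, 3: 0.45, 4: 0.35}

therapies = {"A": therapy_a, "B": therapy_b}
expected_survival = {name: expected_value(dist)
                     for name, dist in therapies.items()}

for name, years in expected_survival.items():
    print(f"therapy {name}: expected survival = {years:.1f} years")

# Pick the alternative with the highest expected value.
best = max(expected_survival, key=expected_survival.get)
print(f"preferred therapy by expected survival: {best}")
```

The same function applies to any outcome measure, provided that each distribution's probabilities sum to 1.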


3.5.2 Representation of Choices 
with Decision Trees 


The choice between therapies A and B is represented
diagrammatically in Fig. 3.7. Events
that are under the control of chance can be
represented by a chance node. By convention, a
chance node is shown as a circle from which
several lines emanate. Each line represents one
of the possible outcomes. Associated with each
line is the probability of the outcome occurring.
For a single patient, only one outcome can occur.
Some physicians object to using probability
for just this reason: “You cannot rely on population
data, because each patient is an individual.”
In fact, we often must use the frequency of
the outcomes of many patients experiencing the
same event to inform our opinion about what
might happen to an individual. From these
frequencies, we can make patient-specific adjustments
and thus estimate the probability of each
outcome at a chance node.

10 A more general term for expected-value decision
making is expected utility decision making. Because
a full treatment of utility is beyond the scope of this
chapter, we have chosen to use the term expected
value.

11 For this simple example, death during an interval is
assumed to occur at the end of the year.

[Figure 3.7, two chance nodes.
Treatment A: p = 0.20 survive 1 year; p = 0.40 survive 2 years; p = 0.30 survive 3 years; p = 0.10 survive 4 years. Expected survival = 2.3 years.
Treatment B: p = 0.05 survive 1 year; p = 0.15 survive 2 years; p = 0.45 survive 3 years; p = 0.35 survive 4 years. Expected survival = 3.1 years.]

Fig. 3.7 A chance-node representation of survival after the two therapies in Fig. 3.6. The probabilities times
the corresponding years of survival are summed to obtain the total expected survival

A chance node can represent more than 
just an event governed by chance. The out- 
come of a chance event, unknowable for the 
individual, can be represented by the expected 
value at the chance node. The concept of ex- 
pected value is important and is easy to un- 
derstand. We can calculate the mean survival 
that would be expected based on the probabilities
depicted by the chance node in Fig. 3.7.
This average length of life is called the
expected survival or, more generally, the ex- 
pected value of the chance node. We calculate 
the expected value at a chance node by the 
process just described: we multiply the sur- 
vival value associated with each possible out- 
come by the probability that that outcome will 
occur. We then sum the product of probability 
times survival over all outcomes. Thus, if several
hundred patients were assigned to receive
either therapy A or therapy B, the expected 
survival would be 2.3 years for therapy A and 
3.1 years for therapy B. 

We have just described the basis of expect- 
ed-value decision making. The term expected 
value is used to characterize a chance event, 
such as the outcome of a therapy. If the out- 
comes of a therapy are measured in units of 
duration of survival, units of sense of well- 
being, or dollars, the therapy is characterized 
by the expected duration of survival, expected 
sense of well-being, or expected monetary 
cost that it will confer on, or incur for, the pa- 
tient, respectively. 

To use expected-value decision making, we 
follow this strategy when there are therapy 
choices with uncertain outcomes: (1) calculate 
the expected value of each decision alternative 
and then (2) pick the alternative with the high- 
est expected value. 


3.5.3 Performance of a Decision 
Analysis 


We clarify the concepts of expected-value de- 
cision making by discussing an example. 
There are four steps in decision analysis: 

1. Create a decision tree; this step is the most
difficult, because it requires formulating
the decision problem, assigning probabilities,
and measuring outcomes.

2. Calculate the expected value of each decision
alternative.

3. Choose the decision alternative with the 
highest expected value. 

4. Use sensitivity analysis to test the conclu- 
sions of the analysis. 


Some health professionals hesitate when they 
first learn about the technique of decision 
analysis, because they recognize the opportu- 
nity for error in assigning values to both the 
probabilities and the utilities in a decision 
tree. They reason that the technique encour- 
ages decision making based on small differ- 
ences in expected values that are estimates at 
best. The defense against this concern, which 
also has been recognized by decision analysts, 
is the technique known as sensitivity analysis. 
We discuss this important fourth step in decision
analysis in Sect. 3.5.5. In addition, decision
analysis helps make the assumptions
underlying a decision explicit, so that the as- 
sumptions can be assessed carefully. 

The first step in decision analysis is to create 
a decision tree that represents the decision prob- 
lem. Consider the following clinical problem. 


» Example 3.12 


The patient is Mr. Danby, a 66-year-old man 
who has been crippled with arthritis of both 
knees so severely that, while he can get about the 
house with the aid of two canes, he must oth- 
erwise use a wheelchair. His other major health 
problem is emphysema, a disease in which the 
lungs lose their ability to exchange oxygen and 
carbon dioxide between blood and air, which in 
turn causes shortness of breath (dyspnea). He 
is able to breathe comfortably when he is in a 
wheelchair, but the effort of walking with canes 
makes him breathe heavily and feel uncom- 
fortable. Several years ago, he seriously con- 
sidered knee replacement surgery but decided 
against it, largely because his internist told him 
that there was a serious risk that he would not 
survive the operation because of his lung dis- 
ease. Recently, however, Mr. Danby’s wife had 
a stroke and was partially paralyzed; she now 
requires a degree of assistance that the patient 
cannot supply given his present state of mobil- 
ity. He tells his doctor that he is reconsidering 
knee replacement surgery. 


Mr. Danby’s internist is familiar with deci- 
sion analysis. She recognizes that this problem 
is filled with uncertainty: Mr. Danby’s ability to 
survive the operation is in doubt, and the sur- 
gery sometimes does not restore mobility to the 
degree required by such a patient. Furthermore, 
there is a small chance that the prosthesis (the 
artificial knee) will become infected, and Mr. 
Danby then would have to undergo a second 
risky operation to remove it. After removal of 
the prosthesis, Mr. Danby would never again be 
able to walk, even with canes. The possible out- 
comes of knee replacement include death from 
the first procedure and death from a second 
mandatory procedure if the prosthesis becomes 
infected (which we will assume occurs in the 
immediate postoperative period, if it occurs at 
all). Possible functional outcomes include 
recovery of full mobility or continued, and 
unchanged, poor mobility. Should Mr. Danby 
choose to undergo knee replacement surgery, 
or should he accept the status quo?


Using the conventions of decision analysis, 
the internist sketches the decision tree shown 
in Fig. 3.8. According to these conventions,
a square box denotes a decision node, and 
each line emanating from a decision node rep- 
resents an action that could be taken. 

According to the methods of expected- 
value decision making, the internist first must 
assign a probability to each branch of each 
chance node. To accomplish this task, the inter- 
nist asks several orthopedic surgeons for their 
estimates of the chance of recovering full func- 
tion after surgery (p[full recovery] = 0.60) and 
the chance of developing infection in the prosthetic
joint (p[infection] = 0.05). She uses her
subjective estimate of the probability that the 
patient will die during or immediately after 
knee surgery (p[operative death] = 0.05). 

[Figure 3.8, decision tree. Decision node: Surgery or No surgery. Branch labels: Operative death, Survival, Infection, No infection, Operative death, Survival, Full mobility, Poor mobility.]

Fig. 3.8 Decision tree for knee replacement surgery. The box represents the decision node (whether to have
surgery); the circles represent chance nodes

Table 3.6 Outcomes for Example 3.12

Survival   Functional status                        Years of full function
(years)                                             equivalent to outcome
10         Full mobility (successful surgery)       10
10         Poor mobility (status quo or             6
           unsuccessful surgery)
10         Wheelchair-bound (the outcome if a       3
           second surgery is necessary)
0          Death                                    0

Next, she must assign a value to each outcome.
To accomplish this task, she first lists the
outcomes. As you can see from Table 3.6, the
outcomes differ in two dimensions: length of life
(survival) and quality of life (functional status).
To characterize each outcome accurately, the
internist must develop a measure that takes into
account these two dimensions. Simply using
duration of survival is inadequate because Mr.
Danby values 5 years of good health more than
he values 10 years of poor health. The internist
can account for this trade-off factor by converting
outcomes with two dimensions into outcomes
with a single dimension: duration of
survival in good health. The resulting measure is
called a quality-adjusted life year (QALY).¹²

12 QALYs commonly are used as measures of utility
(value) in medical decision analysis and in health
policy analysis.

She can convert years in poor health into
years in good health by asking Mr. Danby to
indicate the shortest period in good health
(full mobility) that he would accept in return
for his full expected lifetime (10 years) in a
state of poor health (status quo). Thus, she
asks Mr. Danby: “Many people say they
would be willing to accept a shorter life in
excellent health in preference to a longer life
with significant disability. In your case, how
many years with normal mobility do you feel
is equivalent in value to 10 years in your current
state of disability?” She asks him this
question for each outcome. The patient’s
responses are shown in the third column of
Table 3.6. The patient decides that 10 years of
limited mobility are equivalent to 6 years of
normal mobility, whereas 10 years of wheelchair
confinement are equivalent to only 3
years of full function. Figure 3.9 shows the
final decision tree, complete with probability
estimates and utility values for each outcome.¹³

13 In a more sophisticated decision analysis, the clinician also
would adjust the utility values of outcomes that require surgery to
account for the pain and inconvenience associated with surgery and
rehabilitation. Other approaches to assessing utility are available
and may be preferable in some circumstances.

Fig. 3.9 Decision tree for knee-replacement surgery. Probabilities have been assigned to each branch of each chance node. The patient’s valuations of outcomes (measured in years of perfect mobility) are assigned to the tips of each branch of the tree. [Graphic not reproduced; tip values: Wheelchair-bound 3, Full mobility 10, Poor mobility 6.]

The second task that the internist must undertake is to calculate the
expected value, in healthy years, of surgery and of no surgery. She
calculates the expected value at each chance node, moving from right
(the tips of the tree) to left (the root of the tree). Let us
consider, for example, the expected value at the chance node
representing the outcome of surgery to remove an infected prosthesis
(Node A in Fig. 3.9). The calculation requires three steps:

1. Calculate the expected value of operative death after surgery to
remove an infected prosthesis. Multiply the probability of operative
death (0.05) by the QALY value of the outcome, death (0 years):
0.05 × 0 = 0 QALYs.

2. Calculate the expected value of surviving surgery to remove an
infected knee prosthesis. Multiply the probability of surviving the
operation (0.95) by the number of healthy years equivalent to 10
years of being wheelchair-bound (3 years): 0.95 × 3 = 2.85 QALYs.

3. Add the expected values calculated in step 1 (0 QALYs) and step 2
(2.85 QALYs) to obtain the expected value of developing an infected
prosthesis: 0 + 2.85 = 2.85 QALYs.


Similarly, the expected value at chance node B is calculated:
(0.6 × 10) + (0.4 × 6) = 8.4 QALYs. To obtain the expected value of
surviving knee replacement surgery (Node C), she proceeds as follows:

1. Multiply the expected value of an infected
prosthesis (already calculated as 2.85
QALYs) by the probability that the prosthesis
will become infected (0.05): 2.85 × 0.05 =
0.143 QALYs.

2. Multiply the expected value of never developing
an infected prosthesis (already
calculated as 8.4 QALYs) by the probability
that the prosthesis will not become infected
(0.95): 8.4 × 0.95 = 7.98 QALYs.

3. Add the expected values calculated in step
1 (0.143 QALYs) and step 2 (7.98 QALYs)
to get the expected value of surviving knee
replacement surgery: 0.143 + 7.98 = 8.123
QALYs.


The clinician performs this process, called av- 
eraging out at chance nodes, for node D as 
well, working back to the root of the tree, un- 
til the expected value of surgery has been cal- 
culated. The outcome of the analysis is as 
follows. For surgery, Mr. Danby’s average life 
expectancy, measured in years of normal mo- 
bility, is 7.7. What does this value mean? It 
does not mean that, by accepting surgery, 
Mr. Danby is guaranteed 7.7 years of mobile
life. One look at the decision tree will show 
that some patients die in surgery, some devel- 
op infection, and some do not gain any 
improvement in mobility after surgery. Thus, 
an individual patient has no guarantees. If the 
clinician had 100 similar patients who under- 
went the surgery, however, the average number 
of mobile years would be 7.7. We can under- 
stand what this value means for Mr. Danby 
only by examining the alternative: no surgery. 

In the analysis for no surgery, the average 
length of life, measured in years of normal 
mobility, is 6.0, which Mr. Danby considered 
equivalent to 10 years of continued poor mo- 
bility. Not all patients will experience this out- 
come; some who have poor mobility will live 
longer than, and some will live less than, 10 
years. The average length of life, however, ex- 
pressed in years of normal mobility, will be 6. 
Because 6.0 is less than 7.7, on average the 
surgery will provide an outcome with higher 
value to the patient. Thus, the internist recom- 
mends performing the surgery. 
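
The averaging-out procedure lends itself to a short program. The following is a minimal Python sketch, not part of the original analysis: it encodes the tree of Example 3.12 as nested lists of (probability, child) pairs, an encoding chosen here purely for illustration, and uses a hypothetical helper named `expected_value` to fold the tree back from the tips to the root.

```python
# Illustrative encoding (an assumption of this sketch): a chance node is a
# list of (probability, child) pairs; a leaf is its value in QALYs.

def expected_value(node):
    """Average out a chance node recursively; a leaf is returned as is."""
    if isinstance(node, (int, float)):
        return float(node)
    return sum(p * expected_value(child) for p, child in node)

# Probabilities and QALY values are those given in Example 3.12.
node_a = [(0.05, 0.0), (0.95, 3.0)]        # reoperation for infection
node_b = [(0.60, 10.0), (0.40, 6.0)]       # no infection: full or poor mobility
node_c = [(0.05, node_a), (0.95, node_b)]  # survived surgery
node_d = [(0.05, 0.0), (0.95, node_c)]     # surgery: operative death or survival

ev_surgery = expected_value(node_d)
ev_no_surgery = 6.0  # 10 years of poor mobility, valued at 6 QALYs

print(f"Node A: {expected_value(node_a):.2f}")  # 2.85
print(f"Node B: {expected_value(node_b):.2f}")  # 8.40
print(f"Node C: {expected_value(node_c):.2f}")  # 8.12
print(f"Surgery: {ev_surgery:.2f}")             # 7.72
print("Preferred:", "surgery" if ev_surgery > ev_no_surgery else "no surgery")
```

Rounded to one decimal place, the value at the root of the surgery branch is the 7.7 QALYs quoted in the text, which exceeds the 6.0 QALYs of no surgery.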

The key insight of expected-value decision 
making should be clear from this example: giv- 
en the unpredictable outcome in an individual, 
the best choice for the individual is the alterna- 
tive that gives the best result on the average in 
similar patients. Decision analysis can help the 
clinician to identify the therapy that will give 
the best results when averaged over many simi- 
lar patients. The decision analysis is tailored to 
a specific patient in that both the utility func- 
tions and the probability estimates are adjusted 
to the individual. Nonetheless, the results of 
the analysis represent the outcomes that would 
occur on average in a population of patients 
who have similar utilities and for whom uncer- 
tain events have similar probabilities. 


3.5.4 Representation of Patients’ 
Preferences with Utilities 


In Sect. 3.5.3, we introduced the concept of
QALYs, because length of life is not the only 
outcome about which patients care. Patients’ 
preferences for a health outcome may depend 
on the length of life with the outcome, on the 
quality of life with the outcome, and on the risk
involved in achieving the outcome (e.g., a cure 
for cancer might require a risky surgical opera- 
tion). How can we incorporate these elements 
into a decision analysis? To do so, we can repre- 
sent patients’ preferences with utilities. The util- 
ity of a health state is a quantitative measure of 
the desirability of a health state from the pa- 
tient’s perspective. Utilities are typically ex- 
pressed on a 0 to 1 scale, where 0 represents 
death and 1 represents ideal health. For example,
in a study of patients who had chest pain (angina)
with exercise, the patients rated the utility of mild,
moderate, and severe angina as 0.95, 0.92, and
0.82, respectively (Nease et al. 1995). There are
several methods for assessing utilities.

The standard-gamble technique has the 
strongest theoretical basis of the various ap- 
proaches to utility assessment, as shown by 
Von Neumann and Morgenstern and described 
by Sox et al. (1988). To illustrate use of the 
standard gamble, suppose we seek to assess a 
person’s utility for the health state of asymp- 
tomatic HIV infection. To use the standard 
gamble, we ask our subject to compare the de- 
sirability of asymptomatic HIV infection to 
those of two other health states whose utility 
we know or can assign. Often, we use ideal 
health (assigned a utility of 1) and immediate 
death (assigned a utility of 0) for the compari- 
son of health states. We then ask our subject to 
choose between asymptomatic HIV infection 
and a gamble with a chance of ideal health or 
immediate death. We vary the probability of 
ideal health and immediate death systemati- 
cally until the subject is indifferent between as- 
ymptomatic HIV infection and the gamble. For 
example, a subject might be indifferent when 
the probability of ideal health is 0.8 and the 
probability of death is 0.2. At this point of in- 
difference, the utility of the gamble and that of 
asymptomatic HIV infection are equal. We calculate
the utility of the gamble as the weighted
average of the utilities of each outcome of the
gamble: (1 × 0.8) + (0 × 0.2) = 0.8. Thus, in this
example, the utility of asymptomatic HIV infection
is 0.8. Use of the standard gamble enables
an analyst to assess the utility of outcomes
that differ in length or quality of life. Because
the standard gamble involves chance events, it
also assesses a person’s willingness to take
risks, called the person’s risk attitude.

A second common approach to utility as- 
sessment is the time-trade-off technique (Sox et 
al. 1988; Torrance and Feeny 1989). To assess 
the utility of asymptomatic HIV infection us- 
ing the time-trade-off technique, we ask a per- 
son to determine the length of time in a better 
state of health (usually ideal health or best at- 
tainable health) that he or she would find 
equivalent to a longer period of time with
asymptomatic HIV infection. For example, if our
subject says that 8 months of life with ideal
health is equivalent to 12 months of life with
asymptomatic HIV infection, then we calculate
the utility of asymptomatic HIV infection as
8/12 = 0.67. The time-trade-off technique
provides a convenient method for valuing out- 
comes that accounts for gains (or losses) in 
both length and quality of life. Because the 
time trade-off does not include gambles, how- 
ever, it does not assess a person’s risk attitude. 
Perhaps the strongest assumption underlying 
the use of the time trade-off as a measure of 
utility is that people are risk neutral. A risk- 
neutral decision maker is indifferent between 
the expected value of a gamble and the gamble 
itself. For example, a risk-neutral decision 
maker would be indifferent between the choice 
of living 20 years (for certain) and that of tak- 
ing a gamble with a 50% chance of living 40 
years and a 50% chance of immediate death 
(which has an expected value of 20 years). In 
practice, of course, few people are risk-neutral. 
Nonetheless, the time-trade-off technique is 
used frequently to value health outcomes be- 
cause it is relatively easy to understand. 
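
The two assessment techniques reduce to very small calculations, sketched below in Python using the numbers from the text. The function names are hypothetical labels introduced for this illustration. The last lines restate the risk-neutrality example: the gamble and the certain 20 years have the same expected value, which is exactly the condition under which a risk-neutral decision maker is indifferent.

```python
def standard_gamble_utility(p_ideal_health, u_ideal=1.0, u_death=0.0):
    """Utility of a health state, given the probability of ideal health at
    which the subject is indifferent between the state and the gamble."""
    return p_ideal_health * u_ideal + (1.0 - p_ideal_health) * u_death

def time_trade_off_utility(time_in_ideal_health, time_in_health_state):
    """Utility of a health state as the ratio of two equivalent durations."""
    return time_in_ideal_health / time_in_health_state

# Asymptomatic HIV infection, using the indifference points in the text.
u_standard_gamble = standard_gamble_utility(0.8)  # indifferent at p = 0.8
u_time_trade_off = time_trade_off_utility(8, 12)  # 8 months ~ 12 months

# Risk neutrality: 20 years for certain versus a 50:50 gamble between
# 40 years and immediate death.
ev_gamble = 0.5 * 40 + 0.5 * 0
ev_certain = 20.0

print(f"standard gamble utility: {u_standard_gamble:.2f}")
print(f"time trade-off utility:  {u_time_trade_off:.2f}")
print("expected values equal:", ev_gamble == ev_certain)
```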

Several other approaches are available to 
value health outcomes. To use the visual analog 
scale, a person simply rates the quality of life 
with a health outcome (e.g., asymptomatic 
HIV infection) on a scale from 0 to 100. 
Although the visual analog scale is easy to ex- 
plain and use, it has no theoretical justification 
as a valid measure of utility. Ratings with the 
visual analog scale, however, correlate modest- 
ly well with utilities assessed by the standard 
gamble and time trade-off. For a demonstration
of the use of standard gambles, time trade-offs,
and the visual analog scale to assess
utilities in patients with angina, see Nease et al.
(1995); in patients living with HIV, see Joyce et
al. (2009, 2012). Other approaches to
valuing health outcomes include the Quality of 
Well-Being Scale, the Health Utilities Index, 
and the EuroQoL (see Neumann et al. 2017, 
ch. 7). Each of these instruments assesses how 
people value health outcomes and therefore 
may be appropriate for use in decision analyses 
or cost-effectiveness analyses. 

In summary, we can use utilities to represent 
how patients value complicated health out- 
comes that differ in length and quality of life 
and in riskiness. Computer-based tools with an 
interactive format have been developed for as- 
sessing utilities; they often include text and mul- 
timedia presentations that enhance patients’ 
understanding of the assessment tasks and of 
the health outcomes (Sumner et al. 1991; Nease 
and Owens 1994; Lenert et al. 1995). 


3.5.5 Performance of Sensitivity 
Analysis 


Sensitivity analysis is a test of the robustness 
of the conclusions of an analysis over a wide 
range of assumptions about the probabilities 
and the values, or utilities. The probability of 
an outcome at a chance node may be the best 
estimate that is available, but there often is a 
wide range of reasonable probabilities that a 
clinician could use with nearly equal confi- 
dence. We use sensitivity analysis to answer 
this question: Do my conclusions regarding 
the preferred choice change when the proba- 
bility and outcome estimates are assigned val- 
ues that lie within a reasonable range? 

The knee-replacement decision in Example 3.12 illustrates the power
of sensitivity analysis. If the conclusions of the analysis (surgery
is preferable to no surgery) remain the same despite a wide range of
assumed values for the probabilities and outcome measures, the
recommendation is trustworthy. Figures 3.10 and 3.11 show the
expected survival in healthy years with surgery and without surgery
under varying assumptions of the probability of operative death and
the probability of attaining perfect mobility, respectively. Each
point (value) on these lines represents one calculation of expected
survival using the tree in Fig. 3.8. Figure 3.10 shows that expected
survival is higher with surgery over a wide range of operative
mortality rates. Expected survival is lower with surgery, however,
when the operative mortality rate exceeds 25%. Figure 3.11 shows the
effect of varying the probability that the operation will lead to
perfect mobility. The expected survival, in healthy years, is higher
for surgery as long as the probability of perfect mobility exceeds
20%, a much lower figure than is expected from previous experience
with the operation. (In Example 3.12, the consulting orthopedic
surgeons estimated the chance of full recovery at 60%.) Thus, the
internist can proceed with confidence to recommend surgery. Mr. Danby
cannot be sure of a good outcome, but he has valid reasons for
thinking that he is more likely to do well with surgery than he is
without it.

Fig. 3.10 Sensitivity analysis of the effect of operative mortality on length of healthy life (Example 3.12). As the probability of operative death increases, the relative values of surgery versus no surgery change. The point at which the two lines cross represents the probability of operative death at which no surgery becomes preferable. The solid line represents the preferred option at a given probability. [Graphic not reproduced; x-axis: probability of operative death (0 to 0.5); y-axis: expected years of healthy life.]

Fig. 3.11 Sensitivity analysis of the effect of a successful operative result on length of healthy life (Example 3.12). As the probability of a successful surgical result increases, the relative values of surgery versus no surgery change. The point at which the two lines cross represents the probability of a successful result at which surgery becomes preferable. The solid line represents the preferred option at a given probability. [Graphic not reproduced; x-axis: probability of perfect mobility (0 to 1.0); y-axis: expected years of healthy life.]

Another way to state the conclusions of a sensitivity analysis is to
indicate the range of probabilities over which the conclusions apply.
The point at which the two lines in Fig. 3.10 cross is the
probability of operative death at which the two therapy options have
the same expected survival. If expected survival is to be the basis
for choosing therapy, the internist and the patient should be
indifferent between surgery and no surgery when the probability of
operative death is 25%.¹⁴ When the probability is lower, they should
select surgery. When it is higher, they should select no surgery.
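
A one-way sensitivity analysis of this kind can be sketched in a few lines of Python. The sketch below is illustrative only. It assumes that only the mortality of the initial operation is varied, with every other probability held at its value in Example 3.12; how the original figure treats the mortality of a second operation is not stated, so that choice is an assumption. The function and parameter names are hypothetical.

```python
EV_NO_SURGERY = 6.0  # QALYs for the no-surgery branch (Example 3.12)

def ev_surgery(p_op_death=0.05, p_infection=0.05, p_full_mobility=0.60,
               p_reop_death=0.05):
    """Expected QALYs of the surgery branch, averaged out from tips to root."""
    ev_infection = (1 - p_reop_death) * 3.0                                # Node A
    ev_no_infection = p_full_mobility * 10.0 + (1 - p_full_mobility) * 6.0 # Node B
    ev_survival = (p_infection * ev_infection
                   + (1 - p_infection) * ev_no_infection)                  # Node C
    return (1 - p_op_death) * ev_survival                                  # Node D

def break_even_operative_death():
    """Operative mortality at which surgery and no surgery are equivalent.

    The surgery branch is linear in p_op_death, so the crossover solves
    (1 - p) * EV(survival) = EV(no surgery).
    """
    return 1 - EV_NO_SURGERY / ev_surgery(p_op_death=0.0)

# Recompute the tree at several assumed operative mortality rates.
for p in (0.0, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50):
    ev = ev_surgery(p_op_death=p)
    preferred = "surgery" if ev > EV_NO_SURGERY else "no surgery"
    print(f"p(operative death) = {p:.2f}  EV(surgery) = {ev:5.2f}  -> {preferred}")

threshold = break_even_operative_death()
print(f"break-even operative mortality: {threshold:.3f}")
```

Under these assumptions the crossover falls a little above 0.25, in line with the approximately 25% crossover that the text reads from Fig. 3.10.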

14 An operative mortality rate of 25% may seem high; however, this
value is correct when we use QALYs as the basis for choosing
treatment. A decision maker performing a more sophisticated analysis
could use a utility function that reflects the patient’s aversion to
risking death.

The approach to sensitivity analyses we have described enables the
analyst to understand how uncertainty in one, two, or three
parameters affects the conclusions of an analysis. But in a complex
problem, a decision tree or decision model may have 100 or more
parameters. The analyst may have uncertainty about many parameters in
a model. Probabilistic sensitivity analysis is an approach for
understanding how the uncertainty in all (or a large number of) model
parameters affects the conclusion of a decision analysis. To perform
a probabilistic sensitivity analysis, the analyst must specify a
probability distribution for each model parameter. The analytic
software then chooses a value for each model parameter randomly from
the parameter’s probability distribution. The software then uses this
set of parameter values and calculates the outcomes for each
alternative. For each evaluation of the model, the software will
determine which alternative is preferred. The process is usually
repeated 10,000 to 100,000 times. From the probabilistic sensitivity
analysis, the analyst can determine the proportion of times an
alternative is preferred, accounting for all uncertainty in model
parameters simultaneously. For more information on this advanced
topic, see the article by Briggs and colleagues referenced at the end
of the chapter.
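
To make the procedure concrete, here is a minimal Python sketch of a probabilistic sensitivity analysis applied to the knee-replacement tree of Example 3.12. The beta distributions and their parameters are assumptions invented for this illustration (they are centered on the point estimates in the example but are not taken from the text), the mortality of a second operation is held fixed for brevity, and the function names are hypothetical.

```python
import random

EV_NO_SURGERY = 6.0  # QALYs for the no-surgery branch (Example 3.12)

def ev_surgery(p_op_death, p_infection, p_full_mobility, p_reop_death=0.05):
    """Expected QALYs of the surgery branch for one set of parameter values."""
    ev_infection = (1 - p_reop_death) * 3.0
    ev_no_infection = p_full_mobility * 10.0 + (1 - p_full_mobility) * 6.0
    ev_survival = (p_infection * ev_infection
                   + (1 - p_infection) * ev_no_infection)
    return (1 - p_op_death) * ev_survival

def probabilistic_sensitivity_analysis(n_draws=10_000, seed=0):
    """Proportion of random parameter draws in which surgery is preferred."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    surgery_preferred = 0
    for _ in range(n_draws):
        # Assumed distributions (hypothetical): Beta(5, 95) has mean 0.05,
        # and Beta(60, 40) has mean 0.60.
        p_op_death = rng.betavariate(5, 95)
        p_infection = rng.betavariate(5, 95)
        p_full_mobility = rng.betavariate(60, 40)
        if ev_surgery(p_op_death, p_infection, p_full_mobility) > EV_NO_SURGERY:
            surgery_preferred += 1
    return surgery_preferred / n_draws

proportion = probabilistic_sensitivity_analysis()
print(f"surgery preferred in {proportion:.1%} of draws")
```

A full analysis would also place distributions on the utilities and would usually report the distribution of the difference in expected value, not only the proportion of draws in which each alternative is preferred.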


3.5.6 Representation of Long-Term 
Outcomes with Markov 
Models 


In Example 3.12, we evaluated Mr. Danby’s
decision to have surgery to improve his mobility, 
which was compromised by arthritis. We as- 
sumed that each of the possible outcomes (full 
mobility, poor mobility, death, etc.) would occur 
shortly after Mr. Danby took action on his deci- 
sion. But what if we want to model events that 
might occur in the distant future? For example, 
a patient with HIV infection might develop 
AIDS 10-15 years after infection; thus, a thera- 
py to prevent or delay the development of AIDS 
could affect events that occur 10-15 years, or 
more, in the future. A similar problem arises in 
analyses of decisions regarding many chronic 
diseases: we must model events that occur over 
the lifetime of the patient. The decision tree rep- 
resentation is convenient for decisions for which 
all outcomes occur during a short time horizon, 
but it is not always sufficient for problems that 
include events that could occur in the future. 
How can we include such events in a decision 
analysis? The answer is to use Markov models 
(Beck and Pauker 1983; Sonnenberg and Beck 
1993; Siebert et al. 2012). 

Fig. 3.12 A simple Markov model. The states of health that a person can experience are indicated by the circles; arrows represent allowed transitions between health states. [Graphic not reproduced.]

To build a Markov model, we first specify the set of health states
that a person could experience (e.g., Well, Cancer, and Death in
Fig. 3.12). We then specify the transition probabilities, which are
the probabilities that a person will transit from one of these health
states to another during a specified time period. This period, often
1 month or 1 year, is the length of the Markov cycle. The Markov
model then simulates the transitions among health states for a person
(or for a hypothetical cohort of people) for a specified number of
cycles; by using a Markov model, we can calculate the probability
that a person will be in each of the health states at any time in the
future. As an illustration, consider a simple Markov model that has
three health states: Well, Cancer, and Death (see Fig. 3.12). We have
specified each of the transition probabilities in Table 3.7 for the
cycle length of 1 year. Thus, we note from Table 3.7 that a person
who is in the well state will remain well with probability 0.9, will
develop cancer with probability 0.06, and will die from non-cancer
causes with probability 0.04 during 1 year. The calculations for a
Markov model are performed by computer software. Based on the
transition probabilities in Table 3.7, the probabilities that a
person is well, has cancer, or has died at the end of each year are
shown in Table 3.8. We can also determine from a Markov model the
expected length of time that a person spends in each health state.
Therefore, we can determine life expectancy, or quality-adjusted life
expectancy, for any alternative represented by a Markov model.
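
The cohort calculation is simple enough to sketch directly. The following Python fragment is an illustration, not the software the authors used: it applies the annual transition probabilities of Table 3.7 to a cohort that starts entirely in the Well state and prints the state probabilities at the end of each of 7 years, which can be compared with Table 3.8. The dictionary layout and function names are choices made for this sketch.

```python
STATES = ("Well", "Cancer", "Death")

# Annual transition probabilities from Table 3.7: TRANSITIONS[src][dst].
TRANSITIONS = {
    "Well":   {"Well": 0.90, "Cancer": 0.06, "Death": 0.04},
    "Cancer": {"Well": 0.00, "Cancer": 0.40, "Death": 0.60},
    "Death":  {"Well": 0.00, "Cancer": 0.00, "Death": 1.00},
}

def step(distribution):
    """Advance the cohort by one Markov cycle (here, 1 year)."""
    return {
        dst: sum(distribution[src] * TRANSITIONS[src][dst] for src in STATES)
        for dst in STATES
    }

def simulate(n_cycles):
    """Return the state distribution at the end of each cycle."""
    distribution = {"Well": 1.0, "Cancer": 0.0, "Death": 0.0}
    history = []
    for _ in range(n_cycles):
        distribution = step(distribution)
        history.append(distribution)
    return history

history = simulate(7)
for year, dist in enumerate(history, start=1):
    row = "  ".join(f"{state}: {dist[state]:.4f}" for state in STATES)
    print(f"Year {year}  {row}")
```

Note that the Death column accumulates deaths from both the Well state and the Cancer state.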

In decision analyses that represent long-term outcomes, the analysts
will often use a Markov model in conjunction with a decision tree to
model the decision (Owens et al. 1995; Salpeter et al. 1997; Sanders
et al. 2005; Lin et al. 2018). The analyst models the effect of an
intervention as a change in the probability of going from one state
to another. For example, we could model a cancer-prevention
intervention (such as screening for breast cancer with mammography)
as a reduction in the transition probability from Well to Cancer in
Fig. 3.12. (See the articles by Beck and Pauker (1983) and
Sonnenberg and Beck (1993) for further explanation of the use of
Markov models.)
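
As a rough illustration of modeling an intervention this way, the Python sketch below compares life expectancy in the three-state model under the baseline Well-to-Cancer probability of 0.06 (Table 3.7) and under a lower value of 0.03. The value 0.03 is purely hypothetical and is not an estimate of the effect of any real intervention. The sketch counts the cohort alive at the end of each yearly cycle and applies no half-cycle correction or discounting, both simplifying assumptions.

```python
def life_expectancy(p_well_to_cancer, n_cycles=200):
    """Expected years alive for a cohort that starts in the Well state.

    Other transition probabilities are those of Table 3.7. Survival is
    counted at the end of each 1-year cycle (a simplifying assumption).
    """
    p_well_to_death = 0.04
    p_well_to_well = 1.0 - p_well_to_cancer - p_well_to_death
    p_cancer_to_cancer = 0.40
    well, cancer = 1.0, 0.0
    years_alive = 0.0
    for _ in range(n_cycles):
        # The right-hand side is evaluated before assignment, so both
        # updates use the state probabilities from the previous cycle.
        well, cancer = (well * p_well_to_well,
                        well * p_well_to_cancer + cancer * p_cancer_to_cancer)
        years_alive += well + cancer
    return years_alive

baseline = life_expectancy(0.06)
with_intervention = life_expectancy(0.03)  # hypothetical effect size

print(f"baseline life expectancy:          {baseline:.2f} years")
print(f"life expectancy with intervention: {with_intervention:.2f} years")
print(f"gain:                              {with_intervention - baseline:.2f} years")
```

Under these assumptions life expectancy rises from about 10 years to about 14 years. Weighting each state by a utility before summing would give quality-adjusted life expectancy instead.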


3.6 The Decision Whether to Treat, 
Test, or Do Nothing 


The clinician who is evaluating a patient’s 
symptoms and suspects a disease must choose 
among the following actions: 


Table 3.7 Transition probabilities for the Markov model in Fig. 3.12

Health state transition   Annual probability
Well to well              0.9
Well to cancer            0.06
Well to death             0.04
Cancer to well            0.0
Cancer to cancer          0.4
Cancer to death           0.6
Death to well             0.0
Death to cancer           0.0
Death to death            1.0

1. Do nothing further (neither perform additional
tests nor treat the patient).

2. Obtain additional diagnostic information 
(test) before choosing whether to treat or 
do nothing. 

3. Treat without obtaining more informa- 
tion. 


When the clinician knows the patient’s true state, 
testing is unnecessary, and the doctor needs only 
to assess the trade-offs among therapeutic options
(as in Example 3.12). Learning the patient’s
true state, however, may require costly,
time-consuming, and often risky diagnostic pro- 
cedures that may give misleading FP or FN re- 
sults. Therefore, clinicians often are willing to 
treat a patient even when they are not absolutely 
certain about a patient’s true state. There are 
risks in this course: the clinician may withhold 
therapy from a person who has the disease of 
concern, or he may administer therapy to some- 
one who does not have the disease yet may suffer 
undesirable side effects of therapy. 

Deciding among treating, testing, and do- 
ing nothing sounds difficult, but you have al- 
ready learned all the principles that you need 
to solve this kind of problem. There are three 
steps: 

1. Determine the treatment threshold proba- 
bility of disease. 

2. Determine the pretest probability of dis- 
ease. 

3. Decide whether a test result could affect 
your decision to treat. 


Table 3.8 Probability of future health states for the Markov model in Fig. 3.12

Health state   Probability of health state at end of year
               Year 1   Year 2   Year 3   Year 4   Year 5   Year 6   Year 7
Well           0.9000   0.8100   0.7290   0.6561   0.5905   0.5314   0.4783
Cancer         0.0600   0.0780   0.0798   0.0757   0.0696   0.0633   0.0572
Death          0.0400   0.1120   0.1912   0.2682   0.3399   0.4053   0.4645

The treatment threshold probability of disease is the probability of
disease at which you should be indifferent between treating and not
treating (Pauker and Kassirer 1980). Below the treatment threshold,
you should not treat. Above the treatment threshold, you should treat
(Fig. 3.13). Whether to treat when the diagnosis is not certain is a
problem that you can solve with a decision tree, such as the one
shown in Fig. 3.14.

Fig. 3.13 Depiction of the treatment threshold probability. At probabilities of disease that are less than the treatment threshold probability, the preferred action is to withhold therapy. At probabilities of disease that are greater than the treatment threshold probability, the preferred action is to treat. [Graphic not reproduced: a probability-of-disease axis running from 0 to 1.0 with the treatment-threshold probability marked.]

Fig. 3.14 Decision tree with which to calculate the treatment threshold probability of disease. By setting the utilities of the treat and do not treat choices to be equal, we can compute the probability at which the clinician and patient should be indifferent to the choice. Recall that p[−D] = 1 − p[D]. [Graphic not reproduced; each decision branch leads to a chance node:
  Treat:         Disease present: U(D, treat);          Disease absent: U(−D, treat)
  Do not treat:  Disease present: U(D, do not treat);   Disease absent: U(−D, do not treat)]

You can use this tree to learn the treatment threshold probability of
disease by leaving the probability of disease as an unknown, setting
the expected value of surgery equal to the expected value for medical
(i.e., nonsurgical, such as drugs or physical therapy) treatment, and
solving for the probability of disease. (In this example, surgery
corresponds to the “treat” branch of the tree in Fig. 3.14, and
nonsurgical intervention corresponds to the “do not treat” branch.)
Because you are indifferent between medical treatment and surgery at
this probability, it is the treatment threshold probability. Using
the tree completes step 1. In practice, people often determine the
treatment threshold intuitively rather than analytically.

An alternative approach to determination of the treatment threshold
probability is to use the equation:

    p* = H / (H + B),

where p* = the treatment threshold probability, H = the harm
associated with treatment of a nondiseased patient, and B = the
benefit associated with treatment of a diseased patient (Pauker and
Kassirer 1980; Sox et al. 1988). We define B as the difference
between the utility (U) of diseased patients who are treated and
diseased patients who are not treated (U[D, treat] − U[D, do not
treat], as shown in Fig. 3.14). The utility of diseased patients who
are treated should be greater than that of diseased patients who are
not treated; therefore, B is positive. We define H as the difference
in utility of nondiseased patients who are not treated and
nondiseased patients who are treated (U[−D, do not treat] − U[−D,
treat], as shown in Fig. 3.14). The utility of nondiseased patients
who are not treated should be greater than that of nondiseased
patients who are treated; therefore, H is positive. The equation for
the treatment threshold probability fits with our intuition: if the
benefit of treatment is small and the harm of treatment is large, the
treatment threshold probability will be high. In contrast, if the
benefit of treatment is large and the harm of treatment is small, the
treatment threshold probability will be low.
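
The threshold equation and the tree give the same answer, which a short calculation can confirm. The Python sketch below uses four utilities that are invented solely for illustration (they do not come from the text or from any study); it computes p* from H and B and then checks that the expected utilities of the treat and do not treat branches are equal at that probability.

```python
def treatment_threshold(u_d_treat, u_d_no_treat, u_nd_treat, u_nd_no_treat):
    """Treatment threshold probability p* = H / (H + B)."""
    benefit = u_d_treat - u_d_no_treat   # B: gain from treating the diseased
    harm = u_nd_no_treat - u_nd_treat    # H: loss from treating the nondiseased
    if benefit <= 0 or harm <= 0:
        raise ValueError("B and H must both be positive")
    return harm / (harm + benefit)

def expected_utility(p_disease, u_disease, u_no_disease):
    """Expected utility of one decision branch of the tree in Fig. 3.14."""
    return p_disease * u_disease + (1 - p_disease) * u_no_disease

# Hypothetical utilities, chosen only to illustrate the calculation.
U_D_TREAT, U_D_NO_TREAT = 0.90, 0.45
U_ND_TREAT, U_ND_NO_TREAT = 0.95, 1.00

p_star = treatment_threshold(U_D_TREAT, U_D_NO_TREAT, U_ND_TREAT, U_ND_NO_TREAT)
eu_treat = expected_utility(p_star, U_D_TREAT, U_ND_TREAT)
eu_no_treat = expected_utility(p_star, U_D_NO_TREAT, U_ND_NO_TREAT)

print(f"treatment threshold p* = {p_star:.2f}")
print(f"EU(treat) = {eu_treat:.3f}, EU(do not treat) = {eu_no_treat:.3f}")
```

With a large benefit (0.45) and a small harm (0.05), the threshold is low, as the text predicts.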

Once you know the pretest probability, 
you know what to do in the absence of further 
information about the patient. If the pretest 
probability is below the treatment threshold, 
you should not treat the patient. If the pretest 
probability is above the threshold, you should 
treat the patient. Thus, you have completed 
step 2. 

One of the guiding principles of medical de- 
cision making is this: do not order a test unless it 
could change your management of the patient. 
In our framework for decision making, this 
principle means that you should order a test 
only if the test result could cause the probability 
of disease to cross the treatment threshold or 
lead to another test that would do so. Thus, if 
the pretest probability is above the treatment 
threshold, a negative test result must lead to a 
post-test probability that is below the threshold. 
Conversely, if the pretest probability is below the 
threshold probability, a positive result must lead 
to a post-test probability that is above the thresh- 
old. In either case, the test result would alter 
your decision of whether to treat the patient. 
This analysis completes step 3. 

To decide whether a test could alter man- 
agement, we simply use Bayes’ theorem. We 
calculate the post-test probability after a test 
result that would move the probability of dis- 
ease toward the treatment threshold. If the 
pretest probability is above the treatment 
threshold, we calculate the probability of dis- 
ease if the test result is negative. If the pretest 
probability is below the treatment threshold, 
we calculate the probability of disease if the 
test result is positive. 
Example 3.13


You are a pulmonary medicine specialist. You 
suspect that a patient of yours has a pulmonary 
embolus (blood clot lodged in the vessels of the 
lungs). One approach is to do a computed tomog- 
raphy angiography (CTA) scan, a test in which a 
computed tomography (CT) of the lung is done 
after a radiopaque dye is injected into a vein. The 
dye flows into the vessels of the lung. The CT 
scan can then assess whether the blood vessels are 
blocked. If the scan is negative, you do no further 
tests and do not treat the patient.


To decide whether this strategy is correct, you
take the following steps:

1. Determine the treatment threshold proba- 
bility of pulmonary embolus. 

2. Estimate the pretest probability of pulmo- 
nary embolus. 

3. Decide whether a test result could affect 
your decision to treat for an embolus. 


First, assume you decide that the treatment 
threshold should be 0.10 in this patient. What 
does it mean to have a treatment threshold prob- 
ability equal to 0.10? If you could obtain no fur- 
ther information, you would treat for pulmonary 
embolus if the pretest probability was above 
0.10 (i.e., if you believed that there was greater 
than a 1 in 10 chance that the patient had an 
embolus), and would withhold therapy if the 
pretest probability was below 0.10. A decision to 
treat when the pretest probability is at the treat- 
ment threshold means that you are willing to 
treat nine patients without pulmonary embolus 
to be sure of treating one patient who has pul- 
monary embolus. A relatively low treatment 
threshold is justifiable because treatment of a 
pulmonary embolism with blood-thinning med- 
ication substantially reduces the high mortality 
of pulmonary embolism, whereas there is only a 
relatively small danger (mortality of less than 
1%) in treating someone who does not have pul- 
monary embolus. Because the benefit of treat- 
ment is high and the harm of treatment is low, 
the treatment threshold probability will be low, 
as discussed earlier. You have completed step 1. 

You estimate the pretest probability of 
pulmonary embolus to be 0.05, which is equal 
to a pretest odds of 0.053. Because the pretest 
probability is lower than the treatment threshold,
you should do nothing unless a positive
CTA scan result could raise the probability of 
pulmonary embolus to above 0.10. You have 
completed step 2. 

To decide whether a test result could affect 
your decision to treat, you must decide whether 
a positive CTA scan result would raise the prob- 
ability of pulmonary embolism to more than 
0.10, the treatment threshold. You review the 
literature and learn that the LR for a positive 
CTA scan is approximately 21 (Stein et al. 2006). 

A negative CTA scan result will move the 
probability of disease away from the treatment 
threshold and will be of no help in deciding 
what to do. A positive result will move the 
probability of disease toward the treatment 
threshold and could alter your management 
decision if the post-test probability were above 
the treatment threshold. You therefore use the 
odds-ratio form of Bayes’ theorem to calculate 
the post-test probability of disease if the CTA
scan result is positive.


$$\text{post-test odds} = \text{pretest odds} \times \text{LR} = 0.053 \times 21 = 1.11.$$


A post-test odds of 1.1 is equivalent to a prob- 
ability of disease of 0.53. Because the post- 
test probability of pulmonary embolus is 
higher than the treatment threshold, a posi- 
tive CTA scan result would change your man- 
agement of the patient, and you should order 
the CTA scan. You have completed step 3.
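The three steps of this example can be sketched in a few lines of Python. This is only an illustration under the numbers quoted in the text (pretest probability 0.05, treatment threshold 0.10, LR of 21 for a positive result); the function names are hypothetical and are not part of any published tool.

```python
def probability_to_odds(p: float) -> float:
    """Convert a probability to odds: p / (1 - p)."""
    return p / (1.0 - p)


def odds_to_probability(odds: float) -> float:
    """Convert odds back to a probability: odds / (1 + odds)."""
    return odds / (1.0 + odds)


def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Odds-likelihood form of Bayes' theorem: post-test odds = pretest odds x LR."""
    post_test_odds = probability_to_odds(pretest_probability) * likelihood_ratio
    return odds_to_probability(post_test_odds)


def positive_result_crosses_threshold(pretest: float, threshold: float, lr_positive: float) -> bool:
    """For a pretest probability below the threshold, the test is worth ordering
    only if a positive result would lift the probability above the threshold."""
    return post_test_probability(pretest, lr_positive) > threshold


# Values taken from the worked example in the text.
PRETEST = 0.05
THRESHOLD = 0.10
LR_POSITIVE = 21.0

post = post_test_probability(PRETEST, LR_POSITIVE)
print(f"pretest odds   = {probability_to_odds(PRETEST):.3f}")
print(f"post-test prob = {post:.3f}")
print("order the test:", positive_result_crosses_threshold(PRETEST, THRESHOLD, LR_POSITIVE))
```

Without intermediate rounding the post-test probability comes out near 0.525; the text's 0.53 reflects rounding the odds to 1.11 first. Either way the result is well above the 0.10 threshold.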

This example is especially useful for two 
reasons: first, it demonstrates one method for 
making decisions and second, it shows how 
the concepts that were introduced in this 
chapter all fit together in a clinical example of 
medical decision making. 


3.7 Alternative Graphical 
Representations for Decision 
Models: Influence Diagrams 
and Belief Networks 


In Sects. 3.5 and 3.6, we used decision trees
to represent decision problems. Although de- 
cision trees are the most common graphical 
representation for decision problems, influ- 
ence diagrams are an important alternative 


representation for such problems (Nease and 
Owens 1997; Owens et al. 1997). 

As shown in Fig. 3.15, influence diagrams
have certain features that are similar to
decision trees, but they also have additional
graphical elements. Influence diagrams represent
decision nodes as squares and chance
nodes as circles. In contrast to decision trees,
however, the influence diagram also has arcs
between nodes and a diamond-shaped value
node. An arc between two chance nodes indicates
that a probabilistic relationship may exist
between the chance nodes (Owens et al. 1997).
A probabilistic relationship exists when the
occurrence of one chance event affects the
probability of the occurrence of another chance
event. For example, in Fig. 3.15, the probability
of a positive or negative PCR test result
(PCR result) depends on whether a person has
HIV infection (HIV status); thus, these nodes
have a probabilistic relationship, as indicated
by the arc. The arc points from the conditioning
event to the conditioned event (PCR test result
is conditioned on HIV status in Fig. 3.15).
The absence of an arc between two chance
nodes, however, always indicates that the nodes
are independent or conditionally independent.
Two events are conditionally independent, given
a third event, if the occurrence of one of the
events does not affect the probability of the
other event conditioned on the occurrence of
the third event.

Fig. 3.15 A decision tree (top) and an influence diagram
(bottom) that represent the decisions to test for, and to
treat, HIV infection. The structural asymmetry of the
alternatives is explicit in the decision tree. The influence diagram
highlights probabilistic relationships. HIV human
immunodeficiency virus, HIV+ HIV infected, HIV- not infected
with HIV, QALE quality-adjusted life expectancy, PCR
polymerase chain reaction. Test results are shown in quotation
marks (“HIV+”), whereas the true disease state is
shown without quotation marks (HIV+). (Source: Owens
et al. (1997). Reproduced with permission)

Unlike a decision tree, in which the events
usually are represented from left to right in
the order in which the events are observed,
influence diagrams use arcs to indicate the timing
of events. An arc from a chance node to a
decision node indicates that the chance event
has been observed at the time the decision is
made. Thus, the arc from PCR result to Treat?
in Fig. 3.15 indicates that the decision maker
knows the PCR test result (positive, negative,
or not obtained) when he or she decides
whether to treat. Arcs between decision nodes
indicate the timing of decisions: the arc points
from an initial decision to subsequent decisions.
Thus, in Fig. 3.15, the decision maker
must decide whether to obtain a PCR test before
deciding whether to treat, as indicated by
the arc from Obtain PCR? to Treat?

The probabilities and utilities that we need to 
determine the alternative with the highest ex- 
pected value are contained in tables associated 
with chance nodes and the value node (Fig.
3.16). These tables contain the same information
that we would use in a decision tree. With a decision
tree, we can determine the expected value of
each alternative by averaging out at chance
nodes and folding back the tree (Sect. 3.5.3).


Fig. 3.16 The influence diagram from Fig. 3.15, with
the probability and value tables associated with the nodes.
The information in these tables is the same as that associated
with the branches and endpoints of the decision tree
in Fig. 3.15. HIV human immunodeficiency virus, HIV+
HIV infected, HIV- not infected with HIV, QALE quality-adjusted
life expectancy, PCR polymerase chain reaction,
NA not applicable, Tx+ treated, Tx- not treated. Test
results are shown in quotation marks (“HIV+”), and the true
disease state is shown without quotation marks (HIV+).
(Source: Owens et al. (1997). Reproduced with permission)

Decision nodes: Obtain PCR? (Yes/No); Treat? (Yes/No)

Probability of test results conditioned on
disease status and decision to test

Decision to test  | Disease status | "HIV+" | "HIV-" | "NA"
Obtain PCR        | HIV+           | 0.98   | 0.02   | 0.0
Obtain PCR        | HIV-           | 0.02   | 0.98   | 0.0
Do not obtain PCR | HIV+           | 0.00   | 0.00   | 1.0
Do not obtain PCR | HIV-           | 0.00   | 0.00   | 1.0

Prior probability of HIV

HIV+ | HIV-
0.08 | 0.92

Value table

Outcome   | QALE
HIV+, Tx+ | 10.50
HIV+, Tx- | 10.00
HIV-, Tx+ | 75.46
HIV-, Tx- | 75.50

For influence diagrams, the calculation of expected
value is more complex (Owens et al.
1997), and generally must be performed with 
computer software. With the appropriate soft- 
ware, we can use influence diagrams to perform 
the same analyses that we would perform with a 
decision tree. Diagrams that have only chance 
nodes are called belief networks; we use them to 
perform probabilistic inference. 
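To make the "averaging out and folding back" calculation concrete, the sketch below evaluates the test-and-treat decision in Python. It assumes the values shown in Figs. 3.15 and 3.16 (prior probability of HIV 0.08, PCR sensitivity and specificity 0.98, and QALE values of 10.50, 10.00, 75.46, and 75.50 for the four combinations of infection and treatment). The function names are hypothetical, and the code folds back the equivalent decision tree rather than running a true influence-diagram algorithm.

```python
# Assumed inputs, read from Figs. 3.15 and 3.16.
PRIOR_HIV = 0.08      # prior probability of HIV infection
SENSITIVITY = 0.98    # P("HIV+" | HIV+)
SPECIFICITY = 0.98    # P("HIV-" | HIV-)

# QALE keyed by (infected, treated).
QALE = {
    (True, True): 10.50,
    (True, False): 10.00,
    (False, True): 75.46,
    (False, False): 75.50,
}


def expected_qale(p_hiv: float, treat: bool) -> float:
    """Average out the HIV-status chance node for one treatment choice."""
    return p_hiv * QALE[(True, treat)] + (1.0 - p_hiv) * QALE[(False, treat)]


def best_expected_qale(p_hiv: float) -> float:
    """Fold back the Treat? decision node: keep the better alternative."""
    return max(expected_qale(p_hiv, True), expected_qale(p_hiv, False))


def value_without_test() -> float:
    """Expected QALE if the PCR test is not obtained."""
    return best_expected_qale(PRIOR_HIV)


def value_with_test() -> float:
    """Expected QALE if the PCR test is obtained before deciding on treatment."""
    p_pos = PRIOR_HIV * SENSITIVITY + (1.0 - PRIOR_HIV) * (1.0 - SPECIFICITY)
    p_neg = 1.0 - p_pos
    p_hiv_given_pos = PRIOR_HIV * SENSITIVITY / p_pos
    p_hiv_given_neg = PRIOR_HIV * (1.0 - SENSITIVITY) / p_neg
    return (p_pos * best_expected_qale(p_hiv_given_pos)
            + p_neg * best_expected_qale(p_hiv_given_neg))


print(f"no test: {value_without_test():.4f}")
print(f"test:    {value_with_test():.4f}")
```

Under these assumed inputs, the probability of a positive test result is 0.0968, which matches the branch probability shown in the decision tree, and obtaining the test yields a slightly higher expected QALE than deciding without it.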

Why use an influence diagram instead of a 
decision tree? Influence diagrams have both ad- 
vantages and limitations relative to decision 
trees. Influence diagrams represent graphically 
the probabilistic relationships among variables 
(Owens et al. 1997). Such representation is ad- 
vantageous for problems in which probabilistic 
conditioning is complex or in which communi- 
cation of such conditioning is important (such 
as may occur in large models). In an influence 
diagram, probabilistic conditioning is indicated 
by the arcs, and thus the conditioning is appar- 
ent immediately by inspection. In a decision 
tree, probabilistic conditioning is revealed by the 
probabilities in the branches of the tree. To de- 
termine whether events are conditionally inde- 
pendent in a decision tree requires that the 
analyst compare probabilities of events between 
branches of the tree. Influence diagrams also are 
particularly useful for discussion with content 
experts who can help to structure a problem but 
who are not familiar with decision analysis. In 
contrast, problems that have decision alterna- 
tives that are structurally different may be easier 
for people to understand when represented with 
a decision tree, because the tree shows the struc- 
tural differences explicitly, whereas the influence 
diagram does not. The choice of whether to use 
a decision tree or an influence diagram depends 
on the problem being analyzed, the experience 
of the analyst, the availability of software, and 
the purpose of the analysis. For selected prob- 
lems, influence diagrams provide a powerful 
graphical alternative to decision trees. 


3.8 Other Modeling Approaches 


We have described decision trees, Markov mod- 
els and influence diagrams. An analyst also can 
choose several other approaches to modeling. 
The choice of modeling approach depends on 


the problem and the objectives of the analysis. 
Although how to choose and design such models
is beyond our scope, we note other types of
models that analysts use commonly for medical
decision making. Microsimulation models are 
individual-level health state transition models, 
similar to Markov models, that provide a means 
to model very complex events flexibly over 
time. They are useful when the clinical history 
of a problem is complex, such as might occur 
with cancer, heart disease, and other chronic 
diseases. They are also useful for modeling indi- 
vidual heterogeneity which may depend on 
combinations of individual characteristics (e.g. 
heterogeneity of response to treatment based 
on medical conditions or genetics). Dynamic 
transmission models are particularly well-suited 
for assessing the outcomes of infectious diseas- 
es. These models divide a population into com- 
partments (for example, uninfected, infected, 
recovered, dead), and transitions between com- 
partments are governed by differential or dif- 
ference equations. The rate of transition 
between compartments depends in part on the 
number of individuals in the compartment, an 
important feature for infectious diseases in 
which the transmission may depend on the 
number of infected or susceptible individuals. 
Discrete event simulation models also are often 
used to model interactions between people. 
These models are composed of entities (a pa- 
tient) that have attributes (clinical history), and 
that experience events (a heart attack). An en- 
tity can interact with other entities and use re- 
sources. Discrete event simulation models are 
also used when considering scarce resources 
such as queues for a diagnostic test or an oper- 
ating room slot. For more information on these 
types of models, we suggest a recent series of 
papers on best modeling practices; the paper by 
Caro and colleagues noted in the suggested 
readings at the end of the chapter is an over- 
view of this series of papers. 
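As a small illustration of a dynamic transmission model governed by difference equations, the following Python sketch steps a three-compartment model (susceptible, infected, recovered) forward in discrete time. The compartments are a simplification of those listed above, and the function name, the transmission rate `beta`, and the recovery rate `gamma` are hypothetical values chosen only for illustration.

```python
def simulate_sir(s0: float, i0: float, r0: float,
                 beta: float, gamma: float, steps: int):
    """Discrete-time SIR model.

    Each step, new infections depend on both the number of susceptible and
    the number of infected individuals; recoveries depend on the number
    infected. Returns a list of (S, I, R) tuples, one per time step.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r
    history = [(s, i, r)]
    for _ in range(steps):
        # Transition rates depend on compartment sizes.
        new_infections = min(beta * s * i / n, s)
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical population of 1000 with 10 initial infections.
trajectory = simulate_sir(990, 10, 0, beta=0.3, gamma=0.1, steps=100)
peak_infected = max(i for _, i, _ in trajectory)
print(f"peak number infected: {peak_infected:.1f}")
print(f"final state (S, I, R): {tuple(round(x, 1) for x in trajectory[-1])}")
```

Because the number of new infections is proportional to the product of the susceptible and infected counts, the epidemic accelerates while both are large and slows as susceptible individuals are depleted, which is the feature that distinguishes these models from fixed-probability state transition models.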


3.9 The Role of Probability and 
Decision Analysis in Medicine 


You may be wondering how probability and de- 
cision analysis might be integrated smoothly 
into medical practice. An understanding of
probability and measures of test performance
will prevent any number of misadventures. In
Example 3.1, we discussed a hypothetical test
that, on casual inspection, appeared to be an ac- 
curate way to screen blood donors for previous 
exposure to the AIDS virus. Our quantitative 
analysis, however, revealed that the hypothetical 
test results were misleading more often than 
they were helpful because of the low prevalence 
of HIV in the clinically relevant population. 
Fortunately, in actual practice, much more ac- 
curate tests are used to screen for HIV. 

The need for knowledgeable interpretation 
of test results is widespread. The federal gov- 
ernment screens civil employees in “sensitive” 
positions for drug use, as do many companies. 
If the drug test used by an employer had a 
sensitivity and specificity of 0.95, and if 10% 
of the employees used drugs, one-third of the 
positive tests would be FPs. An understanding 
of these issues should be of great interest to 
the public, and health professionals should be 
prepared to answer the questions of their pa- 
tients. 
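The one-third figure in the drug-testing example can be checked with a short calculation. The sketch below uses the values stated in the text (sensitivity and specificity of 0.95, prevalence of 10%); the function name is hypothetical.

```python
def false_positive_fraction(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Fraction of all positive results that are false positives (1 - PV+)."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return false_positives / (true_positives + false_positives)


fraction = false_positive_fraction(sensitivity=0.95, specificity=0.95, prevalence=0.10)
print(f"fraction of positive tests that are false positives: {fraction:.3f}")
```

With these inputs, true positives account for 0.095 of all employees tested and false positives for 0.045, so about 0.32 of positive results are false positives.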

Although we should try to interpret every 
kind of test result accurately, decision analysis 
has a more selective role in medicine. Not all 
clinical decisions require decision analysis. 
Some decisions depend on physiologic princi- 
ples or on deductive reasoning. Other deci- 
sions involve little uncertainty. Nonetheless, 
many decisions must be based on imperfect 
data, and they will have outcomes that cannot 
be known with certainty at the time that the 
decision is made. Decision analysis provides a 
technique for managing these situations. 

For many problems, simply drawing a tree 
that denotes the possible outcomes explicitly 
will clarify the question sufficiently to allow 
you to make a decision. When time is limited, 
even a “quick and dirty” analysis may be help- 
ful. By using expert clinicians’ subjective 
probability estimates and asking what the pa- 
tient’s utilities might be, you can perform an 
analysis quickly and learn which probabilities 
and utilities are the important determinants 
of the decision. 

Health care professionals sometimes ex- 
press reservations about decision analysis be- 
cause the analysis may depend on probabilities 
that must be estimated, such as the pretest
probability. A thoughtful decision maker will
be concerned that the estimate may be in er- 
ror, particularly because the information 
needed to make the estimate often is difficult 
to obtain from the medical literature. We ar- 
gue, however, that uncertainty in the clinical 
data is a problem for any decision-making 
method and that the effect of this uncertainty 
is explicit with decision analysis. The method 
for evaluating uncertainty is sensitivity analy- 
sis: we can examine any variable to see wheth- 
er its value is critical to the final recommended 
decision. Thus, we can determine, for exam- 
ple, whether a change in pretest probability 
from 0.6 to 0.8 makes a difference in the final 
decision. In so doing, we often discover that it 
is necessary to estimate only a range of prob- 
abilities for a particular variable rather than a 
precise value. Thus, with a sensitivity analysis, 
we can decide whether uncertainty about a 
particular variable should concern us. 
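A one-way sensitivity analysis of this kind can be sketched as follows. The example varies the probability of disease over a range and checks whether the preferred action changes. The outcome values are borrowed from the value table in Fig. 3.16 purely for illustration, and the function names are hypothetical.

```python
# Illustrative outcome values (QALE), keyed by (diseased, treated).
QALE = {
    (True, True): 10.50,
    (True, False): 10.00,
    (False, True): 75.46,
    (False, False): 75.50,
}


def expected_qale(p_disease: float, treat: bool) -> float:
    """Expected value of one action at a given probability of disease."""
    return p_disease * QALE[(True, treat)] + (1.0 - p_disease) * QALE[(False, treat)]


def preferred_action(p_disease: float) -> str:
    """Action with the higher expected value."""
    if expected_qale(p_disease, True) > expected_qale(p_disease, False):
        return "treat"
    return "no treat"


def one_way_sensitivity(low: float, high: float, steps: int = 20):
    """Preferred action at evenly spaced probabilities between low and high."""
    results = []
    for k in range(steps + 1):
        p = low + (high - low) * k / steps
        results.append((p, preferred_action(p)))
    return results


def decision_is_stable(low: float, high: float, steps: int = 20) -> bool:
    """True if the same action is preferred across the whole range."""
    return len({action for _, action in one_way_sensitivity(low, high, steps)}) == 1


print("stable over 0.60-0.80:", decision_is_stable(0.60, 0.80))
print("stable over 0.05-0.10:", decision_is_stable(0.05, 0.10))
```

With these illustrative values, the preferred action is the same everywhere between 0.6 and 0.8, so a rough estimate within that range would suffice. Between 0.05 and 0.10 the preferred action switches, so a more precise estimate would matter there.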

The growing complexity of medical deci- 
sions, coupled with the need to control costs, 
has led to major programs to develop clinical 
practice guidelines. Decision models have 
many advantages as aids to guideline develop- 
ment (Eddy 1992; Habbema et al. 2014; Owens 
et al. 2016): they make explicit the alternative 
interventions, associated uncertainties, and 
utilities of potential outcomes. Decision mod- 
els can help guideline developers to structure 
guideline-development problems (Owens and 
Nease 1993), to incorporate patients’ prefer- 
ences (Nease and Owens 1994; Owens 1998), 
and to tailor guidelines for specific clinical 
populations (Owens and Nease 1997). The 
U.S. Preventive Services Task Force, which de- 
velops national prevention guidelines, has 
used decision models in the development of 
guidelines on breast, lung, cervical, and 
colorectal cancer screening. In addition, Web- 
based interfaces for decision models can pro- 
vide distributed decision support for guideline 
developers and users by making the decision 
model available for analysis to anyone who 
has access to the Web (Sanders et al. 1999). 

We have not emphasized computers in 
this chapter, although they can simplify many 
aspects of decision analysis (see Chap. 24).
MEDLINE and other bibliographic retrieval
systems (see Chap. 23) make it easier to obtain
published estimates of disease prevalence
and test performance. Computer
programs for performing statistical analyses 
can be used on data collected by hospital in- 
formation systems. Decision analysis soft- 
ware, available for personal computers, can 
help clinicians to structure decision trees, to 
calculate expected values, and to perform 
sensitivity analyses. Researchers continue to 
explore methods for computer-based auto- 
mated development of practice guidelines 
from decision models and use of computer- 
based systems to implement guidelines 
(Musen et al. 1996). With the growing matu- 
rity of this field, there are now companies 
that offer formal analytical tools to assist 
with clinical outcome assessment and inter- 
pretation of population datasets. 

Medical decision making often involves 
uncertainty for the clinician and risk for the 
patient. Most health care professionals would 
welcome tools that help them make decisions 
when they are confronted with complex clini- 
cal problems with uncertain outcomes. There 
are important medical problems for which de- 
cision analysis offers such aid. 


3.10 Appendix A: Derivation of 
Bayes’ Theorem 


Bayes’ theorem is derived as follows. We denote
the conditional probability of disease, D,
given a test result, R, p[D|R]. The prior (pretest)
probability of D is p[D]. The definition
of conditional probability is:

$$p[D \mid R] = \frac{p[R, D]}{p[R]} \tag{3.1}$$

The probability of a test result (p[R]) is the
sum of its probability in diseased patients and
its probability in nondiseased patients:

$$p[R] = p[R, D] + p[R, -D].$$

Substituting into Eq. 3.1, we obtain:

$$p[D \mid R] = \frac{p[R, D]}{p[R, D] + p[R, -D]} \tag{3.2}$$

Again, from the definition of conditional
probability,

$$p[R \mid D] = \frac{p[R, D]}{p[D]} \quad \text{and} \quad p[R \mid -D] = \frac{p[R, -D]}{p[-D]}.$$

These expressions can be rearranged:

$$p[R, D] = p[D] \times p[R \mid D], \tag{3.3}$$

$$p[R, -D] = p[-D] \times p[R \mid -D]. \tag{3.4}$$

Substituting Eqs. 3.3 and 3.4 into Eq. 3.2, we
obtain Bayes’ theorem:

$$p[D \mid R] = \frac{p[D] \times p[R \mid D]}{p[D] \times p[R \mid D] + p[-D] \times p[R \mid -D]}$$


Questions for Discussion

1. Calculate the following probabilities
   for a patient about to undergo CABG
   surgery (see Example 3.2):

   (a) The only possible, mutually exclusive
       outcomes of surgery are death,
       relief of symptoms (angina and dyspnea),
       and continuation of symptoms.
       The probability of death is
       0.02, and the probability of relief of
       symptoms is 0.80. What is the probability
       that the patient will continue
       to have symptoms?

   (b) Two known complications of heart
       surgery are stroke and heart attack,
       with probabilities of 0.02 and 0.05,
       respectively. The patient asks what
       chance he or she has of having both
       complications. Assume that the complications
       are conditionally independent,
       and calculate your answer.

   (c) The patient wants to know the
       probability that he or she will
       have a stroke given that he or she
       has a heart attack as a complication
       of the surgery. Assume that 1
       in 500 patients has both complications,
       that the probability of
       heart attack is 0.05, and that the
       events are independent. Calculate
       your answer.


2. The results of a hypothetical study to
   measure test performance of a diagnostic
   test for HIV are shown in the 2
   x 2 table in Table 3.9.

   (a) Calculate the sensitivity, specificity,
       disease prevalence, PV+, and
       PV-.

   (b) Use the TPR and TNR calculated
       in part (a) to fill in the 2 x 2 table
       in Table 3.10. Calculate the disease
       prevalence, PV+, and PV-.


Table 3.9 A 2 x 2 contingency table for the
hypothetical study in problem 2

PCR test result | Gold standard test positive | Gold standard test negative | Total
Positive PCR    | 48                          | 8                           | 56
Negative PCR    | 2                           | 47                          | 49
Total           | 50                          | 55                          | 105

PCR polymerase chain reaction




D. K. Owens et al. 


Table 3.10 A 2 x 2 contingency table to
complete for problem 2b

PCR test result | Gold standard test positive | Gold standard test negative | Total
Positive PCR    | x                           | x                           | x
Negative PCR    | 100                         | 99,900                      | x
Total           | x                           | x                           | x

PCR polymerase chain reaction
x quantities that the question asks students to calculate


3. You are asked to interpret the results
from a diagnostic test for HIV in an as- 
ymptomatic patient whose test was 
positive when the patient volunteered 
to donate blood. After taking the pa- 
tient’s history, you learn that the pa- 
tient has a history of intravenous-drug 
use. You know that the overall preva- 
lence of HIV infection in your commu- 
nity is 1 in 500 and that the prevalence 

in people who have injected drugs is 20 

times as high as in the community at 

large. 

(a) Estimate the pretest probability 
that this patient is infected with 
HIV. 

(b) The patient tells you that two 
people with whom the patient 
shared needles subsequently died 
of AIDS. Which heuristic will be 
useful in making a subjective ad- 
justment to the pretest probabili- 
ty in part (a)? 

(c) Use the sensitivity and specificity 
that you worked out in 2(a) to cal- 
culate the post-test probability of 
the patient having HIV after a pos- 
itive and negative test. Assume that 
the pretest probability is 0.10. 

(d) If you wanted to increase the post- 
test probability of disease given a 


positive test result, would you 
change the TPR or TNR of the test? 


4. You have a patient with cancer who has
a choice between surgery or chemother- 
apy. If the patient chooses surgery, he or 
she has a 2% chance of dying from the 
operation (life expectancy = 0), a 50% 
chance of being cured (life expectancy = 
15 years), and a 48% chance of not be- 
ing cured (life expectancy = 1 year). If 
the patient chooses chemotherapy, he or 
she has a 5% chance of death (life expec- 
tancy = 0), a 65% chance of cure (life 
expectancy = 15 years), and a 30% 
chance that the cancer will be slowed but 
not cured (life expectancy = 2 years). 
Create a decision tree. Calculate the ex- 
pected value of each option in terms of 
life expectancy. 

5. You are concerned that a patient with a
sore throat has a bacterial infection that 
would require antibiotic therapy (as op- 
posed to a viral infection, for which no 
treatment is available). Your treatment 
threshold is 0.4, and based on the ex- 
amination you estimate the probability 
of bacterial infection as 0.8. A test is 
available (TPR = 0.75, TNR = 0.85) that 
indicates the presence or absence of bac- 
terial infection. Should you perform the 
test? Explain your reasoning. How 
would your analysis change if the test 
were extremely costly or involved a sig- 
nificant risk to the patient? 

6. What are the three kinds of bias that can
influence measurement of test perfor- 
mance? Explain what each one is, and 
state how you would adjust the post-test 
probability to compensate for each.

7. How could a computer system ease the
task of performing a complex decision 
analysis? 

8. When you search the medical literature
to find probabilities for patients similar 
to one you are treating, what is the 
most important question to consider? 
How should you adjust probabilities in 
light of the answer to this question? 


Biomedical Decision Making: Probabilistic Clinical Reasoning 


9. Why do you think clinicians some- 
times order tests even if the results will 
not affect their management of the pa- 
tient? Do you think the reasons that 
you identify are valid? Are they valid 
in only certain situations? Explain 
your answers. See the January 1998 is- 
sue of Medical Decision Making for 
articles that discuss this question. 

10. Explain the differences in three ap- 
proaches to assessing patients’ prefer- 
ences for health states: the standard 
gamble, the time trade-off, and the 
visual analog scale. 


Disclaimer The views presented are solely the 
responsibility of the authors and do not necessarily 
represent the views of the Patient-Centered Outcomes 
Research Institute (PCORI), its Board of Governors, 
or its Methodology Committee.


Learning Objectives

After reading this chapter, you should know
the answers to these questions:

- How can cognitive science theory
  meaningfully inform and shape the
  design, development, and assessment of
  healthcare information systems?
- How is cognitive science different from
  behavioral science?
- What are some of the ways in which we
  can characterize the structure of
  knowledge?
- What are some of the dimensions of
  difference between experts and novices?
- Why is it important to consider
  cognition and human factors in dealing
  with issues of patient safety?
- How does distributed cognition differ
  from other theories of human
  cognition?


4.1 Introduction


Enormous advances in health information 
technologies and more generally, in com- 
puting over the past several decades have 
begun to permeate diverse facets of clini- 
cal practice. The rapid pace of technological 
developments such as the Internet, wireless 
technologies, and mobile devices, in the last 
decade, affords significant opportunities for 
supporting, enhancing and extending user 
experiences, interactions and communica- 
tions (Rogers 2004). These advances, coupled 
with a growing computer literacy among 
healthcare professionals, afford the poten- 
tial for great improvements in healthcare. 
Yet many observers note that the healthcare 
system is slow to understand information 
technology and effectively incorporate it into 
the work environment (Shortliffe and Blois 
2001; Karsh et al. 2010; Harrington 2015). 
Innovative technologies often produce pro- 
found cultural, social, and cognitive changes. 
These transformations necessitate adaptation 
at many different levels of aggregation from 
the individual to the larger institution, often 
causing disruptions of workflow and user dis- 
satisfaction (Bloomrosen et al. 2011). 


Similar to other complex domains, bio- 
medical information systems embody ideals in 
design that often do not readily yield practical 
solutions in implementation. As computer- 
based systems infiltrate clinical practice and 
settings, the consequences often can be felt 
through all levels of the organization. This 
impact can have deleterious effects resulting in 
systemic inefficiencies and suboptimal prac- 
tice, which can lead to frustrated healthcare 
practitioners, unnecessary delays in health- 
care delivery, and even adverse events (Lin 
et al. 1998; Weinger and Slagle 2001). How 
can we manage change? How can we intro- 
duce systems that are designed to be more 
intuitive and also implemented efficiently to 
be confluent with everyday practice without 
compromising safety? 


4.1.1 Introducing Cognitive Science 


Cognitive science is a multidisciplinary 
domain of inquiry devoted to the study of 
cognition and its role in intelligent agency. 
The primary disciplines include cognitive psy- 
chology, artificial intelligence, neuroscience, 
linguistics, anthropology, and philosophy. 
From the perspective of informatics, cogni- 
tive science can provide a framework for the 
analysis and modeling of complex human 
performance in technology-mediated settings. 
Cognitive science incorporates basic science 
research focusing on fundamental aspects of 
cognition (e.g., attention, memory, reason- 
ing, early language acquisition) as well as 
applied research. Applied cognitive research 
is focally concerned with the development 
and evaluation of useful and usable cogni- 
tive artifacts. Cognitive artifacts are human- 
made materials, devices, and systems that 
extend people’s abilities in perceiving objects, 
encoding and retrieving information from 
memory, and problem-solving (Gillan and 
Schvaneveldt 1999). In this regard, applied 
cognitive research is closely aligned with the 
disciplines of human-computer interaction 
(HCI) and human factors. In everyday life, we 
interact with cognitive artifacts to receive and 
manipulate information to alter our thinking
processes and offload effort-intensive cognitive
activity to the external world, thereby
reducing mental workload. 

The past three decades have produced a 
cumulative body of experiential and practical
knowledge about system design and
implementation that can guide future initiatives.
This practical knowledge embodies the need 
for sensible and intuitive user interfaces, an 
understanding of workflow, and the ways in 
which systems impact individual and team 
performance. However, experiential knowl- 
edge in the form of anecdotes and case studies 
is inadequate for producing robust generaliza- 
tions or sound design and implementation 
principles. There is a need for a theoretical 
foundation. Biomedical informatics is more 
than the thin intersection of biomedicine and 
computing (Patel and Kaufman 1998). There 
is a growing role for the social sciences, includ- 
ing the cognitive and behavioral sciences, in 
biomedical informatics, particularly as they 
pertain to human-computer interaction and 
other areas such as information retrieval and 
decision support (Patel et al. 2017). In this 
chapter, we focus on the foundational role of 
cognitive science in biomedical informatics 
research and practice. Theories and methods 
from the cognitive sciences can illuminate dif- 
ferent facets of design and implementation of 
information and knowledge-based systems. 
They can also play a larger role in character- 
izing and enhancing human performance on 
a wide range of tasks involving clinicians, 
patients, and healthy consumers of biomedi- 
cal information. These tasks may include 
developing training programs and devising 
measures to reduce errors or increase effi- 
ciency. In this respect, cognitive science repre- 
sents one of the basic component sciences of 
biomedical informatics (Shortliffe and Blois 
2001; Patel and Kaufman 1998). 


4.1.2 Cognitive Science 
and Biomedical Informatics 


How can cognitive science theory meaning- 
fully inform and shape design, development, 
and assessment of health-care information
systems? Cognitive science provides insight 
into principles of system usability and learn- 
ability, the mediating role of technology in 
clinical performance, the process of medical 
judgment and decision-making, the train- 
ing of healthcare professionals, patients, and 
health consumers, and the design of a safer 
workplace. The central argument is that it can 
inform our understanding of human perfor- 
mance in technology-rich healthcare environ- 
ments (Carayon 2012; Patel et al. 2013b). 

Precisely how will cognitive science theory 
and methods make a significant contribution 
towards these important objectives? The trans- 
lation of research findings from one discipline 
into practical concerns that can be applied to 
another is rarely a straight-forward process 
(Rogers 2004). Furthermore, even when sci- 
entific knowledge is highly relevant in prin- 
ciple, making that knowledge actionable in a 
design context can be a significant challenge. 
In this chapter, we discuss (a) basic cognitive 
science research and theories that provide a 
foundation for understanding the underly- 
ing mechanisms guiding human performance 
(e.g., findings pertaining to the structure of 
human memory), and (b) research in the areas 
of medical errors and patient safety as they 
interact with health information technology.

As illustrated in Table 4.1, there are
correspondences between basic cognitive sci- 
ence research, medical cognition and cogni- 
tive research in biomedical informatics along 
several dimensions. For example, theories of 
human memory and knowledge organization 
lend themselves to characterizations of expert 
clinical knowledge that can then be contrasted 
with the representations of such knowledge 
in clinical systems. Similarly, research in text 
comprehension has provided a theoretical 
framework for research in understanding 
biomedical texts. Additionally, theories of 
problem solving can be used to understand 
the processes and knowledge associated with 
diagnostic and therapeutic reasoning. This 
understanding provides a basis for develop- 
ing medical artificial intelligence and decision 
support systems. 

Table 4.1 Correspondences between cognitive science, medical cognition and applied cognitive research
in medical informatics

Cognitive Science | Medical Cognition | Biomedical Informatics
Knowledge organization and human memory | Organization of clinical and basic science knowledge | Development and use of medical knowledge bases
Problem solving, Heuristics/reasoning strategies | Medical problem solving and decision making | Medical artificial intelligence/decision support systems/medical errors
Perception/attention | Radiologic and dermatologic diagnosis | Medical imaging systems
Text comprehension | Understanding medical texts; Knowledge representation | Information retrieval/digital libraries/health literacy
Conversational analysis | Medical discourse | Medical natural language processing
Distributed cognition | Collaborative practice and research in health care | Computer-based provider order entry
Coordination of theory and evidence | ... reasoning |
Diagrammatic reasoning | ... data displays |

Cognitive research, theories, and methods
can contribute to applications in informatics
in a number of ways including: (1) seed basic
research findings that can illuminate dimen- 
sions of design (e.g., attention and memory, 
aspects of the visual system), (2) provide an 
explanatory vocabulary for characterizing how 
individuals process and communicate health 
information (e.g., various studies of medical 
cognition pertaining to doctor-patient inter- 
action), (3) present an analytic framework for 
identifying problems and modeling certain 
kinds of user interactions, (4) characterize the 
relationship between health information tech- 
nology, human factors and patient safety, (5) 
provide rich descriptive accounts of clinicians 
employing technologies in the context of 
work, and (6) furnish a generative approach 
for novel designs and productive applied 
research programs in informatics (e.g., inter- 
vention strategies for supporting low literacy 
populations in health information seeking). 
Based on a review of articles published 
in the Journal of Biomedical Informatics 
between January 2001 and March 2014, Patel 
and Kannampallil (2015) identified 57 articles 
that focused on topics related to cognitive 
informatics. The topics ranged from char- 
acterizing the limits of clinician problem- 
solving and reasoning behavior, to describing 


Diagnostic and therapeutic 


Perceptual processing of patient 


systems 


Evidence-based clinical guidelines 


Biomedical information visualization 


coordination and communication patterns of 
distributed clinical teams, to developing sus- 
tainable and cognitively plausible interven- 
tions for supporting clinician activities. 

The social sciences are constituted by mul- 
tiple frameworks and approaches. Behaviorism 
constitutes a framework for analyzing and 
modifying behavior. It is an approach that 
has had an enormous influence on the social 
sciences. Cognitive science partially emerged 
as a response to the limitations of behavior- 
ism. The next section of the chapter contains 
a brief history of the cognitive and behavioral 
sciences that emphasizes the points of dif- 
ference between the two approaches. It also 
serves to introduce basic concepts in the study 
of cognition. 


4.2 Cognitive Science: 
The Emergence of an 
Explanatory Framework 


In this section, we sketch a brief history of the
emergence of cognitive science with a view to
differentiating it from competing theoretical frameworks
in the social sciences. The section also
serves to introduce core concepts that constitute
an explanatory framework for cognitive
science.

Behaviorism is the conceptual framework 
underlying a particular science of behavior 
(Zuriff 1985). It is not to be confused with 
the term behavioral science which names a 
large body of work across disciplines, but not 
a specific theoretical framework. Behaviorism 
dominated experimental and applied psychol- 
ogy as well as the social sciences for the better 
part of the twentieth century (Bechtel et al. 
1998). Behaviorism represented an attempt 
to develop an objective, empirically-based 
science of behavior and more specifically, 
learning. Empiricism is the view that experi- 
ence is the only source of knowledge (Hilgard 
and Bower 1975). Behaviorism endeavored to 
build a comprehensive framework of scien- 
tific inquiry around the experimental analysis 
of observable behavior. Behaviorists eschewed 
the study of thinking as an unacceptable psy- 
chological method because it was inherently 
subjective, error-prone, and could not be sub- 
jected to empirical validation. Similarly, hypo- 
thetical constructs (e.g., mental processes as 
mechanisms in a theory) were discouraged. 
All constructs had to be specified in terms 
of operational definitions, so they could be 
manipulated, measured, and quantified for 
empirical investigation (Weinger and Slagle 
2001). 

For reasons that go beyond the scope of 
this chapter, classical behavioral theories have 
been largely discredited as a comprehensive 
unifying theory of behavior. However, behav- 
iorism continues to provide a theoretical and 
methodological foundation in a wide range 
of social science disciplines. For example, 
behaviorist tenets continue to play a central 
role in public health research. In particular, 
health behavior research emphasizes anteced- 
ent variables and environmental contingencies 
that serve to sustain unhealthy behaviors such 
as smoking (Sussman 2001). Around 1950, 
there was increasing dissatisfaction with the 
limitations and methodological constraints 
(e.g., the disavowal of the unobserved such 
as mental states) of behaviorism. In addition, 
developments in logic, information theory, 
cybernetics, and perhaps most importantly,
the advent of the digital computer, aroused
substantial interest in “information process- 
ing” (Gardner 1985). 

Cognitive scientists placed “thought” 
and “mental processes” at the center of their 
explanatory framework. The “computer met- 
aphor” provided a framework for the study 
of human cognition as the manipulation of 
“symbolic structures.” It also provided the 
foundation for a model of memory, which 
was a prerequisite for an information process- 
ing theory (Atkinson and Shiffrin 1968). The 
implementation of models of human perfor- 
mance as computer programs provided a mea- 
sure of objectivity and a sufficiency test of a 
theory and also served to increase the objec- 
tivity of the study of mental processes (Estes 
1975). 

Arguably, the landmark publication in the 
nascent field of cognitive science is Newell and 
Simon’s “Human Problem Solving” (Newell 
and Simon 1972). This was the culmination 
of over 15 years of work on problem solv- 
ing and research in artificial intelligence. It 
was a mature thesis that described a theoreti- 
cal framework, extended a language for the 
study of cognition, and introduced protocol- 
analytic methods that have become ubiquitous 
in the study of high-level cognition. It laid the 
foundation for the formal investigation of 
symbolic-information processing (more spe- 
cifically, problem solving). The development 
of models of human information processing 
also provided a foundation for the discipline 
of human-computer interaction and the first 
formal methods of analysis (Card et al. 1983). 

The early investigations of problem solving
focused primarily on
experimentally contrived or toy-world tasks
such as elementary deductive logic, the Tower
of Hanoi, illustrated in Fig. 4.1, and mathematical
word problems (Greeno and Simon
1988). These tasks required very little back- 
ground knowledge and were well structured, 
in the sense that all the variables necessary 
for solving the problem were present in the 
problem statement. These tasks allowed for a 
complete description of the task environment, 
a step-by-step description of the sequential 
behavior of the subjects’ performance, and 
the modeling of subjects’ cognitive and overt 
behavior in the form of a computer simulation.
The Tower of Hanoi, in particular, served
as an important test bed for the development 
of an explanatory vocabulary and framework 
for analyzing problem-solving behavior. 

The Tower of Hanoi (TOH) is a relatively
straightforward task that consists of three
pegs (A, B, and C) and three or more disks
that vary in size. The goal is to move all the
disks from peg A to peg C one at a time with
the constraint that a larger disk can never rest
on a smaller one. Problem solving can be construed
as search in a problem space. A problem
space has an initial state, a goal state, and
a set of operators. Operators are any moves
that transform a given state to a successor
state. For example, the first move could be to
move the small disk to peg B or peg C. In a
three-disk TOH, there are a total of 27 possible
states representing the complete problem
space. In general, TOH has $3^n$ states, where $n$ is the
number of disks. The minimum number of moves
necessary to solve a TOH is $2^n - 1$. Problem
solvers will typically maintain only a small set
of states at a time.
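
The problem-space vocabulary above (states, operators, initial state, goal state) can be made concrete with a small sketch. The encoding below is an assumption chosen for illustration, not a representation taken from the chapter: a state is a tuple recording which peg holds each disk, `successors` plays the role of the operators, and a breadth-first search finds the shortest path from the initial state to the goal state.

```python
from collections import deque
from itertools import product

PEGS = "ABC"

def successors(state):
    """Yield every state reachable by one legal move (the operators).

    Illustrative encoding: state[i] is the peg holding disk i,
    where disk 0 is the smallest disk.
    """
    for src in PEGS:
        disks_on_src = [d for d, peg in enumerate(state) if peg == src]
        if not disks_on_src:
            continue
        top = min(disks_on_src)  # only the smallest disk on a peg can move
        for dst in PEGS:
            if dst == src:
                continue
            disks_on_dst = [d for d, peg in enumerate(state) if peg == dst]
            # Legal only if the destination is empty or its top disk is larger.
            if not disks_on_dst or min(disks_on_dst) > top:
                yield state[:top] + (dst,) + state[top + 1:]

def shortest_solution_length(n):
    """Breadth-first search from 'all disks on A' to 'all disks on C'."""
    start, goal = ("A",) * n, ("C",) * n
    frontier, seen = deque([(start, 0)]), {start}
    while frontier:
        state, depth = frontier.popleft()
        if state == goal:
            return depth
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return None

n = 3
all_states = list(product(PEGS, repeat=n))
print(len(all_states))              # 27, i.e. 3**n
print(shortest_solution_length(n))  # 7, i.e. 2**n - 1
```

An exhaustive search of this kind holds the whole problem space in memory, which is exactly what human problem solvers do not do; the sketch describes the task environment rather than human performance.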

The search process involves finding a solu- 
tion strategy that will minimize the number 
of steps. The metaphor of movement through 
a problem space provides a means for under- 
standing how an individual can sequentially 
address the challenges they confront at each 
stage of a problem and the actions that ensue. 
We can characterize the problem-solving 
behavior of the subject at a local level in terms 
of state transitions or at a more global level 
in terms of strategies. For example, means- 
ends analysis is a commonly used strategy 
for reducing the difference between the start 
state and goal state. For instance, moving all
but the largest disk from peg A to peg B is
an interim goal associated with such a strategy.
Although TOH bears little resemblance
to the tasks performed by either clinicians or
patients, the example illustrates the process
of analyzing task demands and task performance
in human subjects. The TOH helped
lay the groundwork for cognitive task analyses
that are performed today.

Fig. 4.1 Tower of Hanoi task illustrating a start state and a goal state
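
The interim goal mentioned above (move all but the largest disk to peg B) suggests a recursive subgoal decomposition. The sketch below is the standard recursive solution, offered as one way to express that strategy in code; it is not a model of how human subjects actually carry out means-ends analysis, and the function name and the convention that disk 1 is the smallest are assumptions for illustration.

```python
def solve(n, src="A", dst="C", spare="B", moves=None):
    """Return the list of moves (disk, from_peg, to_peg) for n disks.

    Disks are numbered 1 (smallest) to n (largest).
    """
    if moves is None:
        moves = []
    if n == 0:
        return moves
    solve(n - 1, src, spare, dst, moves)  # interim goal: uncover the largest disk
    moves.append((n, src, dst))           # move the largest disk of this subproblem
    solve(n - 1, spare, dst, src, moves)  # restack the smaller disks on top of it
    return moves

moves = solve(3)
print(len(moves))  # 7
print(moves[0])    # (1, 'A', 'C')
```

Each recursive call corresponds to a subgoal at the global (strategy) level, while each appended tuple is a single state transition at the local level.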

Protocol analysis¹ is among the most
commonly used methods (Newell and Simon 
1972). Protocol analysis refers to a class of 
techniques for representing verbal think-aloud 
protocols (Greeno and Simon 1988). Think- 
aloud protocols are the most common source 
of data used in studies of problem solving. 
In these studies, subjects are instructed to 
verbalize their thoughts as they perform an 
experimental task. Ericsson and Simon (1993) 
specify the conditions under which verbal 
reports are acceptable as legitimate data. For 
example, retrospective think-aloud protocols 
are viewed as somewhat suspect because the 
subject has had the opportunity to recon- 
struct the information in memory, and the 
verbal reports are inevitably distorted. Think- 
aloud protocols recorded in concert with 
observable behavioral data such as a subject’s 
actions provide a rich source of evidence to 
characterize cognitive processes. 

Cognitive psychologists and linguists have 
investigated the processes and properties of 
language and memory in adults and children 
for many decades. Early research focused on 
basic laboratory studies of list learning or 
processing of words and sentences (as in a 
sentence completion task) (Anderson 1985). 


1 The term protocol refers to that which is produced 
by a subject during testing (e.g., a verbal record). It 
differs from the more common use of protocol as 
defining a code or set of procedures governing 
behavior or a situation. 
van Dijk and Kintsch (1983) developed an
influential method of analyzing the process of 
text comprehension based on the realization 
that text can be described at multiple levels 
from surface codes (e.g., words and syntax) to 
a deeper level of semantics. Comprehension 
refers to cognitive processes associated with 
understanding or deriving meaning from 
text, conversation, or other informational 
resources. It involves the processes that people 
use when trying to make sense of a piece of 
text, such as a sentence, a book, or a verbal 
utterance. It also involves the final product of 
such processes, which is, the mental represen- 
tation of the text, essentially what people have 
understood. 

Comprehension may often precede prob- 
lem solving and decision making but is also 
dependent on perceptual processes that focus 
attention, the availability of relevant knowl- 
edge, and the ability to deploy knowledge in 
a given context. Some of the more impor- 
tant differences in medical problem solving 
and decision making arise from differences in 
knowledge and comprehension. Furthermore, 
many of the problems associated with deci- 
sion making are the result of either a lack of 
knowledge or failure to understand the infor- 
mation appropriately. 

The early investigations provided a well- 
constrained artificial environment for the 
development of the basic methods and principles
of problem solving. They also provided
a rich explanatory vocabulary (e.g., prob- 
lem space), but were not fully adequate in 
accounting for cognition in knowledge-rich 
domains of greater complexity and involv- 
ing uncertainty. In the mid to late 1970s, there 
was a shift in research to complex “real-life” 
knowledge-based domains of inquiry (Greeno 
and Simon 1988). Problem-solving research
examined performance in domains such as
physics (Larkin et al. 1980), medical diagnosis
(Elstein et al. 1978) and architecture (Akin
1982). Similarly, the study of text comprehen- 
sion shifted from research on simple stories 
to technical and scientific texts in a range 
of domains, including medicine. This paral- 
leled a similar change in artificial intelligence 
research from “toy programs” to addressing 
“real-world” problems and the development 
of expert systems (Clancey and Shortliffe
1984). The shift to real-world problems in 
cognitive science was spearheaded by research 
exploring the nature of expertise. Most of 
the early investigations on expertise involved 
laboratory experiments. However, the shift 
to knowledge-intensive domains provided a 
theoretical and methodological foundation 
to conduct both basic and applied research 
in real-world settings such as the workplace 
(Vicente 1999) and the classroom (Bruer 
1993). These areas of application provided a 
fertile test bed for assessing and extending the 
cognitive science framework. 

In recent years, the conventional 
information-processing approach has come 
under criticism for its narrow focus on the 
rational/cognitive processes of the solitary 
individual. One of the most compelling pro- 
posals has to do with a shift from viewing cog- 
nition as a property of the solitary individual 
to viewing cognition as distributed across 
groups, cultures, and artifacts. This claim has 
significant implications for the study of col- 
laborative endeavors and human-computer 
interaction. We explore the concepts underly- 
ing distributed cognition in greater detail in a 
subsequent section. 


4.3 Human Information
Processing


It is well known that product design often fails
to consider cognitive and physiological constraints
adequately and imposes an unnecessary
burden on task performance (Sharp et al.
2019). Fortunately, advances in theory and
methods provide us with greater insight into 
designing systems for the human condition. 
Cognitive science serves as a basic science 
and provides a framework for the analysis and 
modeling of complex human performance. A 
computational theory of mind provides the 
fundamental underpinning for most contem- 
porary theories of cognitive science. The basic 
premise is that much of human cognition can 
be characterized as a series of operations or 
computations on mental representations. 
Mental representations are internal cognitive 
states that have a certain correspondence with
the external world. For example, they may
reflect a clinician’s hypothesis about a patient’s
condition after noticing an abnormal gait as
the patient entered the clinic. These are likely to elicit
further inferences about the patient’s underly- 
ing condition and may direct the physician’s 
information-gathering strategies and contrib- 
ute to an evolving problem representation. 

Two interdependent dimensions by which 
we can characterize cognitive systems are (1) 
architectural theories that endeavor to pro- 
vide a unified theory for all aspects of cogni- 
tion and (2) the different kinds of knowledge 
necessary to attain competency in a given 
domain. Individuals differ substantially 
in terms of their knowledge, experiences, 
and endowed capabilities. The architectural 
approach capitalizes on the fact that we can 
characterize certain regularities of the human 
information-processing system. These can 
be either structural regularities—such as the 
existence of and the relations between percep- 
tual, attentional, and memory systems and 
memory capacity limitations—or processing 
regularities, such as processing speed, selec- 
tive attention, or problem-solving strategies. 
Cognitive systems are characterized function- 
ally in terms of the capabilities they enable 
(e.g., focused attention on selective visual 
features), the way they constrain human 
cognitive performance (e.g., limitations on 
memory), and their development during the 
lifespan. In regard to the lifespan issue, there 
is a growing body of literature on cognitive 
aging and how aspects of the cognitive system 
such as attention, memory, vision and motor 
skills change as a function of aging (Fisk 
et al. 2009). This basic science research is of 
growing importance to informatics as we seek 
to develop e-health applications for seniors, 
many of whom suffer from chronic health 
conditions such as arthritis and diabetes. A 
graphical user interface or, more generally, a
website designed for younger adults may not 
be suitable for older adults. 

Differences in knowledge organization are 
a central focus of research into the nature of 
expertise. In medicine, the expert-novice paradigm
has contributed to our understanding
of the nature of medical expertise and skilled
clinical performance.


4.3.1 Cognitive Architectures 
and Human Memory Systems 


Fundamental research in perception, cogni- 
tion, and psychomotor skills over the last 
50 years has provided a foundation for design 
principles in human factors and human- 
computer interaction. Although cognitive 
guidelines have made significant inroads in 
the design community, there remains a signifi- 
cant gap in applying basic cognitive research 
(Gillan and Schvaneveldt 1999). Designers 
routinely violate basic assumptions about the 
human cognitive system. There are invari- 
ably challenges in applying basic research and 
theory to applications. More human-centered 
design and cognitive research can instrumen- 
tally contribute to such an endeavor (Zhang 
et al. 2004). 

Over the last 50 years, there have been 
several attempts to develop a unified theory 
of cognition. The goal of such a theory is 
to provide a single set of mechanisms for all 
cognitive behaviors from motor skills, lan- 
guage, memory, to decision making, prob- 
lem solving, and comprehension (Newell 
1990). Such a theory provides a means to put 
together a voluminous and seemingly dispa- 
rate body of human experimental data into a 
coherent form. Cognitive architecture repre- 
sents unifying theories of cognition that are 
embodied in large-scale computer simulation 
programs. Although there is much plasticity 
evidenced in human behavior, cognitive pro- 
cesses are bound by biological and physical 
constraints. Cognitive architectures specify 
functional rather than biological constraints 
on human behavior (e.g., limitations on 
working memory). These constraints reflect 
the information-processing capacities and 
limitations of the human cognitive system. 
Architectural systems embody a relatively 
fixed permanent structure that is (more or 
less) characteristic of all humans and doesn’t 
substantially vary over an individual’s life- 
time. It represents a scientific hypothesis about 
those aspects of human cognition that are
relatively constant over time and independent 
of the task (Carroll 2003). Cognitive architec- 
tures also play a role in providing blueprints 
for building future intelligent systems that 
embody a broad range of capabilities like 
those of humans (Duch et al. 2008). There 
are several large-scale cognitive architecture 
theories that embody computational models 
of cognition and have informed a substan- 
tial body of research in cognitive science and 
allied disciplines. ACT-R (short for “Adaptive 
Control of Thought-Rational”) is perhaps
the most widely known cognitive architec- 
ture. It was developed by John R. Anderson 
and is sustained by a large global community 
of researchers centered at Carnegie Mellon 
University (Anderson 2013). It is a theory for 
simulating and understanding human cogni- 
tion. It started more than 40 years ago as an 
architecture that could simulate basic tasks 
related to memory, language and problem 
solving. It has continued to evolve into a sys- 
tem that can perform an enormous range of 
human tasks (Ritter et al. 2019). 

Cognitive architectures include short- 
term and long-term memories that store con- 
tent about an individual’s beliefs, goals, and 
knowledge, the representation of elements 
that are contained in these memories as well 
as their organization into larger-scale struc- 
tures (Lieto et al. 2018). An extended discus- 
sion of architectural theories and systems is 
beyond the scope of this chapter. However, 
we employ the architectural frame of refer- 
ence to introduce some basic distinctions in 
memory systems. Human memory is typically 
divided into at least two structures: long-term 
memory and short-term/working memory. 
Working memory is an emergent property 
of interaction with the environment. Long- 
term memory (LTM) can be thought of as a 
repository of all knowledge, whereas working 
memory (WM) refers to the resources needed 
to maintain information active during cogni- 
tive activity (e.g., text comprehension). The 
information maintained in working memory 
includes stimuli from the environment (e.g., 
words on a display) and knowledge activated 
from long-term memory. In theory, LTM
is infinite, whereas WM is limited to five to
ten “chunks” of information. A chunk is
any stimulus or pattern of stimuli that has
become familiar from repeated exposure and
is subsequently stored in memory as a single
unit (Larkin et al. 1980). Problems impose a
variable cognitive load on working memory. 
This refers to an excess of information that 
competes for few cognitive resources, creat- 
ing a burden on working memory (Chandler 
and Sweller 1991). For example, maintaining 
a seven-digit phone number in WM is not very 
difficult. However, to maintain a phone num- 
ber while engaging in conversation is nearly 
impossible for most people. Multi-tasking 
is one factor that contributes to cognitive 
load. The structure of the task environment
(for example, a crowded computer display)
is another contributor. High velocity/high
workload clinical environments such as inten- 
sive care units also impose cognitive loads on 
clinicians carrying out the task. 


4.3.2 The Organization 
of Knowledge 


Architectural theories specify the structure 
and mechanisms of memory systems, whereas 
theories of knowledge organization focus on 
the content. There are several ways to char- 
acterize the kinds of knowledge that reside in 
LTM and that support decisions and actions. 
Cognitive psychology has furnished a range of 
domain-independent constructs that account 
for the variability of mental representations 
needed to engage the external world. 

A central tenet of cognitive science is that 
humans actively construct and interpret infor- 
mation from their environment. Given that 
environmental stimuli can take a multitude 
of forms (e.g., written text, speech, music, 
images, etc.), the cognitive system needs to 
be attuned to different representational types 
to capture the essence of these inputs. For 
example, we process written text differently 
than we do mathematical equations. The 
power of cognition is reflected in the ability to
form abstractions - to represent perceptions,
experiences, and thoughts in some medium
other than that in which they have occurred
without extraneous or irrelevant information
(Norman 1993). Representations enable us to
remember, reconstruct, and transform events,
objects, images, and conversations absent in
space and time from our initial experience of
the phenomena. Representations reflect states
of knowledge.

1. 43-year-old white female who developed diarrhea after a brief period of 2 days of GI upset
   1.1 female    ATT: Age (old); DEG: 43 year; ATT: white
   1.2 develop   PAT: [she]; THM: diarrhea; TNS: past
   1.3 period    ATT: brief; DUR: 2 days; THM: 1.4
   1.4 upset     LOC: GI
   1.5 TEM:ORD   [1.3], [1.2]

Fig. 4.2 Propositional analysis of a think-aloud protocol of a primary care physician

Propositions are a form of natural lan- 
guage representation that captures the essence 
of an idea (i.e., semantics) or concept with- 
out explicit reference to linguistic content. 
For example, “hello”, “hey”, and “what’s 
happening” can typically be interpreted 
as a greeting containing identical proposi- 
tional content even though the literal seman- 
tics of the phrases may differ. These ideas 
are expressed as language and translated 
into speech or text when we talk or write. 
Similarly, we recover the propositional struc- 
ture when we read or listen to verbal informa- 
tion. Numerous psychological experiments 
have demonstrated that people recover the 
gist of a text or spoken communication (i.e., 
propositional structure) not the specific words 
(Anderson 1985; van Dijk and Kintsch 1983). 
Studies have also shown that individuals at
different levels of expertise will differentially 
represent a text (Patel and Kaufman 1998). 
For example, experts are more likely to selec- 
tively encode relevant propositional informa- 
tion that will inform a decision. On the other 
hand, non-experts will often remember more 
information, but much of the recalled infor- 
mation may not be relevant to the decision 
(Patel and Groen 1991a, b). 

Propositional representations constitute
an important construct in theories of comprehension.
Propositional knowledge can be
expressed using a predicate calculus formalism
or as a semantic network. The predicate
calculus representation is illustrated in Fig. 4.2.
A subject’s response is
divided into sentences or segments and
sequentially analyzed. The formalism
includes a head element of a segment and a 
series of arguments. For example, in proposi- 
tion 1.1, the focus is on a female who has the 
attributes of being 43 years of age and white. 
The TEM:ORD or temporal order relation 
indicates that the events of 1.3 (GI upset) pre- 
cede the event of 1.2 (diarrhea). The formal- 
ism is informed by an elaborate propositional 
language (Frederiksen 1975) and was first 
applied to the medical domain by Patel and 
her colleagues (Patel and Groen 1986). The 
method provides us with a detailed way to 
characterize the information subjects under- 
stood from reading a text, based on their sum- 
mary or explanations. 
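
To make the head-plus-arguments structure concrete, the propositions of Fig. 4.2 can be held in a small data structure. This is a minimal sketch under stated assumptions: the class name `Proposition`, the `(relation, value)` pair layout, and the placeholder relation name `ARG` for the two operands of TEM:ORD are all hypothetical choices for illustration and are not part of the published formalism. The relation labels (ATT, DEG, PAT, THM, TNS, DUR, LOC) are copied as they appear in the figure.

```python
from dataclasses import dataclass, field

@dataclass
class Proposition:
    label: str                                 # e.g. "1.2"
    head: str                                  # head element of the segment
    args: list = field(default_factory=list)   # (relation, value) pairs

segment = [
    Proposition("1.1", "female", [("ATT", "Age (old)"), ("DEG", "43 year"), ("ATT", "white")]),
    Proposition("1.2", "develop", [("PAT", "[she]"), ("THM", "diarrhea"), ("TNS", "past")]),
    Proposition("1.3", "period", [("ATT", "brief"), ("DUR", "2 days"), ("THM", "1.4")]),
    Proposition("1.4", "upset", [("LOC", "GI")]),
    # "ARG" is a placeholder relation name invented for this sketch.
    Proposition("1.5", "TEM:ORD", [("ARG", "1.3"), ("ARG", "1.2")]),
]

index = {p.label: p for p in segment}

def references(p):
    """Return the labels of other propositions that p embeds as arguments."""
    return [value for _, value in p.args if value in index]

print(references(index["1.3"]))  # ['1.4']

# TEM:ORD: the first referenced proposition precedes the second.
earlier, later = references(index["1.5"])
print(index[earlier].head, "precedes", index[later].head)  # period precedes develop
```

Storing propositions this way makes it straightforward to compare, for example, which propositions from a source text reappear in a subject's summary.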

Kintsch (1998) theorized that comprehen- 
sion involves an interaction between what 
the text conveys and knowledge in long-term 
memory. Comprehension occurs when the 
reader uses prior knowledge to process the 
incoming information presented in the text. 
The text information is called the textbase 
(the propositional content of the text). For 
instance, in medicine, the textbase could con- 
sist of the representation of a patient problem 
as written in a patient chart. The situation 
model is constituted by the textbase represen- 
tation plus the domain-specific and everyday 
knowledge that the reader uses to derive a 
broader meaning from the text. In medicine, 
the situation model would enable a physi- 
cian to draw inferences from a patient’s his- 
tory leading to a diagnosis, therapeutic plan 
or prognosis (Patel and Groen 1991a, b). This 
situation model is typically derived from the 
general knowledge and specific knowledge 
acquired through medical teaching, readings 
(e.g., theories and findings from biomedical
research), clinical practice (e.g., knowledge 
of associations between clinical findings and 
specific diseases, knowledge of medications 
or treatment procedures that have worked 
in the past) and the textbase representation. 
Like other forms of knowledge representa- 
tion, the situation model is used to “fit in” the 
incoming information (e.g., text, perception 
of the patient). Since the knowledge in LTM 
differs among physicians, the resulting situa- 
tion model generated by any two physicians is 
likely to differ as well. Theories and methods 
of text comprehension have been widely used 
in the study of medical cognition and have 
been instrumental in characterizing the pro- 
cess of guideline development and interpreta- 
tion (Peleg et al. 2006; Patel et al. 2014). 

Schemata represent higher-level knowl- 
edge structures. They can be construed as 
data structures for representing categories of 
concepts stored in memory (e.g., fruits, chairs, 
geometric shapes, and thyroid conditions). 
There are schemata for concepts underlying 
situations, events, sequences of actions and 
so forth. To process information with the use 
of a schema is to determine which model best 
fits the incoming information. Schemata have 
constants (all birds have wings) and variables 
(chairs can have between one and four legs). 
The variables may have associated default 
values (e.g., birds fly) that represent the proto- 
typical circumstance. 

When a person interprets information, the 
schema serves as a “filter” for distinguishing 
relevant and irrelevant information. Schemata 
can be considered as generic knowledge struc- 
tures that contain slots for particular kinds of 
propositions. For instance, a schema for myo- 
cardial infarction may contain the findings 
of “chest pain,” “sweating,” “shortness of 
breath,” but not the finding of “goiter,” which 
is part of the schema for thyroid disease. 
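
Since schemata are described above as data structures with constants, variables with default values, and slots that filter incoming information, a short sketch may help. Everything named here (the `Schema` class, its fields, and the `instantiate` and `screen` methods) is a hypothetical illustration rather than an established formalism; the example values come from the text.

```python
from dataclasses import dataclass, field

@dataclass
class Schema:
    name: str
    constants: dict = field(default_factory=dict)  # always true of the category
    defaults: dict = field(default_factory=dict)   # prototypical values of variables
    slots: set = field(default_factory=set)        # kinds of findings the schema accepts

    def instantiate(self, **observed):
        """Fill variables with observed values, falling back to defaults.

        Constants are applied last so that observations cannot override them.
        """
        return {**self.defaults, **observed, **self.constants}

    def screen(self, findings):
        """Split findings into those that fit the schema and those that do not."""
        relevant = [f for f in findings if f in self.slots]
        irrelevant = [f for f in findings if f not in self.slots]
        return relevant, irrelevant

bird = Schema("bird", constants={"has_wings": True}, defaults={"flies": True})
print(bird.instantiate())             # {'flies': True, 'has_wings': True}
print(bird.instantiate(flies=False))  # {'flies': False, 'has_wings': True}

mi = Schema("myocardial infarction",
            slots={"chest pain", "sweating", "shortness of breath"})
print(mi.screen(["chest pain", "goiter", "sweating"]))
# (['chest pain', 'sweating'], ['goiter'])
```

The second `instantiate` call shows a default being overridden by an observation (a bird that does not fly), while the constant remains fixed.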

The schematic and propositional represen- 
tations reflect abstractions and don’t neces- 
sarily preserve literal information about the 
external world. Imagine that you are having a 
conversation at the office about how to rear- 
range the furniture in your living room. To 
engage in such a conversation, one needs to 
be able to construct images of the objects and 
their spatial arrangement in the room. Mental
images are a form of internal representation 
that captures perceptual information recov- 
ered from the environment. There is compel- 
ling psychological and neuropsychological 
evidence to suggest that mental images consti- 
tute a distinct form of mental representation 
(Bartolomeo 2008). Images play a particularly 
important role in domains of visual diagnosis 
such as dermatology and radiology. 

Mental models are an analog-based con- 
struct for describing how individuals form 
internal models of systems. Mental mod- 
els are designed to answer questions such as 
“how does it work?” or “what will happen if 
I take the following action?” “Analogy” sug- 
gests that the representation explicitly shares 
the structure of the world it represents (e.g., 
a set of connected visual images of a partial 
road map from your home to your work des- 
tination). This contrasts with an abstraction- 
based form such as propositions or schemas in 
which the mental structure consists of either 
the gist, an abstraction, or summary repre- 
sentation. However, like other forms of men- 
tal representation, mental models are always 
incomplete, imperfect, and subject to the pro- 
cessing limitations of the cognitive system. 
Mental models can be derived from percep- 
tion, language, or one’s imagination (Payne 
2003). Running of a model corresponds to a 
process of mental simulation to generate pos- 
sible future states of a system from observed 
or hypothetical states. For example, when one 
initiates a Google Search, one may reasonably 
anticipate that the system will return a list of 
relevant (and less than relevant) websites that 
correspond to the query. Mental models are a 
particularly useful construct in understanding 
human-computer interaction. 

An individual’s mental models provide 
predictive and explanatory capabilities of the 
function of a physical system. More often the 
construct has been used to characterize mod- 
els that have a spatial and temporal context, 
as is the case in reasoning about the behavior 
of electrical circuits (White and Frederiksen 
1990). The model can be used to simulate a 
process (e.g., predict the effects of network 
interruptions on getting cash from an ATM 
machine). Kaufman, Patel and Magder (1996) 
characterized clinicians’ mental models of the 
cardiovascular system (specifically, cardiac 
output). The study characterized the develop- 
ment of an understanding of the system as a 
function of expertise. The research also docu- 
mented various conceptual flaws in subjects’ 
models and how these flaws impacted subjects’ 
predictions and explanations of physiological 
manifestations. Figure 4.3 illustrates the 
four chambers of the heart and blood flow in 
the pulmonary and cardiovascular systems. 
The claim is that clinicians and medical stu- 
dents have variably robust representations of 
the structure and function of the system. This 
model enables prediction and explanation of 
the effects of perturbations in the system on 
blood flow and on various clinical measures 
such as left ventricular ejection fraction. 

Fig. 4.3 Schematic model of circulatory and 
cardiovascular physiology. The diagram illustrates 
various structures of the pulmonary and systemic 
circulation system and the process of blood flow. 
The illustration is used to exemplify the concept of 
mental model and how it could be applied to 
explaining and predicting physiologic behavior 

Thus far, we have only considered 
domain-general ways of characterizing the 
organization of knowledge. To understand 
the nature of medical cognition, it is necessary 
to characterize the domain-specific nature of 
knowledge organization in medicine. Given 
the vastness and complexity of the domain of 
medicine, this can be a rather daunting task. 
There is no single way to represent all biomed- 
ical (or even clinical) knowledge, but it is an 
issue of considerable importance for research 
in biomedical informatics. Much research has 
been conducted in biomedical artificial intel- 
ligence to develop biomedical ontologies for 
use in knowledge-based systems (Ramoni 
et al. 1992). Patel et al. (1997) address this 
issue in the context of using empirical evi- 
dence from psychological experiments on 
medical expertise to test the validity of the 
AI systems. Developers of biomedical taxono- 
mies, nomenclatures, and vocabulary systems 
such as UMLS or SNOMED are engaged in a 
similar pursuit (see Chap. 7). 

We have employed an epistemological 
framework developed by Evans and Gadd 
(1989). They proposed a framework that 
serves to characterize the knowledge used for 
medical understanding and problem solving, 
and for differentiating the levels at which bio- 
medical knowledge may be organized. This 
framework represents a formalization of bio- 
medical knowledge as realized in textbooks 
and journals and can be used to provide us 
with insight into the organization of clinical 
practitioners’ knowledge (see Fig. 4.4). 

The framework consists of a hierarchical 
structure of concepts formed by clinical obser- 
vations at the lowest level, followed by findings, 
facets, and diagnoses. Clinical observations 
are units of information that are recognized 
as potentially relevant in the problem-solving 
context. However, they do not constitute clini- 
cally useful facts. Findings are composed of 
observations that have potential clinical sig- 
nificance. Establishing a finding reflects a 
decision made by a physician that an array 
of data contains a significant cue or cues that 
need to be considered. Facets consist of clus- 
ters of findings that indicate an underlying 
medical problem or class of problems. They 
reflect general pathological descriptions such 
as left-ventricular failure or thyroid condition. 
Facets resemble the kinds of constructs 
used by researchers in medical artificial 
intelligence to describe the partitioning of a 
problem space. They are interim hypotheses that 
serve to divide the information in the problem 
into sets of manageable sub-problems and to 
suggest possible solutions. Facets also vary in 
terms of their levels of abstraction. Diagnosis 
is the level of classification that subsumes and 
explains all levels beneath it. Finally, the 
systems level consists of information that serves 
to contextualize a problem, such as the ethnic 
background of a patient. 

Fig. 4.4 Epistemological frameworks representing 
the structure of medical knowledge for problem 
solving. (Level labels recoverable from the figure: 
Observation level, Finding level.) 
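
The layered framework described above (observations, findings, facets, diagnoses) can be sketched as a simple tree in which a node may only subsume nodes from lower levels. This is a minimal illustration under stated assumptions: the `Node` class, the `LEVELS` tuple, and the `leaves` helper are hypothetical names, the clinical labels are placeholders rather than clinical guidance, and the contextual systems level is left out for brevity.

```python
from dataclasses import dataclass, field

# Levels ordered from lowest to highest, following the framework in the text.
LEVELS = ("observation", "finding", "facet", "diagnosis")


@dataclass
class Node:
    label: str
    level: str
    supports: list = field(default_factory=list)  # lower-level nodes subsumed

    def __post_init__(self):
        if self.level not in LEVELS:
            raise ValueError(f"unknown level: {self.level}")
        for child in self.supports:
            # A node may only subsume nodes from strictly lower levels.
            if LEVELS.index(child.level) >= LEVELS.index(self.level):
                raise ValueError(f"{self.label} cannot subsume {child.label}")


def leaves(node):
    """Collect the observations that ultimately ground a node."""
    if not node.supports:
        return [node.label]
    return [label for child in node.supports for label in leaves(child)]


# Placeholder content, for illustration only.
obs_a = Node("needs several pillows to breathe at night", "observation")
obs_b = Node("crackling sounds at the lung bases", "observation")
orthopnea = Node("orthopnea", "finding", [obs_a])
rales = Node("rales", "finding", [obs_b])
lv_failure = Node("left-ventricular failure", "facet", [orthopnea, rales])
diagnosis = Node("example diagnosis (placeholder)", "diagnosis", [lv_failure])
grounding = leaves(diagnosis)
```

Walking down from the diagnosis to its leaves mirrors the claim that a diagnosis subsumes and explains all levels beneath it.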


4.4 Medical Cognition 

The study of expertise is one of the princi- 
pal paradigms in problem-solving research, 
which has been documented in a number of 
volumes in the literature (Sternberg and Ericsson 
1996; Ericsson 2009; Ericsson et al. 2018). 
Comparing experts to novices provides us 
with the opportunity to explore the aspects of 
performance that undergo change and result 
in increased problem-solving skill (Glaser 
2000). It also permits investigators to develop 
domain-specific models of competence that 
can be used for assessment and training 
purposes. 


A goal of this approach has been to char- 
acterize expert performance in terms of the 
knowledge and cognitive processes used in 
comprehension, problem solving, and decision 
making, using carefully developed laboratory 
tasks (Chi and Glaser 1981). deGroot’s 
(1965) pioneering research in chess represents 
one of the earliest characterizations of 
expert-novice differences. In one of his experiments, 
subjects were allowed to view a chess board for 
5-10 seconds and were then required to repro- 
duce the position of the chess pieces from 
memory. The grandmaster chess players were 
able to reconstruct the mid-game positions 
with better than 90% accuracy, while novice 
chess players could only reproduce approxi- 
mately 20% of the correct positions. When 
the chess pieces were placed on the board in a 
random configuration, not encountered in the 
course of a normal chess match, expert chess 
masters’ recognition ability fell to that of 
novices. This result suggests that superior rec- 
ognition ability is not a function of superior 
memory, but is a result of an enhanced abil- 
ity to recognize typical situations (Chase and 
Simon 1973). This phenomenon is accounted 
for by a process known as “chunking.” Patel 
and Groen (1991b) showed a similar 
phenomenon in medicine. The expert physicians were 
able to reconstruct patient summaries in an 
accurate manner when patient information 
was collected out of order (e.g., history, 
physical exam, lab results), as long as the pattern of 
information, even out of sequence, was 
familiar. When the sentences were placed out of 
order in a way that the pattern was unfamiliar, 
the expert physicians’ recognition ability was 
no better than that of the novices. 

It is well known that knowledge-based dif- 
ferences impact the problem representation 
and determine the strategies a subject uses 
to solve a problem. Simon and Simon (1978) 
compared a novice subject with an expert sub- 
ject in solving textbook physics problems. The 
results indicated that the expert solved the 
problems in one-quarter of the time required 
by the novice with fewer errors. The nov- 
ice solved most of the problems by working 
backward from the unknown problem solu- 
tion to the givens of the problem statement. 
The expert worked forward from the givens 
to solve the necessary equations and deter- 
mine the quantities they are asked to solve for. 
Differences in the directionality of reasoning 
by levels of expertise have been demonstrated 
in diverse domains from computer program- 
ming (Perkins et al. 1990) to medical diagno- 
sis (Patel and Groen 1986). 

The expertise paradigm spans the range 
of content domains including physics (Larkin 
et al. 1980), sports (Allard and Starkes 1991), 
music (Sloboda 1991), and medicine (Patel 
et al. 1994). Edited volumes (Ericsson 2006; 
Chi et al. 1988; Ericsson et al. 2018; Ericsson 
and Smith 1991; Hoffman 1992) provide an 
informative general overview of the area. This 
research has focused on differences between 
subjects varying in levels of expertise in terms 
of memory, reasoning strategies, and in par- 
ticular the role of domain-specific knowledge. 
Among the expert’s characteristics uncovered 
by this research are the following: (1) experts 
are capable of perceiving large patterns of 
meaningful information in their domain, 
which novices cannot perceive; (2) they are 
fast at processing and at deployment of dif- 
ferent skills required for problem solving; (3) 
they have superior short-term and long-term 
memories for materials (e.g., clinical findings 
in medicine) within their domain of expertise, 
but not outside of it; (4) they typically rep- 
resent problems in their domain at deeper, 
more principled levels whereas novices show 
a superficial level of representation; (5) they 
spend more time assessing the problem prior 
to solving it, while novices tend to spend more 
time working on the solution itself and little 
time in problem assessment; (6) individual 
experts may differ substantially in terms of 
exhibiting these kinds of performance char- 
acteristics (e.g., superior memory for domain 
materials). 

Usually, someone is designated as an 
expert based on a certain level of performance, 
as exemplified by Elo ratings in chess; by vir- 
tue of being certified by a professional licens- 
ing body, as in medicine, law, or engineering; 
on the basis of academic criteria, such as 
graduate degrees; or simply based on years of 
experience or peer evaluation (Hoffman et al. 
1995). The concept of an expert, however, 
refers to an individual who surpasses com- 
petency in a domain (Sternberg and Horvath 
1999). Although competent performers, for 
instance, may be able to encode relevant infor- 
mation and generate effective plans of action 
in a specific domain, they often lack the speed 
and the flexibility that we see in an expert. A 
domain expert (e.g., a medical practitioner) 
possesses an extensive, accessible knowledge 
base that is organized for use in practice and 
is tuned to the particular problems at hand. In 
the study of medical expertise, it has been use- 
ful to distinguish different types of expertise. 

Patel and Groen (1991a, b) distinguished 
between general and specific expertise, a dis- 
tinction supported by research indicating 
differences between subexperts (i.e., expert 
physicians who solve a case outside their field 
of specialization) and experts (i.e., domain 
specialists) with respect to reasoning 
strategies and organization of knowledge. General 
expertise corresponds to expertise that cuts 
across medical subdisciplines (e.g., general 
medicine). Specific expertise results from 
having extensive experience within a medical 
subdomain, such as cardiology or 
endocrinology. An individual may possess both or only 
general expertise. 

The development of expertise can follow 
a somewhat unusual trajectory. It is often 
assumed that the path from novice to expert 
goes through a steady process of gradual 
accumulation of knowledge and fine-tuning 
of skills. That is, as a person becomes more 
familiar with a domain, his or her level of 
performance (e.g., accuracy, quality) gradually 
increases. However, research has shown that 
this assumption is often incorrect (Lesgold 
et al. 1988; Patel et al. 1994). Cross-sectional 
studies of experts, intermediates, and novices 
have shown that people at intermediate levels 
of expertise may perform more poorly than 
those at a lower level of expertise on some 
tasks. 

Furthermore, there is a longstanding body 
of research on learning that has suggested that 
the learning process involves phases of error- 
filled performance followed by periods of sta- 
ble, comparatively error-free performance. In 
other words, human learning does not consist 
of the gradually increasing accumulation of 
knowledge and fine-tuning of skills. Rather, 
it requires the arduous process of continu- 
ally learning, re-learning, and exercising new 
knowledge, punctuated by periods of an appar- 
ent decrease in mastery and declines in perfor- 
mance, which may be necessary for learning to 
take place. Figure 4.5 presents an illustration 
of this learning and development phenomenon 
known as the intermediate effect. 

Fig. 4.5 Schematic representation of intermediate 
effect. The straight line gives a commonly assumed 
representation of performance development by level of 
expertise. The curved line represents the actual 
development from novice to expert. The Y-axis may represent 
any of a number of performance variables such as the 
number of errors made, number of concepts recalled, 
number of conceptual elaborations, or number of 
hypotheses generated in a variety of tasks. (Axis labels 
in the figure: Performance Level against Development, 
from Novice through Intermediate to Expert; the two 
curves are labeled expected and actual.) 

The intermediate effect has been found 
in a variety of tasks and with a great 
number of performance indicators. The tasks used 
include comprehension and explanation of 
clinical problems, doctor-patient communication, 
recall and explanation of laboratory data, 
generation of diagnostic hypotheses, and prob- 
lem solving (Patel and Groen 1991a, b). The 
performance indicators used have included 
recall and inference of medical-text 
information, recall and inference of diagnostic 
hypotheses, generation of clinical findings 
from a patient in doctor-patient interaction, 
and requests for laboratory data, among oth- 
ers. The research has also identified devel- 
opmental levels at which the intermediate 
phenomenon occurs, including senior medi- 
cal students and residents. It is important to 
note, however, that in some tasks, the develop- 
ment is monotonic. For instance, in diagnostic 
accuracy, there is a gradual increase, with an 
intermediate exhibiting a greater degree of 
accuracy than the novice and the expert dem- 
onstrating a still greater degree than the inter- 
mediate. Furthermore, when the relevancy 
of the stimuli to a problem is considered, an 
appreciable monotonic phenomenon appears. 
For instance, in recall studies, novices, inter- 
mediates, and experts are assessed in terms 
of the total number of propositions recalled, 
showing the typical non-monotonic effect. 
However, when propositions are divided in 
terms of their relevance to the problem (e.g., 
a clinical case), experts recall more relevant 
propositions than intermediates and novices, 
suggesting that intermediates have difficulty 
separating what is relevant from what is not. 
During the periods when the intermediate 
effect occurs, a reorganization of knowledge 
and skills takes place, characterized by shifts 
in perspectives or a realignment or creation of 
goals. The intermediate effect is also partly due 
to the unintended changes that take place as 
the person reorganizes for intended changes. 
People at intermediate levels typically gen- 
erate a great deal of irrelevant information 
and seem incapable of discriminating what 
is relevant from what is not. As compared 
to a novice student (Fig. 4.6), the 
reasoning pattern of an intermediate student shows 
the generation of long chains of discussion 
evaluating multiple hypotheses and reasoning 
in haphazard directions (Fig. 4.7). The 
well-structured knowledge of a senior-level 
student leads him more directly to a 
solution (Fig. 4.8). Thus, the intermediate 
effect can be explained as a function of the 
learning process, maybe as a necessary phase 
of learning. Identifying the factors involved in 
the intermediate effect may help in improving 
performance during learning (e.g., by 
designing decision-support systems or intelligent 
tutoring systems that help the user in focusing 
on relevant information). 

Fig. 4.6 Problem interpretations by a novice 
medical student. The given information from patient 
problem is represented on the right side of the figure and the 
new generated information is given on the left side, 
information in the box represents diagnostic hypothesis. 
Intermediate hypotheses are represented as solid dark 
circles (filled). Forward driven or data driven inference 
arrows are shown from left to right (solid dark line). 
Backward or hypothesis driven inference arrows are 
shown from right to left (solid light line). Thick solid 
dark line represents rule out strategy 

Fig. 4.7 Problem interpretations by an intermediate medical student 


The intermediate effect is not a one-time 
phenomenon. Rather, it repeatedly occurs at 
strategic points in a student or physician’s 
training and follows periods in which large 
bodies of new knowledge or complex skills 
are acquired. These periods are followed by 
intervals in which there is a decrement in 
performance until a new level of mastery is 
achieved. 

Fig. 4.8 Problem interpretations by a senior medical student 

4.4.1 Expertise in Medicine 


The systematic investigation of medical 
expertise began more than 60 years ago with 
research by Ledley and Lusted (1959) into the 
nature of clinical inquiry. They proposed a 
two-stage model of clinical reasoning involv- 
ing a hypothesis-generation stage, followed by 
a hypothesis-evaluation stage. This latter stage 
is most amenable to formal decision ana- 
lytic techniques. The earliest empirical stud- 
ies of medical expertise can be traced to the 
works of Rimoldi (1961) and Kleinmuntz and 
McLean (1968) who conducted experimental 
studies of diagnostic reasoning by contrast- 
ing students with medical experts in simulated 
problem-solving tasks. The results empha- 
sized the greater ability of expert physicians to 
attend to relevant information selectively and 
narrow the set of diagnostic possibilities (i.e., 
consider fewer hypotheses). 

The origin of contemporary research on 
medical thinking is associated with the semi- 
nal work of Elstein, Shulman, and Sprafka 
(1978) who studied the problem-solving 
processes of physicians by drawing on then- 
contemporary methods and theories of cog- 
nition. This model of problem-solving has 
had a substantial influence both on studies 
of medical cognition and medical education. 
They were the first to use experimental meth- 
ods and theories of cognitive science to inves- 
tigate clinical competency. 


Their research findings led to the develop- 
ment of an elaborated model of hypothetico- 
deductive reasoning, which proposed that 
physicians reasoned by first generating and 
then testing a set of hypotheses to account for 
clinical data (i.e., reasoning from hypothesis 
to data). First, physicians generated a small 
set of hypotheses very early in the case, as 
soon as the first pieces of data became avail- 
able. Second, physicians were selective in the 
data they collected, focusing only on the rel- 
evant data. Third, physicians made use of a 
hypothetico-deductive method of diagnostic 
reasoning (Elstein et al. 1978). 

The previous research was largely mod- 
eled after early problem-solving stud- 
ies in knowledge-lean tasks. Medicine is 
a knowledge-rich domain, and a different 
approach was needed. Feltovich, Johnson, 
Moller, and Swanson (1984), drawing on 
models of knowledge representation from 
medical artificial intelligence, character- 
ized fine-grained differences in knowledge 
organization between subjects of different 
levels of expertise in the domain of pediat- 
ric cardiology. Patel and colleagues studied 
the knowledge-based solution strategies of 
expert cardiologists as evidenced by their 
pathophysiological explanations of a complex 
clinical problem (Patel and Groen 1986). The 
results indicated that subjects who accurately 
diagnosed the problem employed a 
forward-directed (data-driven) reasoning strategy, 
using patient data to lead toward a complete 
diagnosis (i.e., reasoning from data to hypoth- 
esis). This is in contrast to subjects who mis- 
diagnosed or partially diagnosed the patient 
problem. They tended to use a backward or 
hypothesis-driven reasoning strategy. 

Patel and Groen (1991a, b) investigated the 
nature and directionality of clinical reasoning 
in a range of contexts of varying complex- 
ity. The objectives of this research program 
were both to advance our understanding of 
medical expertise and to devise more effec- 
tive ways of teaching clinical problem solving. 
It has been established that the patterns of 
data-driven and hypothesis-driven reasoning 
are used differentially by novices and experts. 
Experts tend to use data-driven reasoning, 
which depends on the physician possessing a 
highly organized knowledge base about the 
patient’s disease (including sets of signs and 
symptoms). Because of their lack of substan- 
tive knowledge or their inability to distinguish 
relevant from irrelevant knowledge, novices 
and intermediates use more hypothesis-driven 
reasoning, often resulting in very complex 
reasoning patterns. The fact that experts and 
novices reason differently suggests that they 
might reach different conclusions (e.g., deci- 
sions or understandings) when solving medi- 
cal problems. Similar patterns of reasoning 
have been found in other domains (Larkin 
et al. 1980). Due to their extensive knowledge 
base and the high-level inferences they make, 
experts typically skip steps in their reasoning. 

Although experts typically use data-driven 
reasoning during clinical performance, this 
type of reasoning sometimes breaks down, and 
the expert must resort to hypothesis-driven 
reasoning. Although data-driven reasoning is 
highly efficient, it is often error-prone in the 
absence of adequate domain knowledge, since 
there are no built-in checks on the legitimacy 
of the inferences that a person makes. Pure 
data-driven reasoning is only successful in 
constrained situations, where one’s knowledge 
of a problem can result in a complete chain 
of inferences from the initial problem state- 
ment to the problem solution, as illustrated 
in Fig. 4.9. In contrast, hypothesis-driven 
reasoning is slower and may make heavy 
demands on working memory, because one 
must keep track of goals and hypotheses. It is, 
therefore, most likely to be used when domain 
knowledge is inadequate, or the problem is 
complex. Hypothesis-driven reasoning is usu- 
ally exemplary of a weak method of problem 
solving in the sense that it is used in the absence 
of relevant prior knowledge and when there is 
uncertainty about problem solution. In prob- 
lem-solving terms, strong methods engage 
knowledge, whereas weak methods refer to 
general strategies. Weak does not necessarily 
imply ineffectual in this context. However, 
hypothesis-driven reasoning may be more 
conducive to the novice learning experience in 
that it can guide the organization of 
knowledge (Patel et al. 1990). 

Fig. 4.9 Diagrammatic representation of 
data-driven (top down) and hypothesis-driven (bottom-up) 
reasoning. From the presence of vitiligo, a prior history 
of progressive thyroid disease, and examination of the 
thyroid (clinical findings on the left side of figure), the 
physician reasons forward to conclude the diagnosis of 
Myxedema (right of figure). However, the anomalous 
finding of respiratory failure, which is inconsistent with 
the main diagnosis, is accounted for as a result of a 
hypometabolic state of the patient, in a 
backward-directed fashion. COND: refers to a conditional 
relation; CAU: indicates a causal relation; and RSLT: 
identifies a resultive relation. (Chain recoverable from 
the figure: Hypometabolic state, CAU, Hypoventilation, 
CAU, Respiratory failure.) 
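
The contrast between data-driven and hypothesis-driven reasoning can be sketched with a toy rule base loosely modeled on the myxedema example in the caption of Fig. 4.9. This is a hedged illustration, not the authors' method or clinical guidance: the rule set `RULES`, the label "thyroid examination finding", and the function names are assumptions made for the example.

```python
# Each rule pairs a set of premises with a conclusion (illustrative only).
RULES = [
    ({"vitiligo", "progressive thyroid disease", "thyroid examination finding"},
     "myxedema"),
    ({"myxedema"}, "hypometabolic state"),
    ({"hypometabolic state"}, "hypoventilation"),
    ({"hypoventilation"}, "respiratory failure"),
]


def forward_chain(facts, rules):
    """Data-driven: fire rules whose premises hold until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules, depth=0, max_depth=10):
    """Hypothesis-driven: start from a goal and search for supporting premises."""
    if goal in facts:
        return True
    if depth >= max_depth:  # guard against circular rule sets
        return False
    return any(
        all(backward_chain(p, facts, rules, depth + 1, max_depth) for p in premises)
        for premises, conclusion in rules
        if conclusion == goal
    )


findings = {"vitiligo", "progressive thyroid disease", "thyroid examination finding"}
derived = forward_chain(findings, RULES)
explained = backward_chain("respiratory failure", {"myxedema"}, RULES)
unexplained = backward_chain("respiratory failure", {"vitiligo"}, RULES)
```

The forward pass needs no goal and simply follows the data, whereas the backward pass has to carry a stack of pending subgoals, which is one way to picture the working-memory cost attributed to hypothesis-driven reasoning above.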

In the more recent literature, described 
in a chapter by Patel and colleagues (2013a), 
two forms of human reasoning that are more 
widely accepted are deductive and inductive 
reasoning. Deductive reasoning is a process of 
reaching specific conclusions (e.g., a diagno- 
sis) from a hypothesis or a set of hypotheses, 
whereas inductive reasoning is the process 
of generating possible conclusions based on 
available data, such as data from a patient. 
However, when reasoning in real-world clini- 
cal situations, it is too simplistic to think of 
reasoning with only these two strategies. A 
third form of reasoning, abductive, which 
combines deductive and inductive 
reasoning, was proposed (Peirce 1955). A physician 
developing and testing explanatory 
hypotheses based on a set of heuristics may be 
considered to be engaged in abductive reasoning 
(Magnani 2001). Thus, abductive reasoning is a 
process in which a set of hypotheses is identified 
and then each of these hypotheses is evaluated on 
the basis of its potential consequences (Elstein et al. 
1978; Ramoni et al. 1992). This makes 
abductive reasoning a data-driven process that relies 
heavily on the domain expertise of the person. 
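
The generate-then-evaluate cycle described above can be sketched as follows. This is an illustrative simplification under stated assumptions: the table `EXPECTED`, the finding "fatigue", and the coverage score are hypothetical choices made for the example and do not come from the cited studies.

```python
# Hypothetical mapping from hypotheses to the findings they would predict.
EXPECTED = {
    "myocardial infarction": {"chest pain", "sweating", "shortness of breath"},
    "thyroid disease": {"goiter", "fatigue"},
}


def generate(observed, expected):
    """Generation: propose each hypothesis that shares a finding with the data."""
    return [h for h, findings in expected.items() if findings & observed]


def evaluate(hypothesis, observed, expected):
    """Testing: deduce the predicted findings and score the fraction observed."""
    consequences = expected[hypothesis]
    return len(consequences & observed) / len(consequences)


def abduce(observed, expected):
    """Rank the generated hypotheses by how well their consequences are met."""
    candidates = generate(observed, expected)
    return sorted(candidates,
                  key=lambda h: evaluate(h, observed, expected),
                  reverse=True)


observed = {"chest pain", "sweating", "fatigue"}
ranking = abduce(observed, EXPECTED)
```

Generation here runs from data to hypotheses and evaluation runs from hypotheses back to data, which is the sense in which the abductive cycle combines the two directions of inference.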

During the testing phase, hypotheses are 
evaluated by their ability to account for the 
current problem. Deduction helps in building 
up the consequences of each hypothesis, and 
this kind of reasoning is customarily regarded 
as a common way of evaluating diagnostic 
hypotheses (Joseph and Patel 1990; Kassirer 
1989; Patel et al. 1994; Patel, Evans, and 
Kaufman 1989). All these types of inferences 
play different roles in the hypothesis genera- 
tion and testing phases (Patel and Ramoni 
1997; Peirce 1955). Our inherent ability 
to adapt to different kinds of knowledge 
domains, situations, and problems requires 
the use of a variety of reasoning modes, and 
this process describes the notion of abductive 
medical reasoning (Patel and Ramoni 1997). 
In contrast, novices and intermediate sub- 
jects (e.g., medical trainees) are more likely 
to employ more deliberative, effortful, and 
cognitively taxing forms of reasoning that can 
resemble hypothetico-deductive methods. As 
problems increase in complexity and 
uncertainty, expert clinicians resort to hybrid 
forms of reasoning that may include substan- 
tial backward-directed reasoning. 

The study of medical cognition has been 
summarized in a series of articles (Patel et al. 
1994, 2018) and edited volumes (e.g., Evans 
and Patel 1989). In more recent times, medi- 
cal cognition is discussed in the context of 
informatics and in the new field of inves- 
tigation, cognitive informatics (Patel and 
Kannampallil 2015; Patel et al. 2014, 2015b, 
2017). Furthermore, foundations of cognition 
also play a significant role in investigations 
of HCI, including human factors and patient 
safety. Details of HCI in biomedicine are 
covered in Chap. 5. 


4.5 Human Factors Research and Patient Safety 


“Human error in medicine and the adverse 
events which may follow are problems of 
psychology and engineering not of medicine” 
(Senders 1993, cited in Woods et al. 2008). 


Human factors research is a discipline devoted 
to the study of technology systems and how 
people work with them or are impacted by 
these technologies (Henriksen 2010). Human 
factors research discovers and applies infor- 
mation about human behavior, abilities, limi- 
tations, and other characteristics to the design 
of tools, machines, systems, tasks, jobs, 
and environments for productive, safe, com- 
fortable, and effective human use (Chapanis 
1996). In the context of healthcare, human 
factors are concerned with the full comple- 
ment of technologies and systems used by a 
diverse range of individuals including clini- 
cians, hospital administrators, health con- 
sumers and patients (Flin and Patey 2009). 
Human factors work approaches the study of 
health practices from several perspectives or 
levels of analysis. A full exposition of human 
factors in medicine is beyond the scope of 
this chapter. For a detailed treatment of these 
issues, the reader is referred to the Handbook 
of Human Factors and Ergonomics in Health 
Care and Patient Safety (Carayon et al. 2011). 
The focus in this chapter is on cognitive work 
in human factors and healthcare, particularly 
in relation to patient safety. We recognize that 
patient safety is a systemic challenge at mul- 
tiple levels of aggregation beyond the individ- 
ual. It is clear that understanding, predicting, 
and transforming human performance in any 
complex setting requires a detailed under- 
standing of both the setting and the factors 
that influence performance (Woods et al. 
2008). 

Our objective in this section is to introduce 
a theoretical foundation, establish important 
concepts, and discuss illustrative research 
in patient safety. The field of human fac- 
tors is guided by principles of engineering 
and applied cognitive psychology (Chapanis 
1996). Human factors analysis applies knowl- 
edge about the strengths and limitations of 
humans to the design of interactive systems 
and their environment. The objective is to 
ensure their effectiveness, safety, and ease of 
use. Mental models and issues of decision 
making are central to human-factors analysis. 
Any system will be easier and less burdensome 
to use to the extent that it is co-extensive with 
users’ mental models. The different dimen- 
sions of cognitive capacity, including memory, 
attention, and workload are central to human- 
factor analyses. Our perceptual system inun- 
dates us with more stimuli than our cognitive 
systems can process. Attentional mechanisms 
enable us to selectively prioritize and attend 
to certain stimuli and attenuate other ones. 
They also have the property of being sharable, 
which enables us to multitask by dividing our 
attention between two activities. For example, 
if we are driving on a highway, we can eas- 
ily have a conversation with a passenger at 
the same time. However, if the skies darken, 
the weather changes, or we suddenly find 
ourselves driving on winding mountain 
roads, we will have to allocate more of 
our attentional resources to driving and less 
to the conversation. 

Human factors research leverages theories 
and methods from cognitive engineering to 
characterize human performance in complex 
settings and challenging situations in aviation, 
industrial process control, military command 
and control, and space operations (Woods et al. 
2008). The research has elucidated empiri- 
cal regularities and provides explanatory 
concepts and models of human performance. 
This allows us to derive common underlying 
patterns in somewhat disparate settings. 


4.5.1 Patient Safety 


Patient safety refers to the prevention of 
healthcare errors, and the elimination or miti- 
gation of patient injury caused by healthcare 
errors (Patel and Zhang 2007). It has been 
an issue of considerable concern for the past 
quarter-century, but the greater community 
was galvanized by the National Academy 
of Medicine report “To Err is Human,” 
(Kohn et al. 2000) and by a follow-up report, 
“Improving Diagnosis in Health Care” 
(Balogh et al. 2015). The 2000 report commu- 
nicated the surprising fact that up to 98,000 
preventable deaths every single year in the 
United States are attributable to human error, 
which makes it the 8th leading cause of death 
in this country. Although one may argue over 
the specific numbers, there is no disputing that 
too many patients are harmed or die every 
year as a result of human actions or absence 
of action. 

We can only analyze errors after they have 
happened, and they often seem to be glar- 
ing blunders after the fact. This leads to the 
assignment of blame or searches for a single 
cause of the error. However, in hindsight, it is 
exceedingly difficult to recreate the situational 
context, stress, shifting attention demands, 
and competing goals that characterized a 
situation prior to the occurrence of an error. 
This sort of retrospective analysis is subject 
to hindsight bias. Hindsight bias masks the 
dilemmas, uncertainties, demands, and other 
latent conditions that were operative before 
the mishap. Too often the term ‘human error’ 
connotes blame and a search for the guilty 
culprits, suggesting some sort of human 
deficiency or irresponsible behavior. Human 
factors researchers recognize that this approach 
to error is inherently incomplete and potentially 
misleading. They argue for the need 
for a more comprehensive systems-centered 
approach that recognizes that error could be 
attributed to a multitude of factors as well as 
the interaction of these factors. Error is the 
failure of a planned sequence of mental or 
physical activities to achieve its intended out- 
come when these failures cannot be attributed 
to chance (Patel and Zhang 2007; Reason 
1990). Reason (1990) introduced an impor- 
tant distinction between latent and active 
failures. Active failure represents the face of 
error. The effects of active failure are immedi- 
ately felt. In healthcare, active errors are com- 
mitted by providers such as nurses, physicians, 
or pharmacists who are actively responding to 
patient needs at the “sharp end”. The latent 
conditions are less visible but equally 
important. Latent conditions are enduring systemic 
problems that may not be evident for some 
time and that combine with other system problems 
to weaken the system’s defenses and make errors 
possible. There is a lengthy list of potential 
latent conditions including poor interface 
design of important technologies, communi- 
cation breakdown between key actors, gaps in 
supervision, inadequate training, and absence 
of a safety culture in the workplace—a cul- 
ture that emphasizes safe practices and the 
reporting of any conditions that are poten- 
tially dangerous. 

Zhang, Patel, Johnson, and Shortliffe 
(2004) developed a taxonomy of errors par- 
tially based on the distinctions proposed 
by Reason (1990). They further classified 
errors in terms of slips and mistakes (Reason 
1990). A slip occurs when the actor selected 
the appropriate course of action, but it was 
executed inappropriately. A mistake involves 
an inappropriate course of action reflecting 
an erroneous judgment or inference (e.g., a 
wrong diagnosis or misreading of an x-ray). 
Mistakes may either be knowledge-based 
owing to factors such as incorrect knowl- 
edge, or they may be rule-based, in which 
case the correct knowledge was available, but 
there was a problem in applying the rules or 
guidelines. They further characterize medical 
errors as a progression of events. There is a 
period when everything is operating smoothly. 
Then an unsafe practice unfolds resulting in 
a kind of error, but not necessarily leading 
to an adverse event. For example, if there is 
a system of checks and balances that is part 
of routine practice or if there is a systematic 
supervisory process in place, the vast majority 
of errors will be trapped and defused in this 
middle zone. If these measures or practices are 
not in place, an error can propagate and cross 
the boundary to become an adverse event. At 
this point, the patient has been harmed. In 
addition, if an individual is subject to a heavy 
workload or intense time pressure, then that 
will increase the potential for an error, result- 
ing in an adverse event. 
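
As a minimal sketch, the taxonomy and the progression of events described above might be encoded as follows. The class names, the two Boolean fields, and the pharmacist_review check are illustrative assumptions; they are not drawn from Zhang et al. (2004) or Reason (1990), and real error classification requires far richer context than two flags.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable


class ErrorType(Enum):
    SLIP = "slip"                                        # right plan, faulty execution
    RULE_BASED_MISTAKE = "rule-based mistake"            # knowledge available, rule misapplied
    KNOWLEDGE_BASED_MISTAKE = "knowledge-based mistake"  # incorrect knowledge


class Outcome(Enum):
    TRAPPED = "trapped"              # caught by a defense in the "middle zone"
    ADVERSE_EVENT = "adverse event"  # crossed the boundary; patient harmed


@dataclass(frozen=True)
class ErrorEvent:
    description: str
    plan_was_appropriate: bool   # hypothetical field: was the intended action correct?
    knowledge_was_correct: bool  # hypothetical field: was correct knowledge available?


def classify(event: ErrorEvent) -> ErrorType:
    """Apply the slip/mistake distinction to a single event."""
    if event.plan_was_appropriate:
        return ErrorType.SLIP
    if event.knowledge_was_correct:
        return ErrorType.RULE_BASED_MISTAKE
    return ErrorType.KNOWLEDGE_BASED_MISTAKE


def propagate(event: ErrorEvent,
              defenses: Iterable[Callable[[ErrorEvent], bool]]) -> Outcome:
    """An error becomes an adverse event only if no defense traps it."""
    if any(catches(event) for catches in defenses):
        return Outcome.TRAPPED
    return Outcome.ADVERSE_EVENT


def pharmacist_review(event: ErrorEvent) -> bool:
    # Hypothetical check standing in for a routine supervisory process.
    return "dose" in event.description


keyed_wrong_dose = ErrorEvent("tenfold dose keyed into pump", True, True)
print(classify(keyed_wrong_dose).value)
print(propagate(keyed_wrong_dose, [pharmacist_review]).value)
print(propagate(keyed_wrong_dose, []).value)
```

With the review in place the slip is trapped; with no defenses the same slip propagates to an adverse event, which is the point of the middle zone in the progression.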

The notion that human error should not 
be tolerated is prevalent in both the public 
and personal perception of the performance 
of most clinicians. However, researchers in 
other safety-critical domains have long since 
abandoned the quest for zero defect, citing it 
as an impractical goal, and choosing to focus 
instead on the development of strategies to 
enhance the ability to recover from error 
(Morel et al. 2008). Patel and her colleagues 
conducted empirical investigations into error 
detection and recovery by experts (attending 
physicians) and non-experts (resident train- 
ees) in the critical care domain, using both 
laboratory-based and naturalistic approaches 
(Patel et al. 2011). These studies show that 
expertise is more closely tied to the ability 
to detect and recover from errors and not so 
much to the ability not to make errors. The 
study results show that both the experts and 
non-experts are prone to commit and recover 
from errors, but experts’ ability to detect and 
recover from knowledge-based errors is better 
than that of trainees. Error detection and 
correction in complex real-time critical care 
situations appears to induce a certain urgency for 
quick action in a high-alert condition, resulting 
in rapid detection and correction. Studies 
on expertise and understanding of the limits 
and failures of human decision-making are 
important if we are to build robust decision- 
support systems to manage the boundaries of 
risk of error in decision making (Patel et al. 
2015a; Patel and Cohen 2008). Research on 
situational complexity and medical errors 
is documented in a recent book by Patel, 
Kaufman, and Cohen (2014). 

4.5.2 Unintended Consequences 


It is widely believed that health information 
technologies have the potential to transform 
healthcare in a multitude of ways, including 
the reduction of errors. However, it is increas- 
ingly apparent that technology-induced errors 
are deeply consequential and have had delete- 
rious consequences for patient safety. 

There is evidence to suggest that a poorly 
designed user interface can present substan- 
tial challenges even for the well-trained and 
highly skilled user (Zhang et al. 2003). Lin 
et al. (1998) conducted a series of studies on a 
patient-controlled analgesic or PCA device, a 
method of pain relief that uses disposable or 
electronic infusion devices and allows patients 
to self-administer analgesic drugs as required. 
Lin and colleagues compared two interfaces 
to a commonly used PCA device: the original 
interface and a redesigned one. A cognitive 
task analysis (CTA) showed the existing PCA 
interface to be problematic in several ways. 
For example, the structure of many subtasks 
in the programming sequence was unnecessarily 
complex. The screen also lacked information 
that would provide meaningful feedback and 
structure the user experience (e.g., negotiating 
the next steps). A nurse, for instance, would 
not know that he or she was on the third 
of five screens or halfway through the task. 
Based on the CTA, Lin et al. (1998) redesigned 
the interface according to sound human factors 
principles and demonstrated significant 
improvements in efficiency, error rate, and 
reported workload. 

Zhang and colleagues employed a modified 
heuristic evaluation method (see Sect. 
4.5, above) to test the safety of two infusion 
pumps (Zhang et al. 2003). Based on an analysis 
by four evaluators, a total of 192 violations 
in the user interface design were documented. 
Consistency and visibility (the ease 
with which a user can discern the system state) 
were the most widely documented violations. 
Several of the violations were classified as 
problems of substantial severity. Their results 
suggested that one of the two pumps was 
likely to induce more medical errors than the 
other. 
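
To illustrate how findings from a heuristic evaluation of this kind might be tabulated, the following minimal sketch tallies violations by heuristic and counts the severe ones per device. The records, the pump labels, and the 1-to-4 severity scale are hypothetical assumptions made for the example; they are not the data or the rating scheme reported by Zhang et al. (2003).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    pump: str       # device in which the problem was found
    heuristic: str  # usability heuristic that was violated
    severity: int   # assumed scale: 1 = cosmetic ... 4 = catastrophic


def tally_by_heuristic(violations):
    """Count how often each heuristic was violated."""
    return Counter(v.heuristic for v in violations)


def severe_count_by_pump(violations, threshold=3):
    """Count violations at or above the severity threshold, per pump."""
    return Counter(v.pump for v in violations if v.severity >= threshold)


# Hypothetical findings pooled across evaluators.
findings = [
    Violation("pump A", "consistency", 3),
    Violation("pump A", "visibility", 4),
    Violation("pump A", "consistency", 2),
    Violation("pump B", "visibility", 1),
    Violation("pump B", "feedback", 3),
]

print(tally_by_heuristic(findings).most_common())
print(severe_count_by_pump(findings))
```

Comparing the per-pump counts of severe violations is one simple way to express the kind of judgment reported above, namely that one device is more likely than the other to induce errors.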

It is clear that usability problems are con- 
sequential and have the potential to impact 
patient safety. Kushniruk et al. (2005) exam- 
ined the relationship between particular kinds 
of usability problems and errors in a handheld 
prescription writing application. They found 
that particular usability problems were associated 
with the occurrence of an error in entering 
the medication. For example, the problem 
of inappropriate default values automatically 
populating the screen was found to be correlated 
with the entry of wrong medication dosages. 
In addition, certain types of errors were 
associated with mistakes (not detected by 
users), while others were associated with 
slips (unintentional errors). Horsky 
et al. (2005) analyzed a problematic medica- 
tion order placed using a CPOE system that 
resulted in an overdose of potassium chloride 
being administered to an actual patient. The 
authors used a range of investigative methods 
including inspection of system logs, semi- 
structured interviews, the examination of the 
electronic health record, and cognitive evalua- 
tion of the order entry system involved. They 
found that the error was due to a confluence 
of factors including problems associated with 
the display, the labeling of functions, and 
ambiguous presentation of the dates on which 
medication was administered. The poor interface 
design did not assist with the decision-making 
process, and in fact, its design served as a hin- 
drance, where the interface was a poor fit for 
the conceptual operators utilized by clinicians 
when calculating medication dosage (i.e., 
based on volume, not duration). 

Koppel et al. (2005) published an influential 
study examining how computerized provider 
order entry (CPOE) systems facilitated medical 
errors. The study, which was published 
in JAMA (Journal of the American Medical 
Association), used a series of methods includ- 
ing interviews with clinicians, observations, 
and a survey to document the range of errors. 
According to the authors, the system facili- 
tated 22 types of medication errors, and many 
of them occurred with some frequency. The 
errors were classified into two broad cat- 
egories: (1) information errors generated by 
fragmentation of data and failure to inte- 
grate the hospital’s information systems and 
(2) human-machine interface flaws reflecting 
machine rules that do not correspond to work 
organization or usual behaviors. 

The growing body of research on unin- 
tended consequences spurred the American 
Medical Informatics Association to devote 
a policy meeting to consider ways to under- 
stand and diminish their impact (Bloomrosen 
et al. 2011). The matter is especially press- 
ing given the increased implementation of 
health information technologies nationwide, 
including ambulatory care practices that 
have little experience with health information 
technologies. The authors outline a series of 
recommendations, including a need for more 
cognitively-oriented research to guide the 
study of the causes and mitigation of unin- 
tended consequences resulting from health 
information technology implementations. 
These changes could facilitate improved man- 
agement of those consequences, resulting in 
enhanced performance, patient safety, as well 
as greater user acceptance. 


4.5.3 Distributed Cognition 
and Electronic Health Records 


In this chapter, we have considered a classi- 
cal model of information-processing cogni- 
tion in which mental representations mediate 
all activity and constitute the central units 
of analysis. The analysis emphasizes how an 
individual formulates internal representations 
of the external world. To illustrate the point, 
imagine an expert user of a word processor 
who can effortlessly negotiate tasks through 
a combination of key commands and menu 
selections. The traditional cognitive analysis 
might account for this skill by suggesting that 
the user has formed an image or schema of the 
layout structure of each of eight menus, and 
retrieves this information from memory each 
time an action is to be performed. For exam- 
ple, if the goal is to “insert a clip art icon,” 
the user would recall that this is subsumed 
under pictures that are the ninth item on the 
“Insert” menu and then execute the action, 
thereby achieving the goal. However, there 
are some problems with this model. Mayes, 
Draper, McGregor, and Oatley (1988) demonstrated 
that even highly skilled users could 
not recall the names of menu headers, yet they 
could routinely make fast and accurate menu 
selections. The results indicate that many or 
even most users relied on cues in the display 
to trigger the right menu selections. This sug- 
gests that the display can have a central role 
in controlling interaction in graphical user 
interfaces. 

As discussed, the conventional 
information-processing approach has come 
under criticism for its narrow focus on the 
rational/cognitive processes of the solitary 
individual. In the previous section, we consid- 
ered the relevance of external representations 
to cognitive activity. The emerging perspec- 
tive of distributed cognition offers a more 
far-reaching alternative. The distributed view 
of cognition represents a shift in the study of 
cognition from being the sole property of the 
individual to being “stretched” across groups, 
material artifacts, and cultures (Hutchins 
1995; Suchman 1987). This viewpoint is 
increasingly gaining acceptance in cognitive 
science and human-computer interaction 
research. In the distributed approach to HCI 
research, cognition is viewed as a process of 
coordinating distributed internal (i.e., knowl- 
edge) and external representations (e.g., visual 
displays, manuals). Distributed cognition has 
two central points of inquiry, one that empha- 
sizes the inherently social and collaborative 
nature of cognition (e.g., doctors, nurses and 
technical support staff in a neonatal care unit 
jointly contributing to a decision process), and 
one that characterizes the mediating effects of 
technology or other artifacts on cognition. 

The mediating role of technology can be 
evaluated at several levels of analysis from the 
individual to the organization. Technologies, 
whether they be computer-based or an arti- 
fact in another medium, transform the ways 
individuals and groups think. They do not 
merely augment, enhance, or expedite performance, 
although a given technology may do 
all of these things. The difference is not merely 
one of quantitative change, but one that is 
qualitative in nature. 

In a distributed world, what becomes of 
the individual? We believe it is important to 
understand how technologies promote endur- 
ing changes in individuals. Salomon, Perkins 
and Globerson (1991) introduced an impor- 
tant distinction in considering the mediating 
role of technology on individual performance: 
the effects with technology and the effects of 
technology. The former is concerned with 
the changes in performance displayed by 
users while equipped with the technology. 
For example, when using an effective medical 
information system, physicians should be able 
to gather information more systematically 
and efficiently. In this capacity, medical infor- 
mation technologies may alleviate some of the 
cognitive load associated with a given task and 
permit physicians to focus on higher-order 
thinking skills, such as diagnostic hypothesis 
generation and evaluation. The effects 
of technology refer to enduring changes in 
general cognitive capacities (knowledge and 
skills) as a consequence of interaction with 
a technology. This effect is illustrated subse- 
quently in the context of the enduring effects 
of an EHR (see Chap. 10). 

We employed a pen-based EHR system, 
DCI (Dossier of Clinical Information), in 
several of our studies (see Kushniruk et al. 
1996). Using the pen or computer keyboard, 
physicians can directly enter information into 
the EHR, such as the patient’s chief com- 
plaint, past history, history of present illness, 
laboratory tests, and differential diagnoses. 
Physicians were encouraged to use the sys- 
tem while collecting data from patients (e.g., 
during the interview). The system allows the 
physician to record information about the 
patient’s differential diagnosis, the ordering 
of tests, and the prescription of medication. 
The graphical interface provides a highly 
structured set of resources for representing a 
clinical problem, as illustrated in Fig. 4.10. 

We have studied the use of this EHR in 
both laboratory-based research (Kushniruk 
et al. 1996) and actual clinical settings using 
cognitive methods (Patel et al. 2000). The 
laboratory research included a simulated 
doctor-patient interview. We have observed 
two distinct patterns of EHR usage in the 
interactive condition: one in which the subject 
pursues information from the patient predicated 
on a hypothesis, and a second, screen-driven 
strategy in which the EHR display guides 
the questions asked of the patient. In the 
screen-driven strategy, the clinician uses the 
structured list of findings in the order in which they 
appear on the display to elicit information 
from the patient. All experienced users of this 
system appear to have both strategies in their 
repertoire. 

In general, a screen-driven strategy can 
enhance performance by reducing the cogni- 
tive load imposed by information-gathering 
goals and allow the physician to allocate more 
cognitive resources toward testing hypotheses 
and rendering decisions. On the other hand, 
this strategy can encourage a certain sense 
of complacency. We observed both effective 
as well as counter-productive uses of this 
screen-driven strategy. A more experienced 
user consciously used the strategy to structure 
the information-gathering process, whereas 
a novice user used it less discriminately. In 
employing this screen-driven strategy, the nov- 
ice elicited almost all of the relevant findings 
in a simulated patient encounter. However, 
she also elicited numerous irrelevant findings 
and pursued incorrect hypotheses. In this par- 
ticular case, the subject became too reliant on 
the technology and had difficulty imposing 
her own set of working hypotheses to guide 
the information-gathering and diagnostic- 
reasoning processes. 

The use of a screen-driven strategy is evi- 
dence of how technology transforms clinical 
cognition, as manifested in clinicians’ patterns 
of reasoning. Patel et al. (2000) extended this 
line of research to study the cognitive conse- 
quences of using the same EHR system in a 
diabetes clinic. The study considered the following 
questions: (1) How do physicians manage 
information flow when using an EHR 
system? (2) What are the differences in the way 
physicians organize and represent this information 
using paper-based and EHR systems? 
(3) Are there long-term, enduring effects 
of the use of EHR systems on knowledge 
representations and clinical reasoning? One study 
focused on an in-depth characterization of 
changes in knowledge organization in a single 
subject as a function of using the system. The 
study first compared the contents and struc- 
ture of patient records produced by the physi- 
cian using the EHR system and paper-based 
patient records, using ten pairs of records 
matched for variables such as patient age and 
problem type. After having used the system 
for six months, the physician was asked to 
conduct his/her next five patient interviews 
using only hand-written paper records. 

The results indicated that the EHRs con- 
tained more information relevant to the diag- 
nostic hypotheses. In addition, the structure 
and content of information were found to 
correspond to the structured representa- 
tion of the particular medium. For example, 
EHRs were found to contain more informa- 
tion about the patient’s past medical history, 
reflecting the query structure of the interface. 
The paper-based records appear to better preserve 
the integrity of the time course of the 
evolution of the patient problem, whereas this 
is notably absent from the EHR. Perhaps the 
most striking finding is that, after having used 
the system for six months, the structure and 
content of the physician’s paper-based records 
bore a closer resemblance to the organization 
of information in the EHR than the paper- 
based records produced by the physician prior 
to exposure to the system. This finding is con- 
sistent with the enduring effects of technology 
even in the absence of the particular system 
(Salomon et al. 1991). The authors conclude 
that given these potentially enduring effects, 
the use of a particular EHR will almost cer- 
tainly have a direct effect on medical decision 
making. 

The previously discussed research dem- 
onstrates how information technologies can 
mediate cognition and even produce enduring 
changes in how one performs a task. What 
dimensions of an interface contribute to such 
changes? What aspects of a display are more 
likely to facilitate efficient task performance, 
and what aspects are more likely to impede 
it? Norman (1986) argued that well-designed 
artifacts could reduce the need for users to 
remember large amounts of information, 
whereas poorly designed artifacts increased 
the knowledge demands on the user and the 
burden of working memory. In the distributed 
approach to HCI research, cognition is viewed 
as a process of coordinating distributed inter- 
nal and external representations, and this, in 
effect, constitutes an indivisible information- 
processing system. 

One of the appealing features of the dis- 
tributed cognition paradigm is that it can be 
used to understand how properties of objects 
on the screen (e.g., links, buttons) can serve 
as external representations and reduce cogni- 
tive load. The distributed resource model pro- 
posed by Wright, Fields, and Harrison (2000) 
addresses the question of “what information 
is required to carry out some task and where 
should it be located: as an interface object or 
as something that is mentally represented to 
the user.” The relative difference in the distri- 
bution of representations (internal and exter- 
nal) is central to determining the efficacy of 
a system designed to support a complex task. 
Wright, Fields, and Harrison (2000) were 
among the first to develop an explicit model 
for coding the kinds of resources available in 
the environment and how they are embodied 
on an interface. 
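
The central question of where task information resides can be made concrete with a small sketch. The code below tags each piece of information needed for a task as external (visible on the interface) or internal (held in the user's memory) and lists what the user must remember. The Resource class, the example resource names, and the externalize helper are hypothetical and chosen only for illustration; this is not the coding scheme of Wright, Fields, and Harrison (2000).

```python
from dataclasses import dataclass
from enum import Enum


class Location(Enum):
    EXTERNAL = "external"  # represented as an interface object
    INTERNAL = "internal"  # must be mentally represented by the user


@dataclass(frozen=True)
class Resource:
    name: str
    location: Location


def internal_burden(resources):
    """Names of the resources the user has to hold in memory."""
    return [r.name for r in resources if r.location is Location.INTERNAL]


def externalize(resources, name):
    """Return a new list in which the named resource is shown on screen."""
    return [
        Resource(r.name, Location.EXTERNAL) if r.name == name else r
        for r in resources
    ]


# Hypothetical resources for a multi-screen order-entry task.
order_entry = [
    Resource("current step in sequence", Location.INTERNAL),
    Resource("available actions", Location.EXTERNAL),
    Resource("patient weight", Location.INTERNAL),
]

print(internal_burden(order_entry))
redesigned = externalize(order_entry, "current step in sequence")
print(internal_burden(redesigned))
```

Moving a resource from the internal to the external list corresponds to a redesign that displays information the user previously had to recall, such as a progress indicator on a multi-screen task.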

Horsky, Kaufman, and Patel (2003a, b) 
applied the distributed resource model and 
analysis to a provider order entry system. 
The goal was to analyze specific order-entry 
tasks such as those involved in admitting 
a patient to a hospital and then to identify 
areas of complexity that may impede optimal 
recorded entries. The research consisted of 
two component analyses: a cognitive walkthrough 
(CW) evaluation that was modified based 
on the distributed resource model and a simulated 
clinical ordering task performed by 
seven physicians. The CW analysis revealed 
that the configuration of resources (e.g., very 
long menus, complexly configured displays) 
placed unnecessarily heavy cognitive demands 
on users, especially those who were new to 
the system. The resources model was also 
used to account for patterns of errors pro- 
duced by clinicians. The authors concluded 
that the redistribution and reconfiguration of 
resources might yield guiding principles and 
design solutions in the development of com- 
plex interactive systems. 

The distributed cognition framework has 
proved to be particularly useful in under- 
standing the performance of teams or groups 
of individuals in a particular work setting 
(Hutchins 1995). Hazlehurst and colleagues 
(Hazlehurst et al. 2003, 2007) have drawn on 
this framework to illuminate how work in 
healthcare settings is constituted using shared 
resources and representations. The activity 
system is the primary explanatory construct. 
It is comprised of actors and tools, together 
with shared understandings among actors 
that structure interactions in a work setting. 
The “propagation of representational states 
through activity systems” is used to explain 
cognitive behavior and investigate the organi- 
zation of the system and human performance. 
Following Hazlehurst et al. (2007, p. 540), “a 
representational state is a particular configu- 
ration of an information-bearing structure, 
such as a monitor display, a verbal utterance, 
or a printed label, that plays some functional 
role in a process within the system.” The 
authors have used the concept to explain the 
process of medication ordering in an intensive 
care unit and the coordinated communica- 
tions of a surgical team in a heart room. 

The framework for distributed cognition 
is still an emerging one in human-computer 
interaction. It offers a novel and potentially 
powerful approach for illuminating the kinds 
of difficulties users encounter and find- 
ing ways to better structure the interaction 
by redistributing the resources. Distributed 
cognition analyses may also provide a win- 
dow into why technologies sometimes fail to 
reduce errors or even contribute to them. 

4.6 Conclusion 

Theories and methods from cognitive science 
can shed light on a range of issues about the 
design and implementation of health informa- 
tion technologies. They can also serve an instru- 
mental role in understanding and enhancing 
the performance of clinicians and patients as 
they engage in a range of cognitive tasks related 
to health. We believe that fundamental studies 
in psychology and in cognitive science more generally 
can provide guiding principles for studying 
these issues and can be combined with field 
studies, which serve to illuminate different facets 
and contextualize the phenomena observed 
in laboratory studies. The potential scope of 
applied cognitive research in biomedical infor- 
matics is very broad. Significant inroads have 
been made in areas such as EHRs and patient 
safety. However, there are promising areas of 
future cognitive research that remain largely 
uncharted. These include understanding how 
to capitalize on health information technology 
without compromising patient safety (particu- 
larly in providing adequate decision support), 
understanding how various visual representa- 
tions/graphical forms mediate reasoning in 
biomedical informatics and how these repre- 
sentations can be used by patients and health 
consumers with varying degrees of literacy. 
These are only a few of the cognitive challenges 
related to harnessing the potential of cutting- 
edge technologies to improve patient safety. 


Suggested Readings 

Anderson, J. R. (2015). Cognitive psychology and 
its implications. New York: Worth Publishers. 

Carayon, P., Alyousef, B., & Xie, A. (2012). 
Human factors and ergonomics in health care. 
In Handbook of human factors and ergo- 
nomics (pp. 1574-1595). 

Patel, V. L., Kaufman, D. R., & Kannampallil, 
T. G. (2013c). Diagnostic reasoning and decision 
making in the context of health information 
technology. In D. Morrow (Ed.), Reviews 
of human factors and ergonomics (Vol. 8). 
Thousand Oaks, CA: SAGE Publications. 

Patel, V. L., Kaufman, D. R., & Arocha, J. F. 
(2002). Emerging paradigms of cognition in 
medical decision-making. Journal of 
Biomedical Informatics, 35, 52-75. This rela- 
tively recent article summarizes new directions 
in decision-making research. The authors 
articulate a need for alternative paradigms for 
the study of medical decision making. 

Patel, V. L., Yoskowitz, N. A., Arocha, J. F., & 
Shortliffe, E. H. (2009). Cognitive and learning 
sciences in biomedical and health instructional 
design: A review with lessons for biomedical 
informatics education. Journal of Biomedical 
Informatics, 42(1), 176-197. A review of learn- 
ing and cognition with a particular focus on 
biomedical informatics. 

Patel, V. L., Kannampallil, T. G., & Shortliffe, 
E. H. (2015c). Role of cognition in generating 
and mitigating clinical errors. BMJ Quality & 
Safety, 24, 468-474. https://doi.org/10.1136/ 
bmjqs-2014-003482. 

Middleton, B., Bloomrosen, M., Dente, M. A., 
Hashmat, B., Koppel, R., Overhage, J. M., et al. 
(2013). Enhancing patient safety and quality of 
care by improving the usability of electronic 
health record systems: Recommendations from 
AMIA. Journal of the American Medical 
Informatics Association, 20(e1), e2-e8. 


Questions for Discussion 

1. How can cognitive science theory 
meaningfully inform and shape 
design, development, and assessment 
of health-care information systems? 

2. Describe two or three kinds of mental 
representations and briefly characterize 
their significance in understanding 
human performance. 

3. What is the purpose and value of cogni- 
tive architectures? 

4. Identify three ways in which novices 
differ from experts in medicine. 

5. What are the limitations of interpreting 
retroactive data on medical errors? 

6. Explain the difference between latent 
and active failures and their implications 
for patient safety. 

7. How does the field of Cognitive 
Informatics capture the interaction of 
cognition and informatics in biomedi- 
cine and healthcare? 

8. Explain the role inductive, deductive, 
and abductive reasoning play in medical 
diagnostic reasoning. 

9. Explain some ways in which technology- 
mediated errors compromise 
patient safety. 

10. What are some of the assumptions of 
the distributed cognition framework? 
What implications does this approach 
have for the evaluation of electronic 
health records? 

11. Explain the difference between the
effects of technology and the effects
with technology. How can each of these
effects contribute to improving patient
safety and reducing medical error?

12. The use of electronic health records 
(EHR) has been shown to differentially 
affect clinical reasoning relative to 
paper charts. Briefly characterize the 
effects they have on reasoning, 
including those that persist after the 
clinician ceases to use the system. 
Learning Objectives

After reading this chapter, you should know the answers to these questions:

• What are the major attributes of system usability?
• What are the methods that can be used to evaluate usability of a health information system?
• How does a poorly designed HIT implementation contribute to disruptions to clinical workflow?


5.1 Introduction to Human-Computer Interaction


Human-computer interaction (HCI) is a mul- 
tifaceted discipline devoted to the study and 
practice of design and usability (Carroll 2003). 
The history of computing and more generally, 
that of artifact design, are rife with stories of 
dazzlingly powerful devices with remarkable 
capabilities that are thoroughly unusable by 
anyone except for the team of designers and 
their immediate families. In the often-cited
book The Psychology of Everyday Things,
Donald Norman (1988) describes a litany of
poorly designed artifacts ranging from pro- 
grammable VCRs to answering machines and 
water faucets that are inherently non-intuitive 
and difficult to use. Similarly, there have been 
numerous innovative and promising clinical 
information technologies that have yielded 
decidedly suboptimal results and resulted in 
deep user dissatisfaction. At a minimum, dif- 
ficult interfaces result in steep learning curves 
and structural inefficiencies in task perfor- 
mance. At worst, problematic interfaces can 
have serious consequences for patient safety 
(Koppel et al. 2005; Lin et al. 1998; Zhang 
et al. 2004). 

Myers and Rosson (1992) reported that 
nearly 50% of software code was devoted to 
the user interface, and a survey of developers 
indicated that, on average, 6% of their proj- 
ect budgets were spent on usability evalua- 
tion. Given the complexities of the modern 
graphical user interfaces (GUI), it is likely 
that more than 50% of the code is now 


devoted to the GUI. On the other hand, 
usability evaluations have greatly increased 
over the last 20 years (Jaspers 2009). There 
have been numerous books and articles 
devoted to promoting effective user interface 
design (Preece et al. 2015; Shneiderman et al. 
2016), and the importance of enhancing the 
user experience has been widely acknowl- 
edged by both consumers and producers of 
information technology. Part of the impetus 
is that usability has been demonstrated to be 
highly cost effective. Karat (1994) reported 
that for every dollar a company invests in the 
usability of a product, it receives between $10 
and $100 in benefits. Although much has 
changed in the world of computing since 
Karat’s estimate (e.g., the flourishing of the 
World Wide Web and mobile apps), it is clear 
that investments in usability still yield sub- 
stantial rates of return (Nielsen 2008). It 
remains far costlier to fix a problem after 
product release than in an early design phase. 
The concept of usability as well as the meth- 
ods and tools to measure and promote it are 
now “touchstones in the culture of comput- 
ing” (Carroll 2003). 

HCI has spawned a professional orienta- 
tion that focuses on practical matters con- 
cerning the integration and evaluation of 
applications of technology to support human 
activities. There are also active academic HCI 
communities that have contributed significant 
advances to the science of computing. HCI 
researchers have been devoted to the develop- 
ment of innovative design concepts such as 
virtual reality, ubiquitous computing, multi- 
modal interfaces, collaborative workspaces, 
mobile technologies, and immersive and virtual 
environments. HCI research has been instru- 
mental in transforming the software engineer- 
ing process towards a more user-centered 
iterative system development (e.g., rapid pro- 
totyping). HCI research has also been focally 
concerned with the cognitive, social, and cul- 
tural dimensions of the computing experi- 
ence. In this regard, it is concerned with 
developing analytic frameworks for character- 
izing how technology can be used more pro- 
ductively across a range of tasks, settings, and 
user populations. 


Human-Computer Interaction, Usability, and Workflow 


In this chapter, we describe the founda- 
tions of the role of HCI in biomedical infor- 
matics with a specific focus on methods for 
usability evaluation and clinical workflow. We 
also discuss the implications of HCI and clin- 
ical workflow methods for future biomedical 
informatics research. This chapter is a companion
to the chapter on cognitive informatics in this
volume (Chap. 4).


5.2 Role of HCI in Biomedical Informatics


HCI research in healthcare emerged at a time 
when health information technology and elec- 
tronic health records (EHRs) were becoming 
more central to the practice of medicine (Patel 
et al. 2015). Much HCI work has been devoted 
to creating or enhancing design in healthcare 
systems. However, the focus of most of our 
work has been on the cognitive mediation of 
technology in healthcare practice (Patel et al. 
2015). Most of the early HCI research focused 
on the solitary user of technology. Although 
such research is still commonplace, the focus 
has extended to distributed health information 
systems (Hazlehurst et al. 2007; Horsky et al. 
2003) and analysis of unintended sociotechni- 
cal consequences with a particular focus on 
computerized provider order entry systems 
(Koppel et al. 2005). HCI studies in biomedi- 
cine extend across clinical and consumer 
health informatics, addressing a range of user 
populations including providers, biomedical 
scientists, and patients. While the implications 
of HCI principles for the design of HIT are 
acknowledged, the adoption of the tools and
techniques among clinicians, informatics
researchers, and developers of HIT is limited.
There is a consensus that HIT has not realized 
its potential as a tool that facilitates clinical 
decision-making, coordination of care, and 
improvement of patient safety (Middleton 
et al. 2013). 

The field of human-computer interaction lies
at the intersection of behavioral science and
computer and information science. It involves
the study of interaction between people and
computers. Computing systems include both
software and hardware, and devices ranging from
smartphones to glucose meters present usability
challenges. In this chapter, the focus is on
software and interface components. A major
focus of HCI is the evaluation of interactive
computer systems for human use. In the healthcare
environment, it is important to understand HCI to
ensure that users and computers interact
successfully. The goals of HCI are therefore to
deploy usable, useful, and safe systems.


5.3 Theoretical Foundations 

In recent years, there has been significant
growth in research and application regarding
HCI and healthcare systems. These efforts have
produced a collective body of experiential and
practical knowledge about user experience,
adoption, and implementation to guide future
design work. Some of the work is not specifically
guided by a theoretical foundation, yet it
has proven useful in elucidating problems and
contributing to user-centered design efforts.
Human-computer
interaction work is at least partly an empirical 
science in which local knowledge derived from 
a small body of studies will suffice in solving a 
problem. However, it is also necessary that we 
extrapolate knowledge from one context to 
another. Concentrated efforts in HCI are time- 
consuming, tend to employ small numbers of 
subjects and are conducted in a limited num- 
ber of settings. For example, it is simply not 
possible to conduct an HCI research project in 
many different hospitals or to thoroughly test 
every facet of an electronic health record system.
Knowledge based solely on practical
experience or empirical studies is not adequate
to account for the immense variety of
health information technologies and the rich 
array of contexts that constitute the practice 
of medicine (Kaufman et al. 2015). 

There are many facets to technology use
and a range of theories that address them. For
example, the technology acceptance model,
which focuses on users’ perceived usefulness and
usage intentions, has been widely used in
healthcare research (Venkatesh 2000).
Sociotechnical systems theory is very broad in
scope. It views all organizations as having the
following elements that comprise their
organizational design: technological (including the
actual IT system, usability, and unintended
consequences), social (doctors, staff, patients,
etc.), and external environment (e.g., political,
economic, cultural, and legal influences)
(Hendrick and Kleiner 1999). These subsystems
are intricately connected, such that
changes to any one affect the others, sometimes
in unanticipated or dysfunctional ways (Aarts
et al. 2007; Ash et al. 2004). One of the most
influential theories in clinical informatics was
offered by Sittig and Singh (2010). They
proposed an 8-dimensional model of interrelated
concepts that can be used to explain performance
in complex adaptive systems in the
healthcare arena. The model has been applied
in a range of settings to understand
and improve HIT applications at various
stages of development and implementation.
Cognitive engineering (CE) is an interdisci- 
plinary approach to the development of prin- 
ciples, methods, and tools to assess and guide 
the design of systems to support human perfor- 
mance (Hettinger et al. 2017). The approach is 
rooted in both cognitive science and engineer- 
ing and has been used to support design of dis- 
plays, decision support and training in 
numerous high-risk domains (Kushniruk et al. 
2004). A computational theory of mind pro- 
vides the fundamental underpinning for most 
contemporary cognitive theories. The basic 
premise is that much of human cognition can 
be characterized as a series of operations, com- 
putations on mental representations. At a 
higher level of cognitive analysis, CE also
focuses on the discrepancy between users’ goals
and the physical controls embodied in a system
(Norman 1986). Interface design choices
differentially mediate task performance, and various
methods of analysis, including those described
below, endeavor to measure this impact.
Distributed cognition (DCog) represents a 
shift in the study of cognition from an exclusive 
focus on the mind of the individual to being 
“stretched” across groups, material artifacts 
and cultures (Hutchins 1995). This paradigm 
has gained substantial currency in HCI 
research. In the distributed approach, cogni- 
tion is viewed as a process of coordinating dis- 
tributed internal (i.e., what’s in the mind) and 


external representations (e.g., visual displays, 
post-it notes). DCog has two lines of analysis, 
one that emphasizes the social and collabora- 
tive nature of cognition (e.g., surgeons, nurses 
and respiratory therapists in cardiothoracic 
surgical setting jointly contributing to a deci- 
sion process), and one that characterizes the 
mediating effects of technology (e.g., EHRs, 
paper charts, mobile devices, apps) or other 
artifacts on cognition. DCog constitutes a fam- 
ily of interrelated theories rather than a single 
approach (Cohen et al. 2006). The approaches 
collectively offer a penetrating view of the 
complexities embodied in human-computer 
interaction. However, there is no “off-the- 
shelf” methodology for using it in research or 
as a practitioner (Furniss et al. 2015). The
application of DCog theory and methods is
complicated by the fact that there is no set of
features to attend to and no checklist or
prescribed method to follow (Rogers 2012). In
addition, the analysis and abstraction require
a high level of skill and training. More in-depth
reviews of DCog can be found in Rogers (2004)
and, as applied to healthcare, in Hazlehurst
et al. (2008) and Kaufman et al. (2015). DCog
approaches have been particularly useful in the 
analysis of teamwork and EHR-mediated 
workflow in complex environments (Blandford 
and Furniss 2006; Hazlehurst et al. 2007; 
Kaufman et al. 2009). It is not unusual for HCI 
researchers to engage multiple theories depend- 
ing on the area of focus. 


5.4 Usability of Health Information Technology¹

Theories of cognitive science meaningfully
inform and shape design, development and
assessment of health-care information systems
by providing insight into principles of
system usability and learnability, as well as the
design of a safer workplace.

¹ Parts of this section have been adapted, with
permission, from Kannampallil, T. G., & Abraham, J.
(2015). Evaluation of health information technology:
Methods, frameworks and challenges. In V. L.
Patel, T. G. Kannampallil, & D. Kaufman (Eds.),
Cognitive informatics in health and biomedicine:
Human computer interaction. London: Springer.

Fig. 5.1 Classification of evaluation methods

Usability methods, most often drawn from 
cognitive science, have been used to evaluate a 
wide range of medical information technolo- 
gies including infusion pumps (Karat 1994), 
ventilator management systems, physician 
order entry (Ash et al. 2003; Horsky et al. 
2003; Koppel et al. 2005), pulmonary graph 
displays (Wachter et al. 2003), information 
retrieval systems, and research web environ- 
ments for clinicians (Elkin et al. 2002). In 
addition, usability techniques are increasingly 
used to assess patient-centered environments 
(Chan and Kaufman 2011; Cimino et al. 2000; 
Kaufman et al. 2003a, b). The methods 
include observations, focus groups, surveys 
and experiments. Collectively, these studies 
make a compelling case for the instrumental 
value of such research to improve efficiency, 
user acceptance and relatively seamless inte- 
gration with current workflow and practices. 

What do we mean by usability? Nielsen 
suggests that usability includes the following 
five attributes: (1) learnability: system should 
be relatively easy to learn, (2) efficiency: an 
experienced user can attain a high level of 
productivity, (3) memorability: features sup- 
ported by the system should be easy to retain 
once learned, (4) errors: system should be 
designed to minimize errors and support error 
detection and recovery, and (5) satisfaction: 
the user experience should be subjectively sat- 
isfying. 




The question then becomes how we evalu- 
ate and study the various attributes of usabil- 
ity. We classified usability evaluation methods 
into two categories: analytic evaluation 
approaches and usability testing. Analytic 
evaluation studies use experts as partici- 
pants—usability experts, domain experts, 
software designers—or in some cases, are con- 
ducted without participants using task-ana- 
lytic, inspection-based or model-based 
approaches and are conducted in laboratory- 
based settings. 

We categorized usability testing into field- 
based studies that capture situated and con- 
textual aspects of HIT use, and a general 
category of methods (e.g., interviews, focus 
groups, surveys) that solicit user opinions and 
can be administered in different modes (e.g., 
face-to-face or online). A brief categorization 
of the evaluation approaches can be found in 
Fig. 5.1. In the following sections, we provide
a detailed description of each of the evaluation
approaches along with research
examples of their use.


5.4.1 Analytical Approaches 


Analytical approaches rely on analysts’ judg- 
ments and analytic techniques to perform 
evaluations on user interfaces, and often do 
not directly involve the participation of end 
users. These approaches employ experts in
general usability, human factors, or software
to conduct the studies. In general,
analytical evaluation techniques involve
task-analytic approaches, inspection-based methods,
and predictive model-based methods (e.g.,
keystroke models, Fitts’s Law).
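
To give a sense of what a predictive model-based method looks like, the sketch below computes a pointing-time prediction using the Shannon formulation of Fitts's Law, MT = a + b log2(D/W + 1), where D is the distance to the target and W is its width. The function name and the constants a and b are illustrative assumptions only; in practice the constants are fitted empirically for a given device and user population.

```python
import math

def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Predicted pointing time in seconds (Shannon formulation of Fitts's Law).

    a and b are illustrative placeholders, not measured constants.
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    index_of_difficulty = math.log2(distance / width + 1)  # in bits
    return a + b * index_of_difficulty

# A small, distant button is predicted to take longer to hit
# than a large, nearby one.
small_far = fitts_movement_time(distance=800, width=20)
large_near = fitts_movement_time(distance=200, width=100)
```

An evaluator could use a prediction of this kind to compare alternative screen layouts before any user testing takes place, which is the sense in which these methods are "predictive."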


5.4.1.1 Task Analysis²


Task analysis is one of the most commonly used
techniques to evaluate “existing practices” in
order to understand the rationale behind
people’s goals in performing a task, the
motivations behind their goals, and how they perform
these tasks (Preece et al. 1994). As described
by Vicente (1999), task analysis is an evaluation
of the “trajectories of behavior.” There
are several variants of task analysis, with
hierarchical task analysis (HTA) and cognitive task
analysis (CTA) being the most commonly
used in biomedical informatics research.

HTA is the simplest task analytic approach
and involves the breaking down of a task into
sub-tasks and smaller constituent parts (e.g.,
sub-sub-tasks). The tasks are organized
according to specific goals. This method,
originally designed to identify specific training
needs, has been used extensively in the design
and evaluation of interactive interfaces
(Annett and Duncan 1967). The application
of HTA can be explained with an example:
consider the goal of printing a Microsoft
Word document that is on your desktop. The
sub-tasks for this goal would involve finding
(or identifying) the document on your desktop,
and then printing it by selecting the
appropriate printer. The HTA for this task
can be organized as follows:
0. Print document on the desktop
1. Go to the desktop
2. Find the document
   2.1. Use “Search” function
   2.2. Enter the name of the document
   2.3. Identify the document
3. Open the document
4. Select the “File” menu and then “Print”
   4.1. Select relevant printer
   4.2. Click “Print” button

Plan 0: do 1-3-4; if file cannot be located by a visual search, do 2-3-4
Plan 2: do 2.1-2.2-2.3

² While GOMS (see Sect. 5.4.1.3) is considered
a task-analytic approach, we have categorized it
as a model-based approach for predictions of task
completion times. It is based on a task analytic
decomposition of tasks.


In this task analysis, the task can be decom- 
posed as follows: moving to your desktop, 
searching for the document (either visually or 
by using the search function and typing in the 
search criteria), selecting the document, open- 
ing and printing it using the appropriate 
printer. The order in which these tasks are 
performed may change based on specific situ- 
ations. For example, if the document is not 
immediately visible on the desktop (or if the 
desktop has several documents making it 
impossible to identify the document visually), 
then a search function is necessary. Similarly, 
if there are multiple printer choices, then a
relevant printer must be selected. The plans
include a set of tasks that a user must undertake
to achieve the goal (i.e., print the document).
In this case, there are two plans: plan 0
and plan 2 (all plans are conditional on tasks
having pertinent sub-tasks associated with them).
For example, if the user cannot find a document
on the desktop, plan 2 is instantiated,
where a search function is used to identify the
document (steps 2.1, 2.2 and 2.3). Figure 5.2
depicts the visual form of the HTA for this
particular example.
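
The decomposition and its plans can also be written down as a small data structure, which is one way an analyst might record an HTA for later comparison across systems. The following is a minimal sketch under stated assumptions: the dictionary layout and the function names are hypothetical, and the ordering of sub-tasks 4.1 and 4.2 is assumed to be sequential because the example does not give an explicit plan for task 4.

```python
# Hierarchical task analysis for "print document on the desktop",
# stored as: task number -> (description, ordered sub-task numbers).
HTA = {
    "0": ("Print document on the desktop", ["1", "2", "3", "4"]),
    "1": ("Go to the desktop", []),
    "2": ("Find the document", ["2.1", "2.2", "2.3"]),
    "2.1": ('Use "Search" function', []),
    "2.2": ("Enter the name of the document", []),
    "2.3": ("Identify the document", []),
    "3": ("Open the document", []),
    "4": ('Select the "File" menu and then "Print"', ["4.1", "4.2"]),
    "4.1": ("Select relevant printer", []),  # order of 4.1-4.2 is assumed
    "4.2": ('Click "Print" button', []),
}

def expand(task_id):
    """Return the leaf-level steps for a task, expanding sub-tasks in order."""
    _, subtasks = HTA[task_id]
    if not subtasks:
        return [task_id]
    steps = []
    for sub in subtasks:
        steps.extend(expand(sub))
    return steps

def plan_0(document_visible):
    """Plan 0: do 1-3-4; if the file cannot be located visually, do 2-3-4."""
    top_level = ["1", "3", "4"] if document_visible else ["2", "3", "4"]
    steps = []
    for task_id in top_level:
        steps.extend(expand(task_id))
    return steps

visible_steps = plan_0(document_visible=True)
search_steps = plan_0(document_visible=False)
```

Here plan 2 (do 2.1-2.2-2.3) is captured by the order of the sub-tasks listed under task 2, so it is applied whenever task 2 is expanded.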

HTA has been used in evaluating inter- 
faces and medical devices. For example, 
Chung et al. (2003) used HTA to compare the 
differences between six infusion pumps. Using 
HTA, they identified potential sources for the 
generation of human errors during various 
tasks. While exploratory, their use of HTA 
provided insights into how the HTA can be 
used for evaluating human performance and 
for predicting potential sources of errors. 
Alternatively, HTA has been used to model 
information and clinical workflow in ambula- 
tory clinics (Unertl et al. 2009). Unertl et al. 
(2009) used direct observations and semi-structured
interviews to create an HTA of the
workflows. The HTA was then used to iden- 
tify the gaps in existing HIT functionality for 
supporting clinical workflows, and the needs 
of chronic disease care providers. 
Fig. 5.2 Graphical representation of task analysis of printing a document: the tasks are represented in the
boxes; the line underneath certain boxes represents the fact that there are no sub-tasks for these tasks


CTA is an extension of the general task 
analysis technique to develop a comprehensive 
understanding regarding the knowledge, cog- 
nitive/thought processes and goals that under- 
lie observable task activities (Chipman et al. 
2000). Although the focus is on knowledge 
and cognitive components of the task activi- 
ties and performance, CTA relies on observ- 
able human activities to draw insights on the 
knowledge-based constraints and challenges 
that impair effective task performance. 

CTA techniques are broadly classified into
three groups: (a) interviews and observations,
(b) process tracing and (c) conceptual
techniques (Cooke 1994). CTA using interviews
and observations involves developing a
comprehensive understanding of tasks
through discussions with, and task observations
of, experts. For example, a researcher
observes an expert physician performing the
task of medication order entry into a CPOE
(Computerized Physician Order Entry) system
and asks follow-up questions regarding
specific aspects of the task. In a study on
understanding providers’ management of
abnormal test results, Hysong et al. (2010)
conducted CTA-based interviews with 28 primary
care physicians on how and when they
manage alerts, and how they use the various
features of the EHR system to filter and sort
their alerts.

CTA supported by process-tracing 
approaches relies on capturing task activities 


through direct (e.g., verbal think aloud) or 
indirect (e.g., unobtrusive screen recording) 
data capture methods. Whereas the process- 
tracing approach is generally used to capture 
expert behaviors, it has also been used to eval- 
uate general users. In a study on experts’ 
information seeking behavior in critical care, 
Kannampallil et al. (2013a) used the process- 
tracing approach to identify the nature of 
these activities including the information 
sources, cognitive strategies, and shortcuts 
used by critical care physicians in decision- 
making tasks. The CTA approach relied on 
the verbalizations of physicians, their access 
to various sources, and the time spent on 
accessing these sources to identify the strate- 
gies of information seeking. 

Finally, CTA supported by conceptual
techniques relies on the development of
representations of a domain (and their related
concepts) and the potential relationships between
them. This approach is often used with experts 
and different methods are used for knowledge 
elicitation including concept elicitation, struc- 
tured interviews, ranking approaches, card 
sorting, structural approaches such as multi- 
dimensional scaling, and graphical associa- 
tions (Cooke 1994). 


5.4.1.2 Inspection-Based Evaluation 

Inspection methods involve experts appraising
a system, playing the role of a user to identify
potential usability and interaction issues.
Inspection methods are often conducted
on fully developed systems or interfaces
but may also be used for prototypes.
Inspection methods rely on a usability expert, 
i.e., a person with significant training and 
experience in evaluating interfaces, to go 
through a system and identify whether the 
user interface elements conform to a pre- 
determined set of usability guidelines and 
design requirements (or principles). The most 
commonly used inspection methods are heu- 
ristic evaluations (HE) and walkthroughs. 

HE techniques utilize a small set of experts 
to evaluate a user interface (or a set of inter- 
faces in a system) based on their understand- 
ing of a set of heuristic principles regarding 
interface design (Johnson et al. 2005). This 
technique was developed by Jakob Nielsen 
and colleagues (Nielsen and Molich 1990), 
and has been used extensively in the evalua- 
tion of user interfaces. The original set of 
heuristics was developed by Nielsen based on 
an abstraction of 249 usability problems. In 
general, the following ten heuristic principles 
(or a subset of these) are most often consid- 
ered for HE studies: system status visibility; 
match between system and real world; user 
control and freedom; consistency and stan- 
dards; error prevention; recognition rather 
than recall; flexibility and efficiency of use; 
aesthetic and minimalist design; help users 
recognize, diagnose and recover from errors; 
and help and documentation (retrieved from:
http://www.nngroup.com/articles/ten-usability-heuristics/).
Conducting an HE
involves a usability expert going through an 
interface to identify potential violations to a 
set of usability principles (referred to as “heu- 
ristics”). These perceived violations could 
involve a variety of interface elements such as 
windows, menu items, links, navigation, and 
interaction. 

Evaluators typically select a relevant sub- 
set of heuristics for evaluation (or add more 
based on the specific needs and context). The 
selection of heuristics is based on the type of 
system and interface being evaluated. For 
example, the relevant heuristics for evaluating 
an EHR interface would be different from 
that of an app on a mobile device. After select- 
ing a set of applicable heuristics, one or more 


usability experts evaluate the user interface 
against the identified heuristics. After the
evaluation, the potential violations are
rated according to a severity score (1-5, where
1 indicates a cosmetic problem and 5 indicates
a catastrophic problem). This process is
iterative and continues until the expert feels that a
majority (if not all) of the violations are
identified. It is also generally recommended that a
set of 4-5 usability experts be used to
identify 95% of the perceived violations or
problems with a user interface. However, it is
not uncommon to employ fewer experts (e.g.,
3). It should be acknowledged that the HE
approach may not lead to the identification of
all problems and the identified problems may
be localized (i.e., specific to a particular
interface in a system). An example of an HE
form is shown in Fig. 5.3.
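
When several experts rate the same interface, their ratings have to be combined before violations can be prioritized. The sketch below shows one simple way this could be done, assuming the 1-5 severity scale described above. The evaluator labels, problem descriptions, and ratings are invented for illustration, and averaging is only one of several reasonable ways to combine ratings.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings: (evaluator, heuristic, problem, severity on a 1-5 scale).
ratings = [
    ("E1", "visibility of system status", "no title on data entry screen", 2),
    ("E2", "visibility of system status", "no title on data entry screen", 3),
    ("E1", "error prevention", "dose field accepts negative values", 5),
    ("E3", "error prevention", "dose field accepts negative values", 4),
    ("E2", "consistency and standards", "two icons for the same action", 1),
]

def summarize(ratings):
    """Group ratings by (heuristic, problem) and sort by mean severity.

    Returns rows of (heuristic, problem, mean severity, number of raters),
    most severe first.
    """
    grouped = defaultdict(list)
    for evaluator, heuristic, problem, severity in ratings:
        if not 1 <= severity <= 5:
            raise ValueError(f"severity out of range: {severity}")
        grouped[(heuristic, problem)].append(severity)
    summary = [
        (heuristic, problem, mean(scores), len(scores))
        for (heuristic, problem), scores in grouped.items()
    ]
    return sorted(summary, key=lambda row: row[2], reverse=True)

report = summarize(ratings)
```

Keeping the number of raters alongside the mean makes it visible when a high-severity problem was noticed by only one expert, which may merit a second look rather than immediate redesign.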

In the healthcare domain, HE has been 
used in the evaluation of medical devices and 
HIT interfaces. For example, Zhang et al. 
(2003) used a modified set of 14 heuristics to 
compare the patient safety characteristics of 
two 1-channel volumetric infusion pumps.
Four independent usability experts evaluated 
both infusion pumps using the list of heuris- 
tics and identified 89 usability problems cate- 
gorized as 192 heuristic violations for pump 1, 
and 52 usability problems categorized as 121 
heuristic violations for pump 2. The heuristic 
violations were also classified based on their 
severity. In another study, Allen et al. (2006) 
developed a simplified list of heuristics to 
evaluate web-based healthcare interfaces 
(printouts of each interface). Multiple usabil- 
ity experts assigned severity ratings for each 
of the identified violations and the severity 
ratings were used to re-design the interface. 

Walkthroughs are another inspection-based
approach that relies on experts to evaluate
the cognitive processes of users performing
a task. It involves employing a set of potential
stakeholders (designers, usability experts) to
characterize a sequence of actions and goals
for completing a task. The most commonly used
walkthrough, referred to as cognitive walkthrough
(CW), involves observing, recording
and analyzing the actions and behaviors of
users as they complete a scenario of use. CW
is focused on identifying the usability and
comprehensibility of a system (Polson et al.
1992). The aim of CW is to investigate and
determine whether the user’s knowledge and
skills and the interface cues are sufficient to
produce an appropriate goal-action sequence
that is required to perform a given task
(Kaufman et al. 2003a, b). CW is derived from
the cognitive theory of how users work on
computer-based tasks, using the exploratory
learning approach, where system users
continually appraise their goals and evaluate their
progress against these goals (Kahn and Prail
1994).

Fig. 5.3 Example of an HE form (for visibility)

While performing CW, the focus is on sim- 
ulating the human-system interaction, and 
evaluating the fit between the system features 
and the user’s goals. Conducting CW studies 
involves multiple steps. Potential participants 
(e.g., users, designers, usability experts) are 
provided a set of task sequences or scenarios
for working with an interface or system. For
example, for an interface for entering demo- 
graphic and patient history details, partici- 
pants (e.g., physicians) are asked to enter the 
age, gender, race and clinical history informa- 
tion. As the participants perform their 
assigned task, their task sequences, errors and 
other behavioral aspects are recorded. Often, 
follow up interviews or think aloud (described 
in a later section) are used to identify partici- 
pants’ interpretation of the tasks, how they 
make progress, and potential points of mis- 
matches in the system. Detailed observations 
and recordings of these mismatches are docu- 
mented for further analysis. While in most 
situations CWs are performed by individuals, 
sometimes groups of stakeholders perform 
the walkthrough together. For example, 
usability experts, designers and potential users 
could go through systems together to identify 
the potential issues and drawbacks. Such
group walkthroughs are often referred to as 
pluralistic walkthroughs. 

In the biomedical informatics domain, CW has
also been used extensively to evaluate
situations other than human-computer
interaction. For example, the CW method (and
its variants) has been used to evaluate
diagnostic reasoning, decision-making
processes and clinical activities. In one such
study, Kushniruk et al. (1996) used the CW
method to perform an early evaluation of the 
mediating role of HIT in clinical practice. The 
CW was not only used to identify usability 
problems but was instrumental in the develop- 
ment of a coding scheme for subsequent 
usability testing. Hewing et al. (2013) used 
CW to evaluate an expert ophthalmologist’s 
reasoning regarding retinal disease in infants. 
Using images, clinical experts were indepen- 
dently asked to rate the presence and severity 
of retinal disease and provide an explanation 
of how they arrived at their diagnostic deci- 
sions. Similar approaches were used by 
Kaufman et al. (2003a, b) to evaluate the 
usability of a home-based, telehealth system. 


5.4.1.3 Model-Based Evaluation 


Model-based evaluation approaches use pre- 
dictive modeling approaches to characterize 
the efficiency of user interfaces. Model-based 
approaches are often used for evaluating rou- 
tine, expert task performance. For example, 
how can the keys of a medical device interface 
be optimally organized such that the users can 
complete their tasks quickly and accurately? 
Similarly, predictive modeling can be used to 
compare the data entry efficiency between 
interfaces with different layouts and organiza- 
tion. We describe two commonly used predic- 
tive modeling techniques in the evaluation of 
interfaces. 

Card et al. (1980) proposed the GOMS 
(Goals, Operators, Methods and Selection 
Rules) analytical framework for predicting 
human performance with interactive systems. 
Specifically, GOMS models predict the time 
taken to complete a task by a skilled/expert 
user based on “the composite of actions of 
retrieving plans from long-term memory, 


choosing among alternative available meth- 
ods depending on features of the task at hand, 
keeping track of what has been done and what 
needs to be done, and executing the motor 
movements necessary for the keyboard and 
mouse” (Olson and Olson 2003). In other 
words, GOMS assumes that the execution of 
tasks can be represented as a serial sequence 
of cognitive operations and motor actions. 

GOMS is used to describe an aggregate of
the task and the user’s knowledge regarding
how to perform the task. This is expressed in
terms of Goals, Operators, Methods and
Selection rules. Goals are the expected
outcomes that a user wants to achieve. For
example, a goal for a physician could be
documenting the details of a patient
interaction on an EHR interface. Operators are
the specific actions that can be performed on
the user interface, for example, clicking on a
text box or selecting a patient from a list in
a dropdown menu. Methods are sequential
combinations of operators and sub-goals that
need to be achieved. For example, to select a
patient from a dropdown list, the user has to
move the mouse over to the dropdown menu and
click on the arrow using the appropriate mouse
key to retrieve the list of patients. Finally,
selection rules are used to ascertain which
methods to choose when several choices are
available, for example, using the arrow keys on
the keyboard to scroll down a list versus
using the mouse to select.

One of the simplest and most commonly 
used GOMS approaches is the Keystroke- 
Level Model (KLM), which was first described 
in Card et al. (1983). As opposed to the gen- 
eral GOMS model, the KLM makes several 
assumptions regarding the task. In KLM, 
methods are limited to keystroke level opera- 
tions and task duration is predicted based on 
these estimates. For the KLM, there are six 
types of operators: K for pressing a key; P for 
pointing the mouse to a target; H for moving 
hands to the keyboard or pointing device; D 
for drawing a line segment; M for mental 
preparation for an action; and R for system 
response. Based on experimental data or other
predictive models (e.g., Fitts’ law), each of
these operators is assigned a value or a
parameterized estimate of execution time. We
describe an example from Saitwal et al. (2010)
on the use of the KLM approach.

In a study investigating the usability of 
EHR interfaces, Saitwal et al. (2010) used the 
KLM approach to evaluate the time taken
and the number of steps required to complete
a set of 14 EHR-based tasks. The purpose of 
the study was to characterize the issues with 
the user interface and also to identify poten- 
tial areas for improvement. The evaluation 
was performed on the AHLTA (Armed Forces 
Health Longitudinal Technology Application) 
user interface. A set of 14 prototypical tasks 
was first identified. Sample tasks included 
entering the patient’s current illness, history 
of present illness, social history and family 
history. KLM analysis was performed on each 
of the tasks: this involved breaking each of 
the tasks into its component goals, operators, 
methods and selection rules. The operators 
were also categorized as physical (e.g., move 
the mouse to a button) or mental (e.g., locate 
an item from a dropdown menu). For example,
the selection of a patient name involved
eight steps (M: mental operation; P: physical
operation): (1) think of location on the
menu [M, 1.2 s], (2) move hand to the mouse
[P, 0.4 s], (3) move the mouse to “Go” in the
menu [P, 0.4 s], (4) extend the mouse to
“Patient” [P, 0.4 s], (5) retrieve the name of
the patient [M, 1.2 s], (6) locate patient name
on the list [M, 1.2 s], (7) move mouse to the
identified patient [P, 0.4 s] and (8) click on
the identified patient [P, 0.4 s]. In this case,
there were a total of 8 steps that would take
5.2 s to complete. Similarly, the number of
steps and the
time taken for each of the 14 considered 
AHLTA tasks were computed. 
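
The KLM arithmetic for the patient-selection example can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used by Saitwal et al.: the function name `klm_estimate` and the data layout are hypothetical, and the operator times (1.2 s for a mental operation, 0.4 s for a physical operation) are simply the values listed for the eight steps. Note that adding the listed per-step values gives 5.6 s, so the total reported in the text may reflect rounding or slightly different operator values in the original study.

```python
# Operator times in seconds, taken from the eight-step example in the text
# (M = mental operation, P = physical operation).
OPERATOR_TIME_S = {"M": 1.2, "P": 0.4}

# (step description, operator type) for selecting a patient name.
STEPS = [
    ("think of location on the menu", "M"),
    ("move hand to the mouse", "P"),
    ('move the mouse to "Go" in the menu', "P"),
    ('extend the mouse to "Patient"', "P"),
    ("retrieve the name of the patient", "M"),
    ("locate patient name on the list", "M"),
    ("move mouse to the identified patient", "P"),
    ("click on the identified patient", "P"),
]


def klm_estimate(steps, operator_time_s=OPERATOR_TIME_S):
    """Return (number of steps, predicted time in seconds).

    KLM treats the task as a serial sequence of operators, so the
    prediction is the sum of the individual operator times.
    """
    total = sum(operator_time_s[op] for _, op in steps)
    return len(steps), round(total, 2)


n_steps, total_s = klm_estimate(STEPS)
print(f"{n_steps} steps, about {total_s} s")
```

The same function could be applied to each task in a set to compare step counts and predicted times across interfaces, which is how KLM estimates are typically used.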

GOMS and its family of methods can be
productively used to compare the efficiency of
performing tasks across interfaces. Although
GOMS provides a flexible and often reliable
mechanism for predicting human performance in
a variety of computer-based tasks, such
approaches are approximations and have several
potential limitations. A brief summary is
provided here, and interested readers can find
further details in Card et al. (1980). GOMS
models
can be applied only to the error-free, routine 
tasks of skilled users. Hence, it is not possible 
to make time predictions for non-skilled users, 
who are likely to take considerable time to 
learn to use a new system. For example, the 
use of the GOMS approach to predict the 
potential time spent by physicians in using a 
new EHR would be inaccurate owing to rela- 
tive lack of knowledge of the physicians 
regarding the use of the various interfaces, 
and the learning curve required to be up-to- 
speed with the new system. The complexity of 
clinical work processes and tasks, and the 
variability of the user population create sig- 
nificant challenges for the effective use of 
GOMS in measuring the effectiveness of clini- 
cal tasks. 

Fitts’ law is used to predict human motor
behavior, specifically the time taken to
acquire a target (Fitts 1954). On computer-based
interfaces, it has been used to develop a
predictive model of the time it takes to
acquire a target using a mouse (or another
pointing device). The time taken to acquire a
target depends on the distance between the
pointer and target (referred to as amplitude,
A) and the width of the target (W). The
movement time (MT) is mathematically
represented as follows:

$$MT = k \cdot \log_2\left(\frac{A}{W} + 1\right)$$

where k is a constant, A is the amplitude and
W is the width of the target.
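
As a rough illustration of how target size affects predicted movement time, the following Python sketch evaluates Fitts' law assuming the form MT = k * log2(A/W + 1). The function name `fitts_movement_time`, the value of the constant `K`, and the pixel distances are hypothetical choices made for this example; in practice k is estimated empirically for a given device and user population.

```python
import math


def fitts_movement_time(amplitude, width, k):
    """Predicted movement time, assuming MT = k * log2(A / W + 1).

    amplitude: distance from the pointer to the target (A)
    width:     width of the target along the movement axis (W)
    k:         empirically determined constant (time per bit)
    """
    if amplitude < 0 or width <= 0 or k <= 0:
        raise ValueError("amplitude must be >= 0; width and k must be > 0")
    return k * math.log2(amplitude / width + 1)


K = 0.1  # hypothetical constant, in seconds per bit

# Same pointer distance (300 px), two candidate button widths.
mt_small_button = fitts_movement_time(amplitude=300, width=20, k=K)
mt_large_button = fitts_movement_time(amplitude=300, width=60, k=K)

print(f"20 px button: {mt_small_button:.3f} s")
print(f"60 px button: {mt_large_button:.3f} s")
```

Under these assumed values the wider button has the shorter predicted acquisition time, which is the qualitative point that matters for interface layout.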

In summary, based on Fitts’ law, one can
say that larger objects are easier to acquire,
while smaller, closely aligned objects are much
more difficult to acquire with a pointing
device. While the direct application of Fitts’
law is not often found in the evaluation
studies of HIT or health interfaces in general,
it has a profound influence on the design of
interfaces. For example, the placement of
menu items and buttons, such that a user can
easily click on them for selection, is based on
Fitts’ law parameters. Similarly, in the design
of number keypads for medical devices, the
size of the buttons and their location can be
effectively determined using Fitts’ law
parameters.

In addition to the above-mentioned
predictive models, there are several other less
common models. While a detailed description 
of each of them or their use is beyond the 
scope of this chapter, we provide a brief intro- 
duction to another predictive approach: Hick- 
Hyman choice reaction time (Hick 1951; 
Hyman 1953). Choice reaction time, RT, can 
be predicted based on the number of available 
stimuli (or choices), n: 


$$RT = a + b \cdot \log_2(n)$$


where a and b are constants. 

The Hick-Hyman law is particularly useful in
predicting text entry rates for different
keyboards (MacKenzie et al. 1999), and the time
required to select from different menus (e.g., a
linear vs. a hierarchical menu). In particular,
the method is useful for making decisions
regarding the design and evaluation of menus.
For example, consider two menu design choices:
9 items deep/3 items wide and 3 items deep/9
items wide. Treating each level as one choice
among n items, the total RT for the two designs
compares as follows:
$3 \times (a + b \cdot \log_2 9) < 9 \times (a + b \cdot \log_2 3)$.
This shows that access to menus is more
efficient when they are designed breadth-wise
rather than depth-wise.
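
The menu comparison can be checked numerically with a short Python sketch. It assumes that traversing a menu means making one Hick-Hyman choice per level, so the total time is depth times (a + b * log2(breadth)). The function names and the values chosen for the constants `A` and `B` are hypothetical; the direction of the result does not depend on the particular positive constants used.

```python
import math


def choice_reaction_time(n, a, b):
    """Hick-Hyman choice reaction time for n alternatives: a + b * log2(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return a + b * math.log2(n)


def menu_access_time(depth, breadth, a, b):
    """Total time to reach an item: one choice among `breadth` items per level."""
    return depth * choice_reaction_time(breadth, a, b)


A, B = 0.2, 0.15  # hypothetical constants, in seconds

deep_menu = menu_access_time(depth=9, breadth=3, a=A, b=B)   # 9 deep / 3 wide
broad_menu = menu_access_time(depth=3, breadth=9, a=A, b=B)  # 3 deep / 9 wide

print(f"9 deep / 3 wide: {deep_menu:.2f} s")
print(f"3 deep / 9 wide: {broad_menu:.2f} s")
```

Because log2(9) is exactly twice log2(3), the broad menu pays twice the per-level logarithmic cost but only one third as many levels, which is why it comes out ahead under this model.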


5.5 Usability Testing
and User-Based Evaluation


In this section, we have grouped a range of
approaches that are generally used for
evaluating the usability of HIT systems with
users. Formal usability testing is often
conducted in laboratory settings, where user
performance (and other selected variables) is
evaluated based on pre-selected tasks. We have
loosely classified the user-based evaluation
techniques into general approaches (those that
can be used in both field and laboratory-based
studies) and field/observational studies.


5.5.1 Interviews and Focus Groups 


Interviews and focus groups are commonly 
used to elicit information about opinions and 
perspectives of participants and their work 
practices (Mason 2002). Interviews are viewed 
as an approach to elicit additional informa- 
tion and are often used in concert with other 
field study methods (e.g., observation or 
shadowing). 

Individual interviews can be classified into 
three major categories based on the format 
and level of standardization of the interview 
questions — structured, semi-structured and 
narrative (or unstructured). During structured 
interviews, all interviewees are asked the same 
questions in the same order. This allows for 
comparisons between responses across inter- 
viewees, which can be analyzed using qualita- 
tive and quantitative methods. Semi-structured 
interviews are flexible and allow for probing of 
participants (i.e., with follow up questions) to 
discuss relevant issues (Denton et al. 2018). 

A focus group is a type of interactive
interviewing method that involves an in-depth
discussion of a particular topic of interest
with a small group of participants. The focus
group method has been described as “a carefully
planned discussion designed to obtain
perceptions on a defined area of interest in a
permissive, non-threatening environment”
(Krueger 2009). The central elements of focus
groups as highlighted by Vaughn et al. (1996)
include: (a) the group is an informal assembly
of target participants to discuss a topic; (b)
the group is small, between 6 and 12 members,
and is relatively homogeneous; (c) the group
conversation is facilitated by a trained
moderator with
prepared questions and probes; and (d) given 
that the primary goal of a focus group is to 
elicit the perceptions, feelings, attitudes, and 
ideas of participants about a selected topic, it 
can be used to generate hypotheses for further 
research (Krueger 2009). 

Unlike individual interviews, focus group 
discussions allow the researcher to probe 
responses to a particular research topic while 
capturing the underlying group dynamics of 
the participants. According to Kitzinger 
(1995), interaction is the crucial feature of
focus groups because it treats the group as a
single unit and captures participants’ view of
the world, the language they use about an
issue and their values and beliefs about a
situation (Gibbs 1997). For instance, a focus
group involving usability experts, system
designers and care providers can allow
participants to share their varying
perspectives on system design based on their
work role. This will enable them to voice the
key issues on the fit (or lack thereof)
between the functionalities of the system and
the clinical workflow.

Another important factor that plays a vital
role in focus group sessions is the presence of 
a skilled moderator (or facilitator) (Burrows 
and Kendall 1997) who manages the conver- 
sations and interactions between participants. 
Moreover, scheduling a convenient time and 
location for administering focus group inter- 
views can be very difficult, given the number 
of participants that are involved. 


5.5.2 Verbal Think Aloud 


Verbal think aloud (or simply “think aloud”) 
is used to capture rich verbal data on the 
thought processes that underlie human 
actions. Analysis of these verbal reports can 
be used to characterize the underlying infor- 
mation and knowledge structures. Think 
aloud evaluations are generally characterized 
into two types: (1) concurrent and (2) retro- 
spective (Ericsson and Simon 1980). A con- 
current think aloud requires uninterrupted 
and direct verbalizations of participants as 
they perform a task, and is considered to be 
complete and consistent with their thought 
sequence. In contrast, a retrospective think 
aloud requires the researcher to ask and 
prompt subjects to recall their thought 
sequence while performing a task (or after 
completing a task). Ericsson and Simon 
(1984), the original proponents of the verbal 
think aloud method, suggested the value of 
think aloud data is based on the following 
assumptions: (1) the verbalizations capture 
only a subset of the cognitive processes
underlying behavior; (2) the human mind is an
information processor; and (3) the
verbalizations capture the contents of working
memory (i.e., information recently acquired is
accessed).

Think aloud studies are typically con- 
ducted to identify and characterize cognitive 
processes such as reasoning, problem solving, 
and decision-making processes. For example, 
Patel and colleagues (Patel et al. 1994, 2001;
Patel and Groen 1991a, b) have conducted
several studies using verbal think aloud that
investigated the nature of reasoning using
electronic tools, the effects of expertise, and
decision-making. Most of these studies relied
on verbalizations by a participant (e.g., a phy- 
sician), and in-depth linguistic analysis of the 
verbalizations to identify inherent strategies 
in their reasoning and decision-making. 
Similarly, Fonteyn and Grobe (1994) utilized 
a think aloud study to understand the reason- 
ing and decision-making behaviors of critical 
care nurses regarding unstable patients. 
Insights on the reasoning process of expert 
nurses informed the design of an expert sys- 
tem. Other examples of similar key studies 
can be found here (Fisher and Fonteyn 1995; 
Fowler 1997; Funkesson et al. 2007; Grobe 
et al. 1991; Simmons et al. 2003). 

One of the concerns that have been raised 
in evaluation studies using verbal think aloud 
method is the issue of sample size. While 
many researchers have used a small sample 
size of five participants to focus on in-depth 
analysis of the cognitive processes, others 
have critiqued the sample size (e.g., Lewis 
1994). Lundgren-Laine and Salanterä (2010)
have suggested that the characteristics of the
study participants in terms of their
verbalization skills and the appropriate
application of the think aloud method are more
important than the sample size (Caulton 2001;
Fonteyn et al. 1993; Hall et al. 2004).
Measures of information and participant
saturation are often used
to determine study completion. A detailed 
description of the think aloud method and 
approaches for its analysis can be found here 
(Ericsson and Simon 1984). 


5.5.3 Usability Surveys
and Questionnaires


Surveys and questionnaires are widely used in 
usability evaluation studies. Their widespread 
use is related to ease of administration (through 
multiple modes: online, face-to-face) and lim- 
ited time required to complete (especially those 
that use Likert scale measures). Several
surveys are commonly used in usability
evaluation. A list of commonly used usability
surveys is provided below:

(a) QUIS (Questionnaire for User Interface
    Satisfaction: http://lap.umd.edu/quis/):
    measures user interface interaction and
    subjective satisfaction;
(b) SUMI (Software Usability Measurement
    Inventory: http://sumi.ucc.ie/): assesses
    usability of software;
(c) PSSUQ (Post-Study System Usability
    Questionnaire) and ASQ (After Scenario
    Questionnaire:
    http://hcibib.org/perlman/question.cgi?form=ASQ)
    (Lewis 1991): address global usability of a
    system along with specific scenarios of use;
(d) SUS (System Usability Scale:
    http://www.usability.gov/how-to-and-tools/methods/system-usability-scale.html)
    (Brooke 1996): a general survey of system
    usability;
(e) Subjective workload assessment (NASA-TLX
    Workload Instrument:
    http://humansystems.arc.nasa.gov/groups/tlx/paperpencil.html)
    (Hart and Staveland 1988): a multi-item
    scale to assess the mental, physical and
    temporal demands, effort, frustration and
    performance associated with working with
    interfaces.


Although most of the above-mentioned sur- 
veys are validated for their reliability, research- 
ers often use a variety of self-created surveys 
and questionnaires. Questionnaires, as opposed 
to the surveys that use a specific scale (e.g., a 
scale of 1-7), often use open-ended questions 
to elicit responses from participants regarding 
system use (e.g., “Describe some of the chal- 
lenges that you faced while using the system?”). 

Surveys are often used along with other 
data collection methods and are considered a 


complementary data collection method in 
HIT evaluation. For example, Karahoca and 
colleagues (Karahoca et al. 2010) used a 
generic survey along with system usage logs to 
characterize the usability of two mobile device 
prototypes. Similar open-ended
questionnaires, along with additional
observational data, were used by Holzinger and
colleagues
(Holzinger et al. 2011) to characterize patient 
interactions with a mobile interface. Dalai 
and colleagues (Dalai et al. 2014) used the 
SUS scale and the NASA-TLX scales for 
comparing the effectiveness of two interfaces 
for comprehending psychiatric clinical narra- 
tives. These survey scales were used in concert 
with an analysis of verbal reports to evaluate 
the effectiveness of presented interfaces. 
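
For readers who want to see how a Likert-based survey is turned into a single number, the following Python sketch applies the standard SUS scoring rule described by Brooke (1996): ten items rated 1 to 5, where odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name `sus_score` and the sample responses are hypothetical and are not taken from any of the studies cited above.

```python
def sus_score(responses):
    """Compute a System Usability Scale score (0 to 100).

    responses: ten integers in the range 1..5, in questionnaire order.
    Odd-numbered items are positively worded (contribution = response - 1);
    even-numbered items are negatively worded (contribution = 5 - response).
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1
        else:
            total += 5 - response
    return total * 2.5


# Hypothetical responses from one participant.
example = [4, 2, 5, 1, 4, 2, 4, 2, 5, 1]
print(sus_score(example))
```

Scores for individual participants are usually averaged per interface before two designs are compared.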


5.5.4 Field/Observational 
Approaches 


In contrast to the analytic evaluation tech- 
niques that often yield objective data, there 
are several qualitative approaches that focus 
on the subjective and contextual assessments 
of system design and user interactions within 
the context of a real work environment (Assila 
et al. 2014). These qualitative approaches are 
generally categorized as ethnographic-based 
methods and require an “immersion” in the 
field in order to understand the experiences 
and practices of the informants (Schatzberg 
2008). Although field and observational meth- 
ods are more central to human factors, they 
have also played an instrumental role in 
understanding how health information tech- 
nologies mediate a range of decision-making, 
coordination and associated patient care 
activities. The next section provides examples 
of observational research in clinical settings 
within the context of clinical workflow and 
usability. 


5.6 Clinical Workflow 

Workflow is a “set of tasks grouped into 
chronologically ordered processes, plus the 
people and resources required to complete the 
tasks and accomplish a desired goal” (Unertl
et al. 2010). It is widely believed that work- 
flow analysis is essential for ensuring success- 
ful design and implementation of health IT 
(Harrington 2015; Schumacher and Lowry 
2010; Xie and Carayon 2015). Electronic 
health record workflow is a subset of work- 
flow activities that are mediated by EHRs and 
related technologies (Harrington 2015). These 
activities are not viewed as discrete, but rather 
are embedded in a situational context and 
broader workflow. Clinical workflow, espe- 
cially in high velocity clinical settings, is char- 
acterized by perpetual change, multiple 
providers with varying levels of communica- 
tion, and a high volume of workload, multi- 
tasking, and interruptions (Harrington 2015). 
Workflow is further complicated by the need
to negotiate complex and nonintuitive systems.
When EHRs are well integrated into clinical
workflow, they increase the likelihood of
positive healthcare outcomes and diminish
error rates (Carayon et al. 2010; Lau et al.
2012). In contrast, when health IT is not
integrated into clinical workflow in a way that
supports clinicians’ cognitive work, it can
compromise patient safety (Carayon et al.
2010).

Clinical workflow has been extensively 
studied over the course of the last 20 years. A 
cursory search of the term “clinical workflow” 
in Google Scholar yields more than 16,000 
articles and more than 8000 since 2014. The 
scope of workflow research is rather expan- 
sive incorporating analysis of individuals, 
work environments, human-system interfaces 
and organizational factors. There has also 
been a focus on workflow as a mediator of 
patient safety (Carayon et al. 2014; Middleton 
et al. 2013). The emphasis of this section is on 
EHR workflow, which names the subset of 
workflow mediated by EHRs and other health 
IT (Zheng et al. 2010). EHRs are known to 
lack flexibility and resist easy modification 
and are relatively independent of context. 
However, clinical workflow is variable and 
context dependent, and tends to resist “one- 
size-fits-all” solutions. Studies on EHR- 
mediated workflow tend to focus on 
task-performance of the individual clinician, 
for example, (1) the impact of usability on 
workflow (ref), (2) time and motion studies
that endeavor to quantify how clinicians allo- 
cate their time, or (3) the role of the EHR in 
the coordination of care, for example, exam- 
ining the ways the systems serve as either a 
facilitator or impediment to the delivery of 
care (Weir et al. 2011). 

Although the EHR has conferred significant
advantages such as decision support
(Ben-Assuli et al. 2015) and improved clinical
note quality (Burke et al. 2014), it has also
contributed to an onerous documentation burden
which impacts workflow. According to the
report of AMIA’s EHR 2020 Task Force,
clinicians’ time investment in patient care
documentation has doubled in the last 20 years
(Payne et al. 2015). A large scale survey of cli- 
nicians affiliated with the American College 
of Physicians found that clinicians reported a 
loss of time (relative to paper-based records) 
of 4 hours per week (McDonald et al. 2014). 
The authors concluded that this could 
decrease access and increase the cost of care. 
However, findings varied significantly across 
studies. Researchers have employed a range of
methods to measure documentation burden,
including log files and time-on-task studies.
Hripcsak et al. used audit logs to perform a
detailed analysis of time spent reviewing and
documenting clinical notes (Hripcsak et al.
2011). They found significant variation among
clinicians, with a range of 20 to 100 minutes
documenting and 7 to 56 minutes reviewing
notes. They also noted that a significant
percentage of notes (e.g., 38% of nursing
notes) were never read by anyone. This impacted
communication, for example, the transfer of
information from nurses to physicians. In a
recent study, Collins and colleagues used log
files to study flowsheet documentation at a
highly granular level (Collins et al. 2018).
They found that clinicians (mostly nurses)
manually entered between 600 and 900 data
points. The authors argue for the need for
better automated device integration.

In summary, there is heterogeneity in
findings in relation to documentation. Some of
it can be attributed to differences specific to
settings, for example, as reflected in the use
of scribes and in relation to the execution of
Meaningful Use mandates. Other differences
are reflected in the study methods. However, it
is quite clear that EHRs typically add
significantly to the documentation burden and
that this has resulted in frustration among
users.

EHR usability problems and their impact 
on workflow are well documented. For exam- 
ple, the Healthcare Information and 
Management Systems Society (HIMSS) sur- 
vey found that workflow was clinicians’ num- 
ber one EHR usability “pain point” (Ribitzky 
et al. 2010). Respondents also reported frus- 
tration due to numerous alerts and difficulties 
with navigation resulting from the need to 
negotiate too many displays in the EHR to 
access information (Carayon et al. 2010). As 
referred to previously, Saitwal and colleagues 
(Saitwal et al. 2010) employed a cognitive task 
analytic approach to evaluate an EHR inter- 
face and quantified interactive behavior (e.g., 
task duration, number of steps). Challenges
of employing the user interface were: (a) a
large average number of total steps to complete
routine tasks, (b) slow execution time, and (c)
high overall mental workload. Similarly, Carayon
et al. (6) summarized a range of usability 
issues in their systematic review of the work- 
flow literature, including the large number of 
mouse clicks to complete a task, difficulties in 
navigating between many screens to input and 
retrieve information and cluttered screens. 
These usability problems serve to increase 
cognitive load. Information overload, a form 
of cognitive load, is a problem that occurs 
when attentional, perceptual and cognitive 
capacity is exceeded by the quantity of data 
presented via an interface to the extent that 
errors occur in users’ information processing 
(Zahabi et al. 2015). 

Cognitive overload can be partially attrib- 
uted to poor user design. Small interface dif- 
ferences can be consequential and have a 
significant impact on task efficiency. 
According to Gray and colleagues, interactive 
behavior is constrained by the design and con- 
figuration of displays “as well as by the ways 
in which elementary cognitive, perceptual, 
and motor operations can be combined” 
(Gray and Boehm-Davis 2000). Poorly
configured interfaces can increase navigational
complexity. Navigation in this context refers
to the route taken to complete a task,
including the action steps and the trajectory
through space (e.g., sequence of tabs or
display screens). In a
systematic review, Roman et al. found that 
navigation actions (e.g., scrolling through a 
patient list) were often linked to specific 
usability heuristic violations, including flexi- 
bility and efficiency of use, and lack of an 
emphasis on recognition rather than recall 
(Roman et al. 2017). 

Medication reconciliation (MedRec) tools 
are an essential part of a strategy to reduce 
medication errors and prevent adverse events 
(Agrawal 2009). MedRec tools enable
clinicians to compare lists of medications in
a patient’s history and revise the lists so that
they are up-to-date and accurate. There have 
been several recent studies that have applied 
cognitive methods of analysis to MedRec 
(Boockvar et al. 2011; Lesselroth et al. 2013). 
Horsky and colleagues investigated the
accuracy of two different medication recon- 
ciliation tools integrated into EHRs in a simu- 
lated study (Horsky et al. 2017). They found 
that the reconciled records were significantly 
more accurate when clinicians used the sec- 
ond tool. Specifically, the comparison showed 
clinicians made three times as many errors in 
EHRs with single column medication lists, as 
compared to using side-by-side lists. The 
authors concluded that the better outcome 
using the second tool was strongly facilitated 
by a design that was more effective in
supporting a cognitively demanding task. The
system made fewer demands on working memory
than the first. Plaisant and colleagues similarly
contrasted a conventional interface design 
(control) with a novel prototype (Twinlist) 
(Plaisant et al. 2015). The Twinlist interface 
divided information into five columns, while 
the control used two side-by-side lists. 
Evaluation showed that with Twinlist
participants completed MedRec significantly
faster and more accurately than with the
control. Both
studies demonstrated the comparative advan- 
tage of having access to needed information 
on a single screen as opposed to having to 
toggle between two screens or tabs. 

The studies described above employed
simulation-based methods. Duncan and colleagues
compared MedRec interfaces at three Mayo Clinic
campuses in which different EHR systems were
used (Duncan et al. 2018b). Although they
support the same set of functions, the inter- 
faces differed in important respects. 
Specifically, the steps to access the medication 
list and perform the addition of a last dose 
differ. System 2 required a three-step process,
whereas system 1 needed only a single step to
achieve the same reconciliation goal.
The systems were compared using a predictive
model (KLM) and live observations captured
on video. As described earlier
in this chapter, the keystroke-level model
(KLM) is a widely used analysis in which the
execution time of routine tasks (performed without
errors) is estimated. Duncan and colleagues found that the
KLM estimates for the MedRec task closely 
approximated the observations. Both meth- 
ods found that the time required to reconcile 
a single medication was more than 2 seconds 
greater for system 2, which also required 
more mouse clicks and screen transitions. The 
two systems differ in terms of interactive 
complexity and demands on working mem- 
ory. The difference highlights the importance 
of emphasizing recognition rather than recall 
to minimize the memory load on clinicians. 
Duncan and colleagues employed a similar 
approach with the same EHR systems in relation
to vital signs documentation and medication
administration record tasks (Duncan
et al. 2018a; Duncan et al., 2020). Vital signs 
are used to gauge a patient’s hemodynamic 
stability and, in this case, provide a point of 
reference prior to surgery. The objectives were 
to: (1) analyze aspects of vital signs charting 
interfaces and determine how these aspects 
differentially mediate task performance and 
(2) investigate variations in vital signs docu- 
mentation across clinical sites. Analyses 
revealed that accessing displays and the orga- 
nization of interface elements are often unin- 
tuitive and inefficient, creating unnecessary 
complexities when interacting with the sys- 
tem. The study documented the ways in which 
the systems differed in their modes of interac- 
tion, organization of patient information and 
cognitive support. The authors noted that 
identifying barriers to interface usability and
bottlenecks in EHR-mediated workflow can
lead to system redesigns that minimize cogni- 
tive load and may also improve patient safety 
and efficiency. 
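
The keystroke-level comparison described above can be illustrated with a
minimal sketch. The operator times are the values commonly cited from
Card, Moran, and Newell's original model; the two operator sequences are
hypothetical and are not the sequences analyzed by Duncan and colleagues.

```python
# Commonly cited KLM operator times in seconds (Card, Moran, and Newell).
KLM_SECONDS = {
    "K": 0.28,  # keystroke, average non-secretarial typist
    "P": 1.10,  # point at a target with the mouse
    "B": 0.10,  # press or release a mouse button
    "H": 0.40,  # move hands between keyboard and mouse
    "M": 1.35,  # mental preparation before an action
}


def klm_estimate(operators):
    """Sum operator times for an error-free, routine task."""
    return sum(KLM_SECONDS[op] for op in operators)


# Hypothetical sequences: one step is "think, point, click (press + release)".
one_step = "MPBB"
three_step = one_step * 3

if __name__ == "__main__":
    print(f"one-step design:   {klm_estimate(one_step):.2f} s")
    print(f"three-step design: {klm_estimate(three_step):.2f} s")
```

Because every additional step adds at least one pointing action and usually
a mental preparation, small differences in interface structure accumulate
quickly when a task is repeated for each medication on a list.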


5.7 Future Directions 

The role of HCI, and more specifically, usabil- 
ity, is likely to be more embedded within the 
biomedical informatics research paradigm 
over the next decade. This is primarily because 
of the role health information technologies 
play in transforming the practice of medicine. 
Such a transformation has led to the wide- 
spread use of technology in patient care (e.g., 
the use of EHRs) and the development of 
patient-facing applications for self-manage- 
ment of care (e.g., mobile devices). As a result, 
the role of usability is likely to receive
increased focus in potentially two areas: the
development of new approaches for unobtrusive
evaluation of user interactions, and the
evaluation of consumer-facing applications
for health.

As previously described, usability and user
interaction studies are expensive in terms of the
time and effort involved. Recent efforts
in usability evaluation have focused on utilizing
logs of user interactions, including audit
trails and other unobtrusively collected interaction
data (e.g., keystroke or eye-tracking data). One
classic example of such data is the EHR-based
audit log. In a recent study on EHR use in an
emergency department, Kannampallil and col- 
leagues used user log files to track the physician 
interactions during patient care activities 
(Kannampallil et al. 2018b). Similar efforts on 
tracing and modeling interaction and naviga- 
tion behaviors are ongoing. The availability of 
more powerful tools for data capture and anal- 
ysis will create new opportunities for computer 
scientists, psychologists and clinicians to col- 
laborate on HCI-related investigations of tech- 
nology-mediated clinical practice. In a recent 
study of such a collaboration, Vankipuram and 
colleagues showed that the process of data- 
driven iterative workflow redesign using visual- 
ization of overlaid data from quantitative and 
qualitative sources could be used to identify
inefficiencies and bottlenecks in clinical workflow
and potentially contribute to process
improvement (Vankipuram et al., 2019). 
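
As a rough illustration of how unobtrusively collected logs can be turned
into usability measures, the following sketch computes the time spent on
each screen and the screen-to-screen transitions for each user. The row
layout, user identifier, and screen names are hypothetical; real EHR audit
logs vary by vendor and typically require substantial cleaning first.

```python
from collections import Counter
from datetime import datetime

# Hypothetical audit-log rows: (user_id, ISO timestamp, screen viewed).
events = [
    ("dr_a", "2018-03-01T08:00:00", "patient_list"),
    ("dr_a", "2018-03-01T08:00:20", "lab_results"),
    ("dr_a", "2018-03-01T08:01:05", "med_list"),
    ("dr_a", "2018-03-01T08:01:35", "lab_results"),
]


def summarize(rows):
    """Return (seconds spent per screen, counts of screen transitions)."""
    dwell, transitions = Counter(), Counter()
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for (user, t0, screen), (next_user, t1, next_screen) in zip(ordered, ordered[1:]):
        if user != next_user:
            continue  # never pair events that belong to different users
        elapsed = datetime.fromisoformat(t1) - datetime.fromisoformat(t0)
        dwell[screen] += elapsed.total_seconds()
        transitions[(screen, next_screen)] += 1
    return dwell, transitions


if __name__ == "__main__":
    dwell, transitions = summarize(events)
    print(dict(dwell))
    print(dict(transitions))
```

The final event for each user has no following event, so its duration is
unknown and is left out of the dwell totals.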

The growth of wearable devices and
mobile sensing technologies has led to the
development and use of a large number of
consumer-facing applications and tools.
Although many of these run on mobile
devices, other patient-facing applications such 
as portals and social networking tools have 
gained prominence in recent years. These 
applications are rapidly evolving and, in many 
cases, still require significant improvements 
for translation into a usable and sufficiently 
robust product. 


5.8 Conclusion 

The implementation and broad use of health 
information technologies have grown rapidly 
over the course of the last decade. Clinical 
applications and increasingly patient-facing 
systems are beginning to transform healthcare 
practices. They offer significant potential for 
transforming the quality of patient care as 
well as enabling patients to become agents of
change in managing their own health. Usability
challenges continue to provide significant 
impediments to productive use of technology 
and efficient workflow. The focus of this chap- 
ter has been on methods of usability evalua- 
tion. There is a wealth of different methods 
available for researchers and professional 
usability analysts to deploy with a view to optimizing
the user experience. The methods have
contributed to a growing body of knowledge 
to inform user-centered design and best prac- 
tices across the range of health technologies. 


Suggested Readings

Carroll, J. M. (2003). HCI models, theories, and 
frameworks: Toward a multidisciplinary sci- 
ence. San Francisco: Morgan Kaufmann. An 
edited volume on the theoretical foundations 
of HCI. 

Norman, D. A. (1993). Things that make us 
smart: Defending human attributes in the age 
of the machine. Reading: Addison-Wesley
Pub. Co. This book addresses significant issues
in human-computer interaction in a very read- 
able and entertaining fashion. 

Patel, V. L., Kannampallil, T., & Kaufman, D. 
(Eds.). (2015). Cognitive informatics in health 
and biomedicine: Human computer interac- 
tion. London: Springer. This edited book 
addresses the key gaps on the applicability of 
theories, models and evaluation frameworks 
of HCI and human factors for research in
biomedical informatics.

Preece, J., Rogers, Y., & Sharp, H. (2015). 
Interaction design: Beyond human-computer 
interaction (4th ed.). West Sussex: Wiley. A 
very readable and relatively comprehensive 
introduction to human-computer interaction. 
A new edition will be available in April, 2019. 

Zheng, K., Westbrook, J., Kannampallil, T., & 
Patel, V. L. (Eds.). (2018). Cognitive informat- 
ics: Reengineering clinical workflow for more 
efficient and safer care. London: Springer. 
This edited book offers a comprehensive 
view of clinical workflow, supported by the
theoretical, methodological, empirical, and 
pragmatic perspectives from experts in the 
field. 


Questions for Discussion

1. What role do the theories of HCI and 
cognitive science play in providing insight 
into principles of system usability, as well 
as the design of a safer workplace? 

2. A large urban hospital is planning to 
implement a provider order entry system. 
You have been asked to advise them on 
system usability and to study the cogni- 
tive effects of the system on performance. 
Discuss the issues involved and suggest
some of the steps you would take to study 
system usability. 

3. What are the primary differences between 
analytic usability evaluation methods and 
usability testing? 

4. What are some of the considerations for 
choosing analytic approaches for usability 
evaluation? 

5. How does usability impact clinical work- 
flow? How can we provide better cognitive 
support for workflow?


Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• What key functions do software
  applications perform in health care?

• How are the components of the
  software development lifecycle applied
  to health care?

• What are the trade-offs between
  purchasing commercial, off-the-shelf
  systems and developing custom
  applications?

• What are important considerations in
  comparing commercial software
  products?

• Why do systems in health care, both
  internally developed and commercially
  purchased, require continued software
  development?


6.1 How Can a Computer System 
Help in Health Care? 


In this chapter, we focus on the software 
applications and components of health care 
information systems, and describe how they 
are used and applied to support health care 
delivery. We give examples of some basic 
functions that may be performed by health 
information systems, and discuss important 
considerations in how the software may be 
acquired, implemented and used. This under- 
standing of how a system gets put to use in 
health care settings will help as you read 
about the various specific applications in the 
chapters that follow. 

Health care is an information-intensive 
field. Clinicians are constantly gathering, 
reviewing, analyzing and communicating 
information from many sources to make 
decisions. Humans are complex, and individ- 
uals have many different characteristics that 
are relevant to health care and that need to be 
considered in decision-making. Health care 
is also complex, with a huge body of existing 
knowledge that is expanding at an ever- 
increasing rate. Software for managing health 
information is intended to facilitate the use 
of this information at various points in the
care delivery process. Software can determine
the ways by which data are obtained, orga- 
nized and processed to yield information. 
Software, in terms of design, development, 
acquisition, configuration and maintenance, 
is therefore a major component of the field 
of biomedical informatics. This chapter pro- 
vides an introduction to some of the practi- 
cal considerations regarding health 
information software, including both general
software engineering principles and the
application of these principles to health care
settings. 

To this end, we first describe the major
software functions within a health care envi- 
ronment or health information system. 
While not all functions can be covered in 
detail, some specific examples are given to 
indicate the breadth of software applica- 
tions as well as to provide an understanding 
of their relevance. We also describe the soft- 
ware development life cycle, with specific 
applications to health care. We then describe 
important considerations and strategies for 
acquiring and implementing software in 
health care settings. Finally, we discuss 
emerging trends influencing software engi- 
neering related to health information sys- 
tems. Each system can be considered in 
regard to what it would take to make it func- 
tional in a health care system, and what 
advantages and disadvantages the software 
may have, based on how it was created and 
implemented. Understanding these princi- 
ples will help you identify the risks and ben- 
efits of various applications, so that you can 
identify how to optimize the positive impact 
of health information systems. 


6.2 Software Functions
in Health Care


6.2.1 Case Study of Health Care 
Software 


The following case study illustrates many
important functions of health care software.
John Miller is a 42-year-old man living in a
medium-sized U.S. city. He is married and has
two children. He has type 2 diabetes, but it is
currently well controlled and he has no other 
health concerns. There is some history of car- 
diovascular disease in his family. John has a pri- 
mary care physician, Linda Stark, who practices 
at a clinic that is part of a larger health delivery 
network, Generation Healthcare System 
(GHS). GHS includes a physician group, pri- 
mary and specialty care clinics, a tertiary care 
hospital and an affiliated health insurance plan. 

John needs to make an appointment with 
Dr. Stark. He logs into the GHS patient portal 
and uses an online scheduling application to 
request an appointment. While in the patient 
portal, John also reviews results from his most 
recent visit and prints a copy of his current 
medication list in order to discuss the addition 
of an over-the-counter supplement he recently 
started taking. 

Before John arrives for his visit, the clinic’s 
scheduling system has already alerted the staff 
of John’s appointment and the need to collect 
information related to his diabetes. Upon his 
arrival, Dr. Stark’s nurse gathers the requested 
diabetes information and other vital signs data 
and enters these into the electronic health 
record (EHR). In the exam room, Dr. Stark 
reviews John’s history, the new information 
gathered during this visit, and recommenda- 
tions and reminders provided by the EHR on a 
report tailored to her patient’s medical history. 
They both go over John’s medication list and 
Dr. Stark notes that, according to the EHR’s 
drug-drug interaction tool, the supplement he is 
taking may have an interaction with one of his 
diabetes medications. One of the reminders 
suggests that John is due for a hemoglobin A1c
(HbA1c) test, and Dr. Stark orders this in the
EHR. Dr. Stark’s nurse, who has been notified 
of the lab test order, draws a blood sample from 
John. Before the appointment ends, Dr. Stark 
completes and signs the clinic note and forwards 
a visit summary for John to review on the 
patient portal. 

A few days after his appointment, John 
receives an email from GHS that notifies him of 
an important piece of new information in his 
patient record. Logging into the patient portal 
application, John sees that his HbA1c test is
back. The test indicates that the result is elevated.
Dr. Stark has added a note to the result
saying that she has reviewed the lab and would
like to refer John to the GHS Diabetes Specialty 
Clinic for additional follow-up. John uses the 
messaging feature in the patient portal to 
respond to Dr. Stark and arrange for an 
appointment. John also clicks on an infobutton 
next to the lab result to obtain more informa- 
tion about the abnormal value. He is linked to 
patient-focused material about HbA1c testing,
common causes for elevated results, and ways 
this might be addressed. Lastly, John reviews 
the visit summary note from his appointment 
with Dr. Stark to remind him about suggestions 
she had for replacing his supplement. 

At his appointment with the Diabetes 
Specialty Clinic, John notes that they have 
access to all the information in his record. A 
diabetes care manager, Maria, reviews the 
important aspects of John’s medical history. 
She suggests more frequent monitoring of his 
laboratory test results and evaluating whether 
he is able to control his diabetes without changes 
to his medications. Maria highlights diet and 
exercise suggestions in his patient portal record 
that have been shown to help similar patients. 
When the visit is complete, Maria sends an 
electronic summary of the visit to Dr. Stark. 

A year later, John is experiencing greater 
difficulty controlling his diabetes. Dr. Stark and 
Maria have continued to actively monitor his 
HbA1c and other laboratory test results, and
occasionally make changes to his treatment 
regimen. They use the EHR to visualize labora- 
tory test results and correlate them with changes 
in medications. Due to a variety of personal and 
financial challenges, John struggles with adher- 
ence to his medication regimen, and he is not 
maintaining a healthy diet. As a result, his 
blood sugar has become seriously unstable, and 
the population health management module of 
the EHR flags John for urgent evaluation due to 
a dangerously high home blood glucose reading. 
Maria confirms the reading with John, collects 
additional information about his health status, 
and escalates the issue to Dr. Stark. Dr. Stark 
then recommends John go to the GHS hospital 
emergency department (ED) for urgent evalua- 
tion. Doctors in the ED access John’s electronic 
record including his medication and lab history, 
as well as notes from Dr. Stark and Maria, 
which help them quickly assess his condition
and develop a treatment plan. John is admitted
to the hospital, and physicians, nurses, and oth- 
ers caring for him access his longitudinal medi- 
cal records and document new observations and 
treatments. They are also able to electronically 
reconcile his outpatient prescriptions with his 
inpatient medications to ensure continuity. 
After a brief hospital stay, John is stabilized 
and ready to be discharged, with an updated list 
of medications. 

Because Dr. Stark is listed as John’s pri- 
mary care physician, she is notified electroni- 
cally of both the hospital admission and 
discharge. She reviews his discharge summary 
in the EHR and instructs her staff to send a 
message through the patient portal to John, to 
let him know she reviewed his inpatient record 
and to schedule a follow-up appointment. 

The GHS EHR is also part of a statewide
health information exchange (HIE), which
allows medical records to be easily shared with
health care providers outside the GHS system.
This means that if John should need to visit a
hospital, emergency department or specialty
care clinic outside the GHS network, his record
would be available for review and any information
entered by these outside providers would be
similarly available to Dr. Stark and others
within the GHS network. The local and state
health departments where John lives are also 
linked to the HIE. This allows clinics, hospitals 
and labs to electronically submit information to 
the health departments for disease surveillance 
and case reporting purposes. 

Back at home, John’s wife, Gina, is able to 
view his medical records on the GHS patient 
portal because he has granted her proxy access 
to his account. This allows her to see notes from 
Dr. Stark and schedule appointments. Gina also 
views the hospital discharge instructions that 
were electronically sent to John’s patient record. 
As she reviews the information about diabetes 
that GHS had automatically linked to John’s 
record, Gina sees a notification about a clinical 
research study involving genetic links with dia- 
betes. Concerned about their two children, Gina 
discusses the study with John, and they review 
more online materials about the study. 
Interested in the possible benefits of the 
research, John electronically volunteers to participate
in the study, and he is later contacted
by a study coordinator. Because GHS investigators
are conducting the study, relevant parts of
John’s EHR are easily shared with the clinical 
trials management system. 

This fictional case study highlights many of 
the current opportunities for improving health 
care delivery, including improved access to 
care, increased patient engagement, shared 
patient-provider decision-making, better care 
management, medication reconciliation, 
improved transitions of care, population 
health management, and research recruitment. 
In the case study, each of these goals required 
software to make health information accessible 
to the correct individuals at the proper time. 

In today’s health care system, few individ- 
uals enjoy the interaction with software 
depicted in the John Miller case study. 
Although the functions described in the sce- 
nario exist at varying levels of maturity, most 
health care delivery institutions have not con- 
nected all the functions together as described. 
The current role of software engineering in 
health care is therefore two-fold: to design and 
implement software applications that provide 
required functions, and to connect these func- 
tions in a seamless experience for both the cli- 
nicians and the patients. 

The case study highlights the usefulness of 
several functions provided by health care soft- 
ware applications for clinicians, patients, and 
administrators. Some of these functions 
include: 

1. Acquiring and storing data 

2. Summarizing and displaying data 

3. Facilitating communication and informa- 
tion exchange 

4. Generating alerts, reminders, and other 
forms of decision support 

5. Supporting educational, research, and 
public health initiatives 


6.2.2 Acquiring and Storing Data 


The amount of data needed to describe the 
health and health care of even a single person 
is huge. Health professionals require assis- 
tance with data acquisition to deal with the 
data that must be collected and processed. 
One of the first uses of computers in a medical
setting was the automatic analysis of
blood specimens and other body fluids by 
instruments that measure chemical concentra- 
tions or that count cells and organisms. These 
systems delivered printed or electronic results
to health care workers and identified values
that were outside normal limits. Computer-based
patient monitoring that collected physiological
data directly from patients was
another early application of computing technology
(see Chap. 19). These systems provided
frequent, consistent collection of vital
signs, electrocardiograms (ECGs), and other 
indicators of patient status. More recently, 
researchers have developed medical imaging 
applications as described in Chaps. 9 and
20, including computed tomography (CT), 
magnetic resonance imaging (MRI), and digi- 
tal subtraction angiography. The calculations 
for these computationally intensive applica- 
tions cannot be performed manually; comput- 
ers are required to collect and manipulate 
millions of individual observations. 

Early computer-based medical instru- 
ments and measurement devices provided 
results only to human beings. Today, most 
instruments can transmit data directly into the 
EHR, although the interfaces can still be awkward
and poorly standardized (see Chaps. 5
and 8). Computer-based systems that acquire 
information directly from patients are also 
data-acquisition systems; they free health pro- 
fessionals from the need to collect and enter 
demographic and health history information. 

Various departments within a hospital use 
computer systems to store clinical data. For 
instance, clinical laboratories use information 
systems to keep track of orders and specimens 
and to report test results; most pharmacy and 
radiology departments use computers to per- 
form analogous functions. Their systems may 
connect to outside services (e.g., pharmacy 
systems are typically connected to one or 
more drug distributors so that ordering and 
delivery are rapid and local inventories can be 
kept small). By automating processing in 
areas such as these, health care facilities are 
able to provide efficient service, reduce labor 
costs, and minimize errors.


6.2.3 Summarizing
and Displaying Data 


Computers are well suited to performing 
tedious and repetitive data-processing tasks, 
such as collecting and tabulating data, com- 
bining related data, and formatting and pro- 
ducing reports. They are particularly useful 
for processing large volumes of data. 

Data acquired by computer systems can be 
detailed and voluminous. Data analysis sys- 
tems must aid decision makers by reducing 
and presenting the intrinsic information in a 
clear and understandable form. Software can 
be used to create useful visualizations that 
facilitate trend analysis and compute second- 
ary parameters (e.g., means, standard devia- 
tions, rates of change) to help identify 
abnormalities. Clinical research systems have 
modules for performing powerful statistical 
analyses over large sets of patient data. When 
employing such tools, research investigators 
should have insight into the methods being 
used. For clinicians, graphical displays are 
useful for interpreting data and identifying 
trends. 
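
The secondary parameters mentioned above can be computed with a few lines
of code. The sketch below uses a hypothetical series of HbA1c results for
a single patient; the dates and values are illustrative only and carry no
clinical meaning.

```python
from datetime import date
from statistics import mean, stdev

# Hypothetical HbA1c results (percent) for one patient.
results = [
    (date(2019, 1, 10), 6.8),
    (date(2019, 4, 10), 7.1),
    (date(2019, 7, 9), 7.9),
]


def secondary_parameters(series):
    """Return the mean, sample standard deviation, and rates of change per day."""
    values = [value for _, value in series]
    rates = [
        (v1 - v0) / (d1 - d0).days  # change per day between consecutive results
        for (d0, v0), (d1, v1) in zip(series, series[1:])
    ]
    return mean(values), stdev(values), rates


if __name__ == "__main__":
    avg, sd, rates = secondary_parameters(results)
    print(f"mean={avg:.2f}  sd={sd:.2f}  rates per day={[round(r, 4) for r in rates]}")
```

A rising rate of change between consecutive results is the kind of trend
that a graphical display would make visible at a glance.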

Fast retrieval of information is essential 
for any computer system. Data must be well 
organized and indexed so that information 
recorded in an EHR system can be easily 
retrieved. Here the variety of users must be 
considered. Obtaining recent information 
about a patient entering the office differs from 
the needs that a research investigator will have 
in accessing the same data. The query inter- 
faces provided by EHRs and clinical research 
systems assist researchers in retrieving perti- 
nent records from the huge volume of patient 
information. Recently, there has been increas- 
ing industry adoption of the Health Level 7 
International (HL7) Fast Healthcare 
Interoperability Resources (FHIR) standard 
for sharing data on a patient-by-patient basis. 
The FHIR standard is being adapted for 
population-level data sharing through the 
FHIR Bulk Data initiative. As discussed in 
Chap. 21, bibliographic retrieval systems
are also an essential component of health 
information services. 
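
To make the distinction between patient-by-patient and population-level
retrieval concrete, the sketch below builds the two kinds of FHIR request
without contacting any server. The base URL and patient identifier are
hypothetical; the search parameters and the export kick-off headers follow
the published FHIR and Bulk Data specifications as we understand them, and
should be checked against the version a given server supports.

```python
from urllib.parse import urlencode

# Hypothetical server address; substitute the base URL of a real FHIR endpoint.
FHIR_BASE = "https://fhir.example.org/r4"


def observation_search_url(patient_id, loinc_code):
    """Build a search for one patient's observations with a given LOINC code."""
    query = urlencode({"patient": patient_id, "code": f"http://loinc.org|{loinc_code}"})
    return f"{FHIR_BASE}/Observation?{query}"


def bulk_export_request():
    """Describe the kick-off request for a population-level Bulk Data export."""
    return {
        "method": "GET",
        "url": f"{FHIR_BASE}/Patient/$export",
        "headers": {"Accept": "application/fhir+json", "Prefer": "respond-async"},
    }


if __name__ == "__main__":
    # LOINC 4548-4 identifies hemoglobin A1c; "123" is a made-up patient id.
    print(observation_search_url("123", "4548-4"))
    print(bulk_export_request())
```

The first request returns data for a single patient, which suits a
clinician at the point of care. The second asks the server to prepare
files covering many patients asynchronously, which suits research and
population health uses.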


6.2.4 Facilitating Communication
and Information Exchange 


In hospitals and other large-scale health care 
institutions, myriad data are collected by 
multiple health professionals who work in a 
variety of settings; each patient receives care 
from a host of providers—nurses, physicians, 
technicians, pharmacists, and so on. 
Communication among the members of the 
team is essential for effective health care deliv- 
ery. Data must be available to decision makers 
when and where they are needed, independent 
of when and where they were obtained. 
Computers help by storing, transmitting, 
sharing, and displaying those data. As 
described in Chaps. 2 and 12, the patient
record is the primary vehicle for communica- 
tion of clinical information. The limitation of 
the traditional paper-based patient record is 
the concentration of information in a single 
location, which prevents simultaneous entry
and access by multiple people. Hospital information
systems (HISs; see Chap. 13) and
EHR systems (Chap. 12) allow distribution
of many activities, such as admission, appoint- 
ment, and resource scheduling; review of lab- 
oratory test results; and inspection of patient 
records to the appropriate sites. 

Information necessary for specific decision- 
making tasks is rarely available within a single 
computer system. Clinical systems are 
installed and updated when needed, available, 
and affordable. Furthermore, in many institu- 
tions, inpatient, outpatient, and financial 
activities are supported by separate organiza- 
tional units. Patient treatment decisions 
require inpatient and outpatient information. 
Hospital administrators must integrate clini- 
cal and financial information to analyze costs 
and to evaluate the efficiency of health care 
delivery. Similarly, clinicians may need to 
review data collected at other health care insti- 
tutions, or they may wish to consult published 
biomedical information. Communication net- 
works that permit sharing of information 
among independent computers and geograph- 
ically distributed sites are now widely avail- 
able. Actual integration of the information 
they contain requires additional software,
adherence to standards, and operational staff
to keep it all working as technology and 
systems evolve. 


6.2.5 Generating Alerts, 
Reminders, and Other Forms 
of Decision Support 


In the end, all the functions of storing, dis- 
playing and transmitting data support 
decision-making by health professionals, 
patients, and their caregivers. The distinction 
between decision-support systems and sys- 
tems that monitor events and issue alerts is 
not clear-cut; the two differ primarily in the 
degree to which they interpret data and rec- 
ommend patient-specific action. Perhaps the 
best-known examples of decision-support 
systems are the clinical consultation systems 
or event-monitoring systems that use popula- 
tion statistics or encode expert knowledge to 
assist physicians in diagnosis and treatment 
planning (see Chap. 22). Similarly, some
nursing information systems help nurses to 
evaluate the needs of individual patients and 
thus assist their users in allocating nursing 
resources. Chapter 22 discusses systems
that use algorithmic, statistical, or artificial- 
intelligence (AI) techniques to provide advice 
about patient care. 

Timely reactions to data are crucial for 
quality in health care, especially when a 
patient has unexpected problems. Data over- 
load, created by the ubiquity of information 
technology, is as detrimental to good decision 
making as is data insufficiency. Data indicat- 
ing a need for action may be available but are 
easily overlooked by overloaded health pro- 
fessionals. Surveillance and monitoring sys- 
tems can help people cope with all the data 
relevant to patient management by calling 
attention to significant events or situations, 
for example, by reminding doctors of the need 
to order screening tests and other preventive 
measures (see Chaps. 12 and 22) or by
warning them when a dangerous event or con- 
stellation of events has occurred. 
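
A reminder of the kind described above can be expressed as a simple rule
over stored data. The following sketch flags a patient with diabetes whose
most recent HbA1c test is older than a chosen interval. The function name
and the 180-day interval are assumptions made for illustration and are not
clinical guidance; real systems draw such thresholds from maintained
clinical guidelines.

```python
from datetime import date, timedelta

# Hypothetical rule: remind when the last HbA1c is older than this interval.
# The 180-day value is an illustrative assumption, not clinical guidance.
HBA1C_INTERVAL = timedelta(days=180)


def hba1c_reminder(has_diabetes, last_test, today):
    """Return a reminder string if a test is due, otherwise None."""
    if not has_diabetes:
        return None
    if last_test is None or today - last_test > HBA1C_INTERVAL:
        return "HbA1c test is due"
    return None


if __name__ == "__main__":
    print(hba1c_reminder(True, date(2019, 1, 1), date(2019, 9, 1)))
    print(hba1c_reminder(True, date(2019, 8, 1), date(2019, 9, 1)))
```

The rule stays silent unless action is needed, which matters because
reminders that fire too often add to the data overload they are meant to
relieve.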

Laboratory systems routinely identify and 
flag abnormal test results. Similarly, when
patient-monitoring systems in intensive care
units detect abnormalities in patient status, 
they sound alarms to alert nurses and physi- 
cians to potentially dangerous changes. A 
pharmacy system that maintains computer- 
based drug-profile records for patients can 
screen incoming drug orders and warn physi- 
cians who order a drug that interacts with 
another drug that the patient is receiving or a 
drug to which the patient has a known allergy 
or sensitivity. By correlating data from multi- 
ple sources, an integrated clinical information 
system can monitor for complex events, such 
as interactions among patient diagnosis, drug 
regimen, and physiological status (indicated 
by laboratory test results). For instance, a 
change in cholesterol level can be due to pred- 
nisone given to an arthritic patient and may 
not indicate a dietary problem. 
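
The order screening performed by a pharmacy system can be sketched as a
lookup against the patient's drug profile. The drug names and the table of
interacting pairs below are placeholders invented for this example; a
production system would rely on a curated, regularly updated
drug-interaction knowledge base.

```python
# Hypothetical knowledge base of interacting drug pairs (placeholder names).
INTERACTIONS = {frozenset({"drug_a", "drug_b"})}


def screen_order(new_drug, active_drugs, allergies):
    """Return warnings for a new order given the patient's drug profile."""
    warnings = []
    if new_drug in allergies:
        warnings.append(f"Allergy: patient has a recorded allergy to {new_drug}")
    for current in active_drugs:
        # frozenset makes the lookup independent of the order of the pair
        if frozenset({new_drug, current}) in INTERACTIONS:
            warnings.append(f"Interaction: {new_drug} with {current}")
    return warnings


if __name__ == "__main__":
    print(screen_order("drug_a", ["drug_b", "drug_c"], {"drug_d"}))
    print(screen_order("drug_d", [], {"drug_d"}))
```

Monitoring for complex events, such as the prednisone example above,
extends the same idea by combining diagnoses and laboratory results with
the drug profile.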


6.2.6 Supporting Educational, 
Research, and Population 
and Public Health Initiatives 


Rapid growth in biomedical knowledge and in 
the complexity of therapy management has 
produced an environment in which students 
cannot learn all they need to know during 
training—they must learn how to learn and 
must make a lifelong educational commit- 
ment. Today, physicians and nurses have avail- 
able a broad selection of computer programs 
designed to help them to acquire and main- 
tain the knowledge and skills they need to 
care for their patients. The simplest programs 
are of the drill-and-practice variety; more 
sophisticated programs can help students to 
learn complex problem-solving skills, such as 
diagnosis and therapy management (see 
Chap. 21). Computer-aided instruction
provides a valuable means by which health 
professionals can gain experience and learn 
from mistakes without endangering actual 
patients. Clinical decision-support systems 
and other systems that can explain their rec- 
ommendations also perform an educational 
function. In the context of real patient cases, 
they can suggest actions and explain the reasons
for those actions.

As health care increasingly shifts to a
mode of care based on population health 
rather than episodic health care transactions, 
there is increasing need for information sys- 
tems to monitor and manage individuals’ 
health outside the context of clinical visits. 
Surveillance also extends beyond the health 
care setting. Appearances of new infectious 
diseases, unexpected reactions to new medica- 
tions, and environmental effects should be 
monitored. Thus the issue of data integration 
has a national or global scope (see the discus- 
sion of the National Health Information 
Infrastructure in Chaps. 1 and 16 that deals
with public health informatics). 


6.3 Software Development
and Engineering


Clearly, software can be used in many differ- 
ent ways to manage and manipulate health 
information to facilitate health care delivery. 
However, just using a computer or a software 
program does not improve care. If critical 
information is unavailable, or if processes are 
not organized to operate smoothly, a com- 
puter program will only expose challenges 
and waste time of clinical staff that could be 
better applied in delivering care. To be useful, 
software must be developed with an under- 
standing of its role in the care setting, geared 
to the specific functions that are required, and 
developed correctly. To be used, software 
must be integrated to support the users’ 
workflow. We will discuss both aspects of 
software engineering — development and 
integration. 


6.3.1 Software Development 


Software development can be a complex, 
resource-intensive undertaking, particularly 
in environments like health care where safety 
and security provide added risk. The software 
development life cycle (SDLC) is a framework 
imposed over software development in order 
to better ensure a repeatable, predictable pro-
cess that controls cost and improves quality of
the software product (usually an application).
SDLC is a subset of the systems development 
life cycle, focusing on the software component 
of a larger system. In practice, and particu-
larly in health care, software development
encompasses more than just the software, 
often stretching into areas such as process re- 
engineering in order to maximize the benefits 
of the software product. Although SDLC 
most literally applies to an in-house develop- 
ment project, all or most of the life cycle 
framework is also relevant to shared develop- 
ment and even purchase of commercial off- 
the-shelf (COTS) software. The following is 
an overview of the phases of the SDLC. 


6.3.1.1 Planning/Analysis 

The software development life cycle begins 
with the formation of a project goal during 
the planning phase. This goal typically derives 
from an organization’s or department’s mis-
sion/vision, focusing on a particular need or
outcome. This is sometimes called project 
conceptualization. Planning includes some 
initial scoping of the project as well as resource 
identification (including funding). It is impor- 
tant to address what is not included in the 
project in order to create appropriate expecta- 
tions for the final product. A detailed analysis 
of current processes and needs of the target 
users is often done. As part of the analysis, 
specific user requirements are gathered. 
Depending on the development process, this 
might include either detailed instructions on 
specific functions and operating parameters 
or more general user stories that explain in 
simple narrative the needs, expected workflow 
and outcomes for the software. It is important 
that users of the system are consulted, as well 
as those in the organization who will imple- 
ment and maintain the software. The decision 
of whether to develop the software in-house, 
partner with a developer, or purchase a ven- 
dor system will likely determine the level of 
detail needed in the requirements. Vendors 
will want very specific requirements that allow 
them to properly scope and price their work. 
The requirements document will usually 
become part of a contract with a vendor and 
will be used to determine if the final product 
meets the agreed specification for the soft-
ware. In-house development can have less
detailed requirements, as the contract to build 
the software is with the organization itself, 
and can allow some evolution of the require- 
ments as the project progresses. However, the 
more flexibility that is allowed and the longer 
changes or enhancements are permitted, the 
higher the likelihood of “scope creep” causing 
schedule and cost overruns. 

Other tasks performed during analysis 
include an examination of existing products 
and potential alternative solutions, and, par- 
ticularly for large projects, a cost/benefit anal- 
ysis. A significant, and frequently overlooked, 
aspect of the planning and analysis phase is to 
determine outcome measures that can be used 
during the life cycle to demonstrate progress 
and evaluate success or failure of the project. 
These measures can be refined and details 
added as the project progresses. The planning 
and analysis phase typically ends when a deci- 
sion to proceed is made, along with at least a 
rough plan of how to implement the next 
steps in the SDLC. If the organization decides 
to purchase a solution, a request for proposals 
(RFP) that contains the requirements docu- 
ment is released to the vendor community. 

The planning and analysis stage of soft- 
ware development is perhaps both the most 
difficult and the most important stage in the 
development lifecycle as it is applied to health 
care. Requirements for software in health care 
are inherently difficult to define for many rea- 
sons. Health care practice is constantly chang- 
ing, and as new therapies or approaches are 
discovered and validated, these new 
advancements can change the way care is 
delivered. In addition, the end-users of health 
care software are comparatively advanced 
relative to other industries. Unlike industries 
where front-line workers may be directed by 
supervisors with more advanced training and 
greater flexibility in decision making, in health 
care the front-line workers are often physi- 
cians, who are often the most highly-trained 
workers in the system (although not necessar- 
ily the most advanced with respect to com- 
puter literacy) and require the greatest 
flexibility for decisions. This flexibility makes 
it difficult to define workflows or even get 
indications of the workflows being followed,
since physicians will not always make explicit
what actions or plans are being pursued. This 
flexibility is important for patient care, 
because it allows front-line clinicians to adapt 
appropriately to different settings, staffing lev- 
els, and specialties. The need for flexibility is 
such that defining requirements for software 
that could reduce flexibility is criticized as 
“cookbook” medicine, constituting a com- 
mon reason for resistance to software adop- 
tion. However, this resistance is not just 
characteristic of software — clinical guidelines 
and other approaches to structured or formal- 
ized care processes can also be criticized, and 
the challenge of applying discovered
knowledge to clinical care processes remains
difficult. 

Over time, however, there have been some 
successful efforts that have defined standard 
requirements for health information software. 
Among the most notable efforts have been in 
EHRs, where organizations have created lists 
of requirements and certified systems that 
match those requirements. The Certification 
Commission for Health Information 
Technology (CCHIT) began in 2004 and 
defined criteria for electronic health records’ 
functionality, interoperability and security 
(Leavitt and Gallagher 2006). Later, the certi- 
fication approach was adopted by the Office 
of the National Coordinator of Health 
Information Technology (ONC) in 2010, 
when a list of EHR functions that were most 
related to “meaningful use” of EHRs was 
established (Blumenthal and Tavenner 2010). 
Such “meaningful use” of EHRs came with 
significant financial incentives administered 
by the Centers for Medicare and Medicaid 
Services (CMS), leading to a rapid increase in 
the adoption of EHRs meeting these require- 
ments (Washington et al. 2017). 


6.3.1.2 Design 

During the Design phase, potential software 
solutions are explored. System architectures 
are examined for their abilities to meet the 
needs stated in the requirements. Data storage 
and interface technologies are assessed for 
appropriate fit. User front-end solutions are 
investigated to assess capabilities for required 
user input and data display functions. Other
details, such as security, performance, and
internationalization are also addressed during 
design. Analysts with domain knowledge in 
the target environment are often employed 
during this phase in order to translate user 
requirements into suitable proposals. Simple 
mock-ups of the proposed system may be 
developed, particularly for user-facing com- 
ponents, in order to validate the design and 
identify potential problems and missing infor- 
mation. Closely related to this, an integrated, 
automated testing architecture, with appro- 
priate testing scripts/procedures, may be 
designed in this phase in order to ensure the 
software being developed meets quality stan- 
dards and is responsive to the requirements. 
The depth and completeness of the design is 
contingent on the software development pro- 
cess, as well as other factors. In some cases, 
the entire design is completed before moving 
on to software coding. In other development 
strategies, a high-level system architecture is 
designed but the details of the software com- 
ponents are delayed until each component or 
component feature is being created. The pros 
and cons of these approaches are discussed 
later in this chapter. For vendor-developed 
systems, the purchasing organization will 
often hold design reviews and demonstrations 
of mock-ups or prototypes with the vendor to 
assess the solutions. In the case of COTS soft- 
ware, the purchasing organization relies on 
the vendor’s system description and reviews 
from third parties, supplemented by system 
demonstrations, to determine the appropri- 
ateness of the design. As with the Analysis 
phase, it is important to include the target 
users and IT operations personnel in the 
design reviews. 

Ideally, the software could be designed 
solely around the care requirements and the 
use of information. However, rarely are the 
clinical requirements of the use case the only 
consideration. In the design phase, other 
requirements are considered, such as the soft- 
ware cost and how it integrates with an exist- 
ing health IT strategy of an organization. 
Resources applied to a development project 
are not available for other potential projects, 
so costs are always influential. The design 
phase must consider various alternatives to
meet the most important requirements, recog-
nizing trade-offs and contingency approaches.
Additional considerations are how the soft- 
ware will support long-term needs, not just 
the immediate requirements that have been 
identified. Clinicians and clinical workflow 
analysts are often the primary participants in 
the requirements analysis stage, whereas 
informaticians are more prominent in the 
design phase. This is because during this latter 
phase the clinical goals and strategies are con- 
sidered together with what can be vastly dif- 
ferent design approaches, and the ability to 
consider the various strengths and weaknesses 
of these different approaches is critical. Often, 
design considerations are between custom 
development, purchasing niche applications, 
or purchasing components of a monolithic 
EHR. The considerations of development
versus COTS software are discussed in more
detail in the Software Acquisition section
(6.3.3.1) below.


6.3.1.3 Development 

Coding of the software is done during the 
Development phase of the SDLC. The soft- 
ware engineers use the requirements and sys- 
tem designs as they program the code. 
Analysts help resolve questions about require- 
ments and designs for the programmers when 
it is unclear how software might address a 
particular feature. The software process 
defines the pace and granularity of the devel- 
opment. In some cases, an entire software 
component or system is developed at once by 
the team. In other cases, the software is bro- 
ken down into logical pieces and the program- 
mers only work on the features that are 
relevant to the piece they are currently work- 
ing on. As software components are com- 
pleted, unit tests are run to confirm the 
component is free of known bugs and pro- 
duces expected outputs or results. 
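
As an illustration of the unit testing mentioned above, the following sketch uses Python's standard `unittest` module to check one small component against a known input and an invalid input. The component itself (`body_mass_index`) and the helper `run_unit_tests` are hypothetical stand-ins chosen for brevity; they are not taken from any particular clinical system.

```python
import unittest


def body_mass_index(weight_kg, height_m):
    """Hypothetical component under test: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m ** 2)


class BodyMassIndexTest(unittest.TestCase):
    def test_known_value(self):
        # Known input with an expected output: 70 / 1.75**2 is about 22.857.
        self.assertAlmostEqual(body_mass_index(70.0, 1.75), 22.857, places=3)

    def test_rejects_nonpositive_height(self):
        # Invalid input must be rejected rather than silently computed.
        with self.assertRaises(ValueError):
            body_mass_index(70.0, 0)


def run_unit_tests():
    """Run the test case and return the result object without exiting the process."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(BodyMassIndexTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_unit_tests()
print("all unit tests passed:", result.wasSuccessful())
```

Tests of this kind are typically rerun automatically whenever the component changes, so that a regression is caught before the Integration phase.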

In health care, development includes cod- 
ing of custom software as well as configura- 
tion of COTS software. Health care practices 
across institutions (and even within larger 
organizations) are so variable that all software 
requires some — often substantial — configura- 
tion. Configuration can range from assigning 
local values to generic variables within the
software, to complete development of docu-
mentation templates, order sets, clinical deci-
sion support rules, reports, and so on. In fact,
configuration can be so considerable that 
institutions may use an internal brand name 
for the software and configuration project 
that is different from the name of the COTS 
software, which represents their local configu- 
ration. This configuration is often done using 
tools built specifically for the commercial soft- 
ware, which facilitate the integration of the 
configuration products into the software 
infrastructure. The tools can be complex, 
requiring significant training for developers. 
Typically, tools work well for basic configura- 
tion and may also have advanced functional- 
ity that can be used to configure more 
complicated functionality. The most intensive 
time investment for configuration is typically 
when the tools do not directly support certain 
configurations, and developers must find 
approaches to creatively adapt the develop- 
ment “around the tools.” 
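
The simplest form of configuration, assigning local values to generic variables, can be pictured as follows. This is only a sketch under assumed names: real COTS products use their own vendor-specific configuration tools, and the settings shown here (`facility_name`, `session_timeout_min`, `order_sets`) and the function `build_site_config` are hypothetical.

```python
# Hypothetical generic settings as they might ship with a COTS product.
VENDOR_DEFAULTS = {
    "facility_name": "UNCONFIGURED",
    "session_timeout_min": 15,
    "order_sets": [],
}

# Hypothetical local values chosen by the implementing organization.
LOCAL_OVERRIDES = {
    "facility_name": "Example General Hospital",
    "session_timeout_min": 10,
    "order_sets": ["adult_admission", "community_pneumonia"],
}


def build_site_config(defaults, overrides):
    """Merge local values over vendor defaults, rejecting unknown setting names."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    merged = dict(defaults)   # copy so the vendor defaults stay untouched
    merged.update(overrides)  # local values win
    return merged


site_config = build_site_config(VENDOR_DEFAULTS, LOCAL_OVERRIDES)
print(site_config["facility_name"], site_config["session_timeout_min"])
```

Rejecting unknown setting names is one design choice among several; it catches misspelled keys early, at the cost of requiring the defaults to enumerate every configurable value.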


6.3.1.4 Integration and Test 


For complex software projects consisting of 
several components and/or interfaces with 
outside systems, an Integration phase in the 
SDLC is employed to tie together the various 
pieces. Some aspects of the software integra- 
tion are likely done during the Development 
phase by simulating or mocking the outputs 
to, and inputs from, other systems. During 
Integration, these connections are finalized. 
Simulations are run to demonstrate functional 
integration of the various system components. 
Once the various components are integrated, 
a thorough testing regimen is conducted in 
order to prove the end-to-end operation of 
the entire software system. Specific test sce- 
narios are run with known inputs and expected 
outputs. This is typically done in a safe, non- 
operational environment in order to avoid 
conflicts and unnecessary risk to production 
environments, although some inbound infor- 
mation from live systems may be used to ver- 
ify scenarios that are difficult to simulate. 
Testing and integration in health care are 
similar to other complex environments, in that 
it can be difficult to create a testing environ-
ment that matches the dynamics of the
real-world setting. Generally, testing is done
around multiple use cases or case studies, 
using data to support the cases. In a produc- 
tion environment, however, there may be data 
and information that do not match the case 
studies, since both people and health care are 
complex. As a result, internally-developed 
applications are often provisionally used in a 
“pilot” phase as part of testing. For COTS 
software, companies may use simulation labo- 
ratories that try to mimic the clinical environ- 
ment, or work with specific health care 
organizations as development and testing 
partners. Later, however, this can lead to chal- 
lenges if data representing the dynamics of 
one organization are not easily transferable, 
and software must be further tested with new 
environments. Software transferability 
between institutions has been demonstrated 
in studies, even for specific applications 
(Hripcsak et al. 1998). Another challenge is 
that with current privacy laws, organizations 
are more reluctant to release data to vendors 
for testing. 
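
The practice of mocking the inputs from an outside system, mentioned at the start of this section, can be illustrated with Python's standard `unittest.mock` module. In this sketch the laboratory interface, its `get_latest_result` method, the patient identifier, and the numeric threshold are all hypothetical test values; only the `Mock` API itself is real.

```python
from unittest.mock import Mock


def latest_potassium_flag(lab_client, patient_id, high_threshold):
    """Fetch the latest potassium value through the lab interface and flag it."""
    value = lab_client.get_latest_result(patient_id, "potassium")
    return "HIGH" if value > high_threshold else "OK"


# During Development, a mock stands in for the real laboratory system.
mock_lab = Mock()
mock_lab.get_latest_result.return_value = 6.1  # known input for the scenario

# Test scenario: known input, expected output.
flag = latest_potassium_flag(mock_lab, "test-patient-001", high_threshold=5.5)
print(flag)

# Confirm the component called the interface exactly as the design specifies.
mock_lab.get_latest_result.assert_called_once_with("test-patient-001", "potassium")
```

During the Integration phase the mock would be replaced by a connection to the actual system in a non-operational environment, and the same scenarios rerun.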


6.3.1.5 Implementation 


Once the software passes integration testing it 
moves to the implementation phase. In this 
phase, the software is installed in the live envi- 
ronment. In preparation for installation, 
server hardware, user devices, network infra- 
structure, changes specific to individual facili- 
ties, etc., may need to be implemented and 
tested as well. In addition, user training will 
be performed in the weeks before the software 
goes live. Any changes to policies and proce- 
dures required by the software will also be 
implemented in the build-up to installation. 
Health care presents interesting consider- 
ations in each phase of the software develop- 
ment cycle, but the challenges have been more 
visible in implementation than any other 
phase. This may be because health IT, while 
intended to facilitate more efficient workflows 
with information, is still disruptive. Disruption 
happens most during implementation, when 
clinicians actually begin using the software, 
and studies have shown that during this time 
clinical productivity does decline (Shekelle 
et al. 2006). If users do not perceive that the 
benefits are sufficient to justify this disrup-
tion, or if the efficiency does not improve
quickly enough after the initial implementa- 
tion, they may choose to disregard the soft- 
ware or even revolt against its implementation. 
There have been prominent examples in bio- 
medical informatics of software implementa- 
tions failing during implementation (Bates 
2006; Smelcer et al. 2009; Sullivan 2017), and 
even studies demonstrating harm (Han et al. 
2005). Because of these risks, health IT pro- 
fessionals need to be flexible in implementa- 
tion, and adapt the implementation strategies 
to how the system is adopted. Users have been 
shown to use health IT software in different 
ways for different benefits, and may need 
incentives or prodding to advance to different 
levels of use. 


6.3.1.6 Verification and Validation 

To ensure that the software satisfies the origi- 
nal requirements for the system and meets the 
need of the organization, a formal verification 
and validation of the software is performed. 
The implementing organization verifies that 
the software has the features and performs all 
the functions specified in the requirements 
document. The software is also validated to 
show that it performs according to specified 
operational requirements, that it produces 
valid outputs, and that it can be operated in a 
safe manner. For purchased software, the ver- 
ification and validation phase is used by the 
purchasing organization in order to officially 
accept the software. 

Since clinicians often use software at dif- 
ferent levels or in different ways, tracking 
patterns of use can be an important approach 
for verification and validation of software in 
health care. Additionally, because they have 
experience working in complicated environ- 
ments, users can be good at identifying incon- 
sistencies in data or software functions. Two 
approaches that have been used and can be 
successful for validation are monitoring use, 
and facilitating user feedback. 
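
Monitoring use can start from something as simple as summarizing an event log. The sketch below assumes a hypothetical log of (user, feature) pairs; the user and feature names and both function names are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical usage log: one (user, feature) pair per recorded event.
EVENTS = [
    ("dr_a", "order_entry"),
    ("dr_a", "order_entry"),
    ("dr_a", "decision_support"),
    ("dr_b", "results_review"),
    ("dr_b", "results_review"),
    ("nurse_c", "order_entry"),
]


def usage_by_feature(events):
    """Count how often each feature was used across all users."""
    return Counter(feature for _, feature in events)


def features_per_user(events):
    """Map each user to the set of features that user has actually used."""
    per_user = defaultdict(set)
    for user, feature in events:
        per_user[user].add(feature)
    return dict(per_user)


print(usage_by_feature(EVENTS).most_common())
print(features_per_user(EVENTS))
```

Summaries like these show which functions are used and by whom, which helps identify users working at different levels of the software.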


6.3.1.7 Operations and Maintenance 


Software eventually enters an operations and 
maintenance (O&M) phase where it is being 
regularly used to support the operational 
needs of the organization. During this phase,
an O&M team will ensure that the software is
operating as desired and will be fielding the 
support needs of the users. Updates may need 
to be installed as new versions of the software 
are released. This may require new integration 
and testing, implementation, and verification 
and validation steps. Ongoing training will be 
required for new users and system updates. 
The O&M team may conduct regular security 
reviews of the system and its use. Data reposi- 
tories and software interfaces will be moni- 
tored for proper operation and continued 
information validity. Software bugs and fea- 
ture enhancement requests will be collected. 
These may drive an entire new development 
life cycle as new requirements persuade an 
organization to explore significant upgrades 
to its current software or even an entirely new 
system. 

Maintenance is a demanding task in health 
information software. It involves correcting 
errors; adapting configurations and software 
to growth, new standards, and new regula- 
tions; and linking to other information 
sources. Maintenance can cost more than
double the initial acquisition costs,
making it a substantial consideration that 
should affect software design. COTS suppliers 
often provide maintenance services for 
15-30% of the purchase price annually, but 
custom development or configuration mainte- 
nance must be supported by the purchasing 
organization. If the software is not main- 
tained, it can quickly become unusable in a 
health care setting. Indeed, optimization of 
COTS EHRs is a central and ongoing focus 
of applied clinical informatics, and this is 
likely to continue for the foreseeable future. 
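
The two figures above can be related by simple arithmetic. The sketch below assumes, purely for illustration, a flat annual maintenance fee calculated on the original purchase price, with no inflation or price escalation; under that assumption the purchase price cancels out and only the percentage matters. The function name is hypothetical.

```python
def years_until_maintenance_exceeds(annual_rate_percent, multiple=2):
    """Smallest whole number of years after which cumulative maintenance fees
    exceed `multiple` times the purchase price, given a flat annual fee."""
    years = 0
    # Integer arithmetic in percent avoids floating-point rounding issues.
    while annual_rate_percent * years <= multiple * 100:
        years += 1
    return years


print(years_until_maintenance_exceeds(15))  # 14 (14 x 15% = 210% of price)
print(years_until_maintenance_exceeds(30))  # 7  (7 x 30% = 210% of price)
```

Under these simplifying assumptions, vendor maintenance fees alone pass twice the purchase price within 7 to 14 years, before any locally supported custom development or configuration is counted.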


6.3.1.8 Evaluation 


An important enhancement to the SDLC sug- 
gested by Thompson et al. (1999) is the inclu- 
sion of an evaluation process during each of 
the phases of the life cycle. The evaluation is 
influenced by risk factors that may affect a par- 
ticular SDLC segment. An organization might 
perform formative evaluations during each 
phase, depending on specific needs, in order to 
assess the inputs, processes and resources 
employed during development. During
Verification and Validation or O&M, a sum-
mative evaluation may be performed to assess
the outcome effects, organizational impact, 
and cost-benefit of the software solution. 

Health IT is considered an intervention 
into the health care delivery system, so evalu- 
ations have been done and published as com- 
parative studies in the clinical literature (Bates 
et al. 1998; Campanella et al. 2016; Evans 
et al. 1998; Hunt et al. 1998; Jones et al. 2014). 
These evaluations, and syntheses of multiple 
studies, have identified areas of impact and 
areas where the effect of health IT software is 
inconsistent. Researchers have also noted that 
most of these studies have occurred in institu- 
tions where software was developed internally, 
with disproportionate under-representation 
of COTS software systems in evaluations, 
especially considering that most health care 
institutions use COTS rather than internal 
development (Chaudhry et al. 2006). It is 
hoped that the existing evaluations can be a 
model for software evaluations of COTS, to 
clarify their impact on care. 


6.3.2 Software Development 
Models 


Different software development processes or 
methods can be used in an SDLC. The soft- 
ware development process describes the day- 
to-day methodology followed by the 
development team, while the life cycle 
describes a higher-level view that encompasses 
aspects that take place well before code is ever 
written and after an application is in use. The 
following are two of the most common exam- 
ples of different development processes in 
clinical information systems development. 


6.3.2.1 Waterfall Model 

The Waterfall model of software development 
suggests that each step in the process happens 
sequentially, as shown in Fig. 6.1. The term
“Waterfall” refers to the analogy of water cas- 
cading downward in stages. A central concept 
of the Waterfall methodology is to solidify all 
of the requirements, establish complete func-
tional specifications, and create the final soft-
ware design prior to performing programming
tasks. This concept is referred to as “Big 
Design Up Front,” and reflects the thinking 
that time spent early-on making sure require- 
ments and design are correct saves consider- 
able time and effort later. Steve McConnell, 
an expert in software development, estimated 
that “...a requirements defect that is left unde- 
tected until construction or maintenance will 
cost 50-200 times as much to fix as it would 
have cost to fix at requirements time” 
(McConnell 1996). 

The waterfall model provides a structured, 
linear approach that is easy to understand. 
Application of the model is best suited to soft- 
ware projects with stable requirements that 
can be completely designed in advance. In 
practice, it may not be possible to create a 
complete design for software a priori. 
Requirements and design specifications can 
change even late in the development process. 
Clients may not know exactly what require- 
ments they need before reviewing a working 
prototype. In other cases, software developers 
may identify problems during the implemen- 
tation that necessitate reworking the design or 
modifying the requirements. 


Fig. 6.1 The Waterfall model of software development


6.3.2.2 Agile Models

In contrast to the Waterfall model, modern
software development approaches have
attempted to provide more flexibility, particu-
larly in terms of involving the customer
throughout the process. In 2001, a group of
software developers published the Manifesto
for Agile Software Development, which
emphasizes iterative, incremental develop-
ment and welcomes changes to software
requirements even late in the development
process (Beck et al. 2001).

Agile development eschews long-term 
planning in favor of short iterations that usu- 
ally last from 1 to 4 weeks. During each itera- 
tion, a small collaborative team (typically 
5-10 people) conducts planning, requirements 
analysis, design, coding, unit testing, and 
acceptance testing activities with direct 
involvement of a customer representative. 
Multiple iterations are required to release a 
product, and larger development efforts 
involve several small teams working toward a 
common goal. The agile method is value- 
driven, meaning that customers set priorities 
at the beginning of each iteration based on 
perceived business value. 
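
One way to picture value-driven iteration planning is the sketch below. It is a simple greedy heuristic written for illustration, not a procedure prescribed by any agile method: the `Story` fields, the point values, and the function `plan_iteration` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    business_value: int  # priority set by the customer representative
    effort_points: int   # effort estimated by the development team


def plan_iteration(backlog, capacity_points):
    """Pick stories in descending business value until capacity is used up."""
    selected = []
    remaining = capacity_points
    for story in sorted(backlog, key=lambda s: s.business_value, reverse=True):
        if story.effort_points <= remaining:
            selected.append(story)
            remaining -= story.effort_points
    return selected


backlog = [
    Story("medication list view", business_value=8, effort_points=5),
    Story("allergy alert", business_value=9, effort_points=8),
    Story("print summary", business_value=3, effort_points=2),
    Story("audit report", business_value=5, effort_points=3),
]
for story in plan_iteration(backlog, capacity_points=10):
    print(story.title)
```

Because priorities are reset at the start of each iteration, the same backlog can yield a different selection next time if the customer's view of business value changes.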

Agile methods emphasize face-to-face 
communication over written documents. 
Frequent communication exposes problems 
as they arise during the development process. 
Typically, a formal meeting is held each morn- 
ing during which team members report to 
each other what they did the previous day, 
what they intend to do today, and what their 
roadblocks are. The brief meeting, sometimes 
called a “stand-up,” “scrum,” or “huddle,” 
usually lasts 5-15 minutes, and includes the
development team, customer representatives
and other stakeholders. A common imple- 
mentation of agile development is Extreme 
Programming. 


6.3.3 Software Engineering 


The software development life cycle can be 
used to actually create the software, and 
understanding it is critical for those develop- 
ing software in biomedical informatics. 
However, as the field has expanded, software 
has matured to the point that it is developed 
by and available from commercial companies, 
so that software development has become less 
of a concern for most of the field. A more 
important consideration in biomedical infor- 
matics has been the strategy of whether to 
develop and how to develop. Software ven- 
dors can spread development costs over mul- 
tiple organizations, rather than one 
organization having to fund the full develop- 
ment, which can make purchasing software 
economically advantageous. On the other 
hand, the core requirements for software con- 
tinue to change, and sometimes organizations 
need specific capabilities that are not met by 
existing vendor software options. In addition 
to software development, informaticians often 
participate in software acquisition, as well as 
in subsequent enhancements to acquired 
software. 


6.3.3.1 Software Acquisition 

In health care information technology appli- 
cations, a significant question is whether to 
develop the software internally or purchase an 
existing system from a vendor. Whether to 
“build vs. buy” is a core decision in planning 
and implementation. 

Considerations for purchasing software 
begin with how the software will be selected. 
Software can be a component of a monolithic 
vendor system, be a secondary application 
sold by the same vendor as the EHR, or be 
“best-of-breed,” meaning the software that 
meets the requirements best, independent of 
its architecture or source. Another consider-
ation is whether the software needs to inte-
grate with other applications. Some specialty
applications require minimal data sharing
with other software, while other applications 
must be tightly integrated with existing sys- 
tems to achieve a benefit. Two examples are a 
picture archiving and communications system 
(PACS) and a medication reconciliation tool. 
Perhaps the most important requirement for a 
PACS is to provide access to images for a radi- 
ologist, who can then “read” the image and 
document a report which can be transferred 
into the EHR as a static document. On the 
other hand, a medication reconciliation tool 
may need substantial integration with medica- 
tion ordering and administration modules in 
an EHR to support workflows of the care 
team. Another consideration, related to inte- 
gration, is the storage mechanism. A stand- 
alone system will likely have a separate 
database, while an integrated system may be 
able to store and retrieve data using a com- 
mon data repository. User interface deploy- 
ment is also important, and possibilities 
include Web-based clients, thin clients (e.g., 
Citrix), and locally-installed thick-client 
applications. Greater functionality may exist 
with a thick-client application, but Web-based 
and thin-client tools are easier to update and 
distribute to users. Finally, security and pri- 
vacy considerations are critical in health care, 
and can influence both the requirements and 
design of software. Security considerations 
can include whether user authentication is 
shared with other applications, or what data 
access events are audited for identifying 
potential security threats. 

Most healthcare delivery organizations 
today use commercial — as opposed to locally 
developed — EHRs. But in reality, there is still 
a mix between building and buying health 
information technology. As mentioned, orga- 
nizations using commercial systems require 
substantial local configuration that ranges 
from application-specific parameter configu- 
ration to coordinating multiple software 
applications to link together. Even when there 
is a commitment to limit local configuration, 
there may still be separate systems, local con- 
figuration or even development with data 
warehousing and analytics solutions for the 
EHR data. There is no single solution, com-
mercial or internally-developed, that meets all
the health information needs of most health
care organizations, and many implementa- 
tions involve a mixture of software from mul- 
tiple vendors. While there can be advantages 
to allowing best-of-breed, a current trend 
among organizations is to consolidate as 
much functionality as possible with one ven- 
dor. Another observed trend is for organiza- 
tions that build systems to consider purchasing 
COTS, due to the substantial maintenance 
costs associated with in-house development 
and the increased functionality often available 
with vendor solutions. At present, virtually all 
health care organizations that utilize an EHR 
in the United States use a COTS solution or 
are in the process of migrating to such a 
solution. 

Usually, if vendor software exists, it is 
more cost-efficient to purchase the software 
than build comparable capabilities internally 
for use at a single organization. This is because 
the vendor can spread development costs over 
multiple organizations, rather than one orga- 
nization having to fund the full development. 
In fact, few organizations have the existing 
infrastructure and personnel to consider 
internal development for anything other than 
small applications. However, those few insti- 
tutions with developed health information 
systems are notable for the success of their 
software. So while the costs may be higher for 
internal development, the benefits may also be 
higher. Furthermore, such solutions may be 
potentially licensed to other organizations, 
thereby spreading the cost of development 
across multiple organizations. Still, these 
institutions have invested many years to build 
an infrastructure that makes these benefits 
possible, and it is unlikely that many organiza- 
tions can afford the time and resource invest- 
ment to follow the same model. Even within 
historically internally-developing organiza- 
tions, buying systems that can integrate with 
the existing system is oftentimes more efficient 
than development. An appropriate general 
guide is therefore, “Buy where you can, build 
where you can’t.” 

Once an organization decides to acquire a 
health information system, there are many 
other decisions beyond whether to build or 
buy. In fact, since the costs in time and money
are oftentimes prohibitive for internal
development, the decision of whether to build
is typically the easiest decision to make for a
large health information system such as the
EHR. Combined with the Meaningful Use EHR
Incentive Program, this has pushed adoption
of at least basic EHRs in the United States
above 90% (see Figs. 6.2, 6.3 and 6.4).¹

Once a decision to purchase a commercial 
system is made, the next decision is what sys- 
tem to purchase. There is a wide variation in 
the functionalities between different EHR 
systems, even though certification efforts have 
defined basic functions that each system 
should have (see Fig. 6.3). Even systems
with the same certified functions may
approach the functions so differently that
some implementations will be incongruent with
an organization.² Key factors an organization
should consider when choosing a system 
include (a) the core functionality of the soft- 
ware, including integration with other sys- 
tems, (b) total system cost, (c) the service 
experience of other customers, and (d) the 
system’s certification status. Some organiza- 
tions have performed systematic reviews of 
different commercial software offerings that 
can be a helpful start to identify possible ven- 
dors and understand variations between sys- 
tems. For example, KLAS Research publishes 
periodic assessments of both software func- 
tions and vendor performance that can be 
used to identify potential software products. 
However, since systems are complex, it is 
important to meet with and discuss experiences
with actual organizations that have used
the software. This is typically done through 
site visits to existing customer organizations. 
It is also common for organizations to make a 
broad request of vendors for proposals to 
address a specific software need, especially
when the needs are not standard components
of EHR software.

1 https://v.healthit.gov/quickstats/pages/physician-ehr-adoption-trends.php and https://dashboard.healthit.gov/quickstats/pages/FIG-Hospital-EHR-Adoption.php, from https://dashboard.healthit.gov/quickstats/quickstats.php

2 https://dashboard.healthit.gov/quickstats/pages/FIG-Vendors-of-EHRs-to-Participating-Hospitals.php (last accessed June 3, 2020).

[Figure: line chart of EHR adoption (0% to 100%) by year, 2004 to 2017; series: Any EHR, Basic EHR, Certified EHR]

Fig. 6.2 Office-based physician EHR adoption, from ONC (2019a)

Fig. 6.3 Non-federal acute care hospital EHR adoption, from ONC (2017b)

After a commercial product is selected, an 
organization must then choose how extensive 
the software will be. EHR companies typically 
have a core EHR system, with additional 
modules that have either been developed or 
acquired and integrated into their system. The 
set of modules used by each institution varies. 
One organization may use the core EHR sys- 
tem and accompanying modules for certain 
specialties, such as internal medicine and
family practice, while choosing to purchase
separate best-of-breed software for other
specialties, like obstetrics/gynecology and 
emergency medicine, even when the core EHR 
vendor has functional modules for those 
areas. Another organization may choose to 
purchase and implement all specialty systems 
offered from the core EHR vendor, and only 
purchase other software if a similar module is 
not available from the vendor. These decisions 
also must be made for all ancillary systems, 
including laboratory, pharmacy, radiology, 
etc. This is both a pre-implementation 
decision and a long-term strategy. Once the 
EHR is implemented, many specialties that 
were not included in the initial implementa- 
tion plan may request software and data inte- 
gration, depending on the success of the EHR 
implementation. 

Organizations that choose components from
multiple vendors, to any degree, will need to
address how to integrate the components to
minimize disruption to the users’ workflow.
Modules can be integrated at the level of user
context (user authentication credentials are
maintained), at the level of the application
view (one application is viewable as a
component within another application), or at
the level of data sharing (data are exchanged
between the applications). If components are
not integrated, a user must access each
application separately: opening each
application, logging in to each, and selecting
the patient within each. When applications are
integrated at the level of user context, a user
still moves between applications, but the user
and patient context are shared. This “single
sign-on” approach alleviates one of the main
barriers to the user by facilitating the login
and patient selection, while retaining all the
functionality of each system.

A deeper level of integration is at the 
application view. In this case, a primary appli- 
cation provides a portal that views another 
application; the second application shares 
user and patient context and is accessible 
through the user’s main workflow system. The 
second application may use data from the pri- 
mary application and/or other data sources. 
Rapidly expanding in adoption for this type 
of integration is the HL7 Substitutable 
Medical Applications and Reusable 
Technologies (SMART) framework, in par- 
ticular in combination with HL7 Fast 
Healthcare Interoperability Resources 
(FHIR) for data interchange. This approach, 
known as SMART on FHIR, is enabling an 
ecosystem wherein applications developed by 
health care organizations or third-party ven- 
dors can be seamlessly integrated within the 
EHR (Kawamoto et al. 2019; Mandel et al. 
2016). 
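To make the application-view level of integration more concrete, the sketch below shows the first step of a SMART on FHIR "EHR launch": the EHR opens the external application with a launch token, and the application asks the EHR's authorization server for access in the same user and patient context. The endpoint URLs, client identifier, and scope string are hypothetical placeholders, and the sketch omits endpoint discovery, PKCE, and the token exchange that a real application would also need.

```python
from urllib.parse import urlencode


def build_authorize_url(authorize_endpoint, client_id, redirect_uri,
                        iss, launch, state,
                        scope="launch openid fhirUser patient/Observation.read"):
    """Build the authorization request a SMART app sends after an EHR launch.

    `iss` (the EHR's FHIR base URL) and `launch` (an opaque token) are the
    two parameters the EHR passes to the app when it opens the app.
    """
    params = {
        "response_type": "code",   # authorization-code flow
        "client_id": client_id,    # assigned when the app is registered
        "redirect_uri": redirect_uri,
        "scope": scope,            # what the app is asking to do
        "state": state,            # opaque value echoed back to the app
        "aud": iss,                # FHIR server the token is intended for
        "launch": launch,          # ties the request to the EHR session
    }
    return f"{authorize_endpoint}?{urlencode(params)}"


# All values below are made up for illustration.
EXAMPLE_URL = build_authorize_url(
    authorize_endpoint="https://ehr.example.org/oauth2/authorize",
    client_id="lab-trend-app",
    redirect_uri="https://app.example.org/callback",
    iss="https://ehr.example.org/fhir",
    launch="xyz123",
    state="af0ifjsldkj",
)
```

Because the launch token carries the user and patient context, the clinician does not have to log in again or re-select the patient inside the embedded application.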

The deepest level of integration is where
data elements from one system are also stored 
in the other system. With this approach, one 
system is determined to be the main reposi- 
tory, and data from the other systems are 
automatically stored into the repository. This 
approach has the advantage of the most com- 
plete use of data, e.g., decision support logic 
can use data from multiple systems, which can 
be more accurate. The disadvantage is that the 
integration can be expensive, requiring new 
interfaces for each integrated system. 

Another and often overlooked consider- 
ation of EHR software modules is data ana- 
lytics capabilities, usually discussed in 
conjunction with a data warehouse. EHR sys- 
tems generally include reporting functional- 
ity, where specific reports can be configured to 
summarize and display data stored in the sys- 
tem. However, these systems often do not 
facilitate ad hoc data extracts that are com- 
monly needed for more complicated data 
analysis. Additionally, if modules from multi- 
ple software vendors are used, the data report- 
ing functions will be limited according to how 
well data are integrated. One typical approach
is to use a separate data warehouse and analysis
system, with functions to create ad hoc
reports, that can combine data from multiple 
systems. Data integration with warehouses is 
less expensive than with repositories, because 
the data do not need to be synchronized. 
Instead, data can be extracted in batches from 
source systems, transformed to the warehouse 
data model, and then loaded into the ware- 
house at periodic intervals. The greatest cost 
of integration is the data transformation, but 
this transformation is similar to what is 
required when receiving data through a real- 
time interface. 
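A minimal sketch of the batch extract, transform, and load cycle described above, using Python's built-in sqlite3 module with in-memory databases standing in for a source system and a warehouse. The table names, column names, and the mapping from local test names to warehouse codes are all hypothetical.

```python
import sqlite3

# Hypothetical mapping from a source system's local test names to the
# codes used in the (equally hypothetical) warehouse data model.
LOCAL_TO_WAREHOUSE_CODE = {"CREAT": "creatinine", "K": "potassium"}


def extract(source, since):
    """Pull one batch of rows recorded at or after `since` (ISO date string)."""
    cur = source.execute(
        "SELECT pat_id, test_name, result, result_time "
        "FROM lab_results WHERE result_time >= ?",
        (since,),
    )
    return cur.fetchall()


def transform(rows):
    """Map source rows onto the warehouse model; skip tests with no mapping."""
    out = []
    for pat_id, test_name, result, result_time in rows:
        code = LOCAL_TO_WAREHOUSE_CODE.get(test_name)
        if code is None:
            continue
        out.append((str(pat_id), code, float(result), result_time))
    return out


def load(warehouse, records):
    """Insert transformed records into the warehouse fact table."""
    warehouse.executemany(
        "INSERT INTO fact_lab (patient_key, test_code, value, observed_at) "
        "VALUES (?, ?, ?, ?)",
        records,
    )
    warehouse.commit()
    return len(records)


def run_batch(source, warehouse, since):
    """One periodic batch: extract, transform, then load."""
    return load(warehouse, transform(extract(source, since)))


def demo():
    """Build toy source and warehouse databases and run a single batch."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE lab_results (pat_id, test_name, result, result_time)"
    )
    source.executemany(
        "INSERT INTO lab_results VALUES (?, ?, ?, ?)",
        [
            (1, "CREAT", "1.8", "2020-06-02T08:00"),
            (1, "K", "4.1", "2020-06-02T08:00"),
            (2, "XYZ", "7", "2020-06-02T09:00"),       # no mapping: skipped
            (2, "CREAT", "0.9", "2020-05-01T09:00"),   # before cutoff: not extracted
        ],
    )
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE fact_lab (patient_key, test_code, value, observed_at)"
    )
    loaded = run_batch(source, warehouse, since="2020-06-01")
    return loaded, warehouse
```

Nothing here requires the source and warehouse to stay synchronized between batches, which is the property that makes warehouse integration cheaper; the mapping table in the transform step is where most of the real effort would go.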

The Meaningful Use program, which has 
evolved to become the Promoting 
Interoperability program, has greatly influ- 
enced the systems that are installed by an 
institution. Initially, the ONC created a list of
important EHR functions. To receive the
incentives, hospitals and physician practices
were required to use a “certified” system (i.e.,
one that has demonstrated it provides those
functions) and to meet other criteria showing
that the functions were used in clinical care
(Washington et al. 2017). As a result, health
care organizations rapidly implemented EHRs
that were certified and best fulfilled
regulatory requirements.


6.3.3.2 Case Studies of EHR Adoption 


Consider the following case studies of institu- 
tions adopting EHR systems. All examples 
are fictional, but reflect the complexity of the 
issues with EHR software. 

Best-Care Medical Center had been using 
information systems for many years, dating 
back to when some researchers in the cardiology 
department built a small system to integrate 
data from the purchased laboratory and phar- 
macy information systems. Eventually, the 
infection control group for the hospital began 
using the system, and contributed efforts to 
expand its functionality. Other departments 
began developing decision support rules, and the 
system continued to grow. Eventually, the insti- 
tution made a commitment to redevelop the 
infrastructure to support a much larger group 
of users and functions, and named it A-Chart. 
Satisfaction with the system was high where it 
had been initially developed, and with other
related specialties. However, over time there
was disproportionate development in these 
areas, and clinicians in other specialties com- 
plained about the rudimentary functionality, 
especially when compared to existing vendor 
systems for their specialty. As a result, the 
organization decided to purchase a new vendor 
system. This made the other specialties happy, 
but was a big concern to the groups that had 
been using A-Chart for years. These clinicians 
feared that they would have to reconfigure their 
complicated decision support rules with a new 
system, or worse, that functionality would no 
longer be supported. To alleviate concerns, rep- 
resentatives from each department were asked 
to participate in both drafting a Request for 
Proposals and then reviewing the proposals 
from four different vendors. Many clinicians 
liked System X, but in the end the hospital 
chose System Y, which seemed to have most of 
the same capabilities but was perceived to be 
more affordable than System X. However, 
System Y did not include a laboratory system, 
so the medical center purchased a separate lab- 
oratory system and built interfaces to connect it 
with the core EHR. 

Patients’ Choice is an integrated delivery 
system with a long history of EHR use. Years
ago, it existed as two separate systems of hospi- 
tals and clinics. Shortly before the merger of 
these systems, the hospitals and clinics pur- 
chased separate EHRs, InPatSys and CliniCare. 
At the time of the merger, the institution felt 
that each would be best served by a best-of- 
breed system, to support the different work- 
flows, and there was no single system that both 
sides of the organization could agree to use. 
Years later, as Patients’ Choice began to inte- 
grate care between the hospitals and clinics, the 
clinicians and administrators became increas- 
ingly frustrated at how different the InPatSys 
and CliniCare systems were, and that they had 
to use two separate systems to care for the same 
patients. A team was formed to evaluate the 
options, and the CliniCare system was eventu- 
ally replaced by OutPatSys, the outpatient ver- 
sion of InPatSys. To prevent losing data as they 
moved from one system to the other, the 
Patients’ Choice IT department prepared the 
OutPatSys system by loading existing laboratory
results and vital sign measurements from
CliniCare. A SMART on FHIR application
provided an integrated view of historical
CliniCare data inside the OutPatSys system.

Hometown Community Hospital histori- 
cally used various niche information systems, 
but no EHR. With the availability of 
Meaningful Use incentives, the hospital decided 
to acquire a commercial EHR. A leadership 
team visited six hospitals to investigate how 
various EHRs were used. Ultimately, the hospi- 
tal made the decision to purchase eCompu- 
Chart, because it was highly rated and seemed 
best adapted to their needs. Hometown hired a 
new Chief Information Officer who had recently 
implemented eCompuChart at a community 
hospital in a neighboring state. They also pro- 
moted Dr. Jones, who had recently moved from 
another hospital that had also used eCompu- 
Chart, to Chief Medical Information Officer 
(CMIO). The CIO and CMIO negotiated a 
contract with DigiHealth, a consulting com- 
pany with experience in implementing EHRs, 
to plan and coordinate the implementation. 
Among other recommendations from 
DigiHealth, most existing systems were 
replaced with modules from eCompuChart to 
simplify maintenance. 

In practice, organizations may not adopt a
complete “build” or a complete “buy” strategy.
EHR vendors have advanced considerably in
their ability to create systems that meet
common needs in health care. Still, no system
exists to date that can fully address all
information needs for an organization, in part
because the information needs expand as
more data and new technologies become
available. Additionally, EHR strategies
become malleable over time, as commercial
software capabilities increase and data become
more consistent. As indicated through some
of the examples above, organizational
strategies may change over time to adapt to
these capabilities and needs. Options for
health care organizations are also expanding
with the emergence of the EHR as a platform:
HL7 FHIR data interfaces can be used to
read data from, and in some cases write data
back to, the EHR; HL7 SMART is available to
integrate external applications into the native
EHR user interface; and more recently, a
framework known as HL7 CDS Hooks is
available to interface externally developed
alerts and reminders into the EHR.³
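As an illustration of reading data through a FHIR interface, the sketch below builds a search request for a patient's creatinine results and pulls the values out of the returned Bundle. The base URL, patient identifier, and sample Bundle are invented. The search parameters (patient, code, _sort) and the LOINC code 2160-0 for serum creatinine are standard, but a given EHR may support only a subset of FHIR search features and will require an access token, which is not shown.

```python
from urllib.parse import urlencode


def observation_search_url(base_url, patient_id, loinc_code):
    """Build a FHIR search URL for one patient's results for one LOINC code."""
    query = urlencode({
        "patient": patient_id,
        "code": f"http://loinc.org|{loinc_code}",  # system|code token
        "_sort": "date",
    })
    return f"{base_url.rstrip('/')}/Observation?{query}"


def values_from_bundle(bundle):
    """Return (date, value, unit) tuples from a searchset Bundle, oldest first."""
    points = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Observation":
            continue
        quantity = resource.get("valueQuantity")
        if not quantity:
            continue
        points.append((
            resource.get("effectiveDateTime"),
            quantity["value"],
            quantity.get("unit"),
        ))
    return sorted(points, key=lambda p: p[0] or "")


# Hand-written sample standing in for a server response.
SAMPLE_BUNDLE = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {
            "resourceType": "Observation",
            "effectiveDateTime": "2020-06-02T08:00:00Z",
            "valueQuantity": {"value": 1.8, "unit": "mg/dL"},
        }},
        {"resource": {
            "resourceType": "Observation",
            "effectiveDateTime": "2019-11-15T07:30:00Z",
            "valueQuantity": {"value": 1.1, "unit": "mg/dL"},
        }},
    ],
}

EXAMPLE_SEARCH = observation_search_url(
    "https://ehr.example.org/fhir/", "12345", "2160-0"
)
```

The same two functions would work unchanged against any server that returns conformant Observation resources, which is the sense in which such an application is independent of the EHR vendor.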

One consideration that is not always stated 
in the software selection process, but is signifi- 
cant in its influence over the decision, is how 
an organization will pay for the application. 
In organizations where software purchases are 
requested from the information technology 
department and budget, overall maintenance 
costs are considered more prominently, and 
software that integrates with and is a compo- 
nent of the overall EHR vendor offering is 
often selected. However, if a clinical depart- 
ment has direct control over their spending for 
the software, functionality may become a 
more focal concern. An additional case study 
illustrates this situation. 

Downtown Hospital recently decided to pur- 
chase eCompuChart as the centerpiece of its 
overall clinical information system strategy. 
eCompuChart has award-winning modules for 
the emergency department and intensive care 
units. However, there are strong complaints 
about its capabilities for labor and delivery 
management and radiology. After considering 
capabilities of best-of-breed options and their 
ability to integrate with eCompuChart, 
Downtown Hospital eventually made a split 
decision. The labor and delivery module for 
eCompuChart was purchased because other 
systems with more elaborate functionality could 
not integrate data as well with the overall 
EHR. On the other hand, a separate best-of- 
breed system was purchased for radiology, 
because interfaces between the systems were 
seen as an acceptable solution for integrating 
data. 


6.3.3.3 Enhancing Acquired Software

Although most institutions will choose to
acquire a system rather than building it from 
scratch, software engineering is still required 
to make the systems function effectively. This 
involves more than just installing and config- 
uring the software to the local environment. 
There is still a significant need for software 
development in implementing COTS, because
(1) applications must be integrated with existing
systems, and (2) leading healthcare institutions
are increasingly developing custom
applications, such as SMART on FHIR
applications, that supplement commercial systems.

3 https://cds-hooks.org/ (last accessed June 3, 2020).


6.3.3.4 Integration with Existing Systems

In all but the most basic health care
information technology environments, multiple
software applications are used for treatment,
payment, and operations purposes. A partial
list of applications that might be used in a
hospital environment is shown in Table 6.1.
To facilitate the sharing of information
among various software applications, standards
have emerged for exchanging messages and
defining clinical terminology (see Chap. 8).
Message exchange between different software
applications enables the following scenario:

1. A patient is admitted to the hospital. A
registration clerk uses the bed management
system to assign the patient’s location
and attending physician of record.

2. The physician orders a set of routine blood
tests for the patient in the inpatient EHR
computerized order entry module.

3. The request for blood work is sent
electronically to the laboratory information
system, where the blood specimen is
matched to the patient using a barcode.

4. The results of the laboratory tests are sent
to the results review module of the EHR.

Table 6.1 Partial list of software applications that may be used in a hospital setting

System                                                      Primary Users
Inpatient EHR (Results Review, Order Entry, Documentation)  Physicians, nurses, allied health professionals
Pharmacy Information System                                 Pharmacists, pharmacy technicians
Laboratory Information System                               Laboratory technicians, phlebotomists
Radiology Information System                                Radiologists, radiology technicians
Pathology Information System                                Pathologists
Registration/Bed Management                                 Registration staff
Hospital Billing System                                     Medical coders
Professional Services Billing System                        Physicians, medical coders
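The laboratory result in step 4 of the scenario is commonly carried as an HL7 version 2 message. The sketch below, which assumes a simplified ORU (observation result) message with made-up identifiers and a made-up local order code, shows how a receiving system might pull the patient identifier and result values out of the pipe-delimited segments. In practice an interface engine handles this work, along with escape sequences, repeating fields, and acknowledgments that the sketch ignores.

```python
def parse_segments(message):
    """Split an HL7 v2 message into {segment_id: [field lists]}."""
    segments = {}
    for line in message.strip().split("\r"):   # segments end with carriage returns
        fields = line.split("|")               # fields are pipe-delimited
        segments.setdefault(fields[0], []).append(fields)
    return segments


def lab_results(message):
    """Extract one dict per OBX (observation) segment."""
    segments = parse_segments(message)
    pid = segments["PID"][0]
    patient_id = pid[3].split("^")[0]          # PID-3: patient identifier list
    results = []
    for obx in segments.get("OBX", []):
        code, name = obx[3].split("^")[:2]     # OBX-3: observation identifier
        results.append({
            "patient_id": patient_id,
            "code": code,
            "name": name,
            "value": obx[5],                   # OBX-5: observation value
            "units": obx[6],                   # OBX-6: units
            "flag": obx[8],                    # OBX-8: abnormal flag
        })
    return results


# Hypothetical message: identifiers and the order code are invented.
SAMPLE_ORU = "\r".join([
    "MSH|^~\\&|LIS|LAB|EHR|HOSP|202006030830||ORU^R01|MSG0001|P|2.5.1",
    "PID|1||12345^^^HOSP^MR||DOE^JANE",
    "OBR|1|ORD789||CHEM^Routine chemistry^L",
    "OBX|1|NM|2160-0^Creatinine^LN||1.8|mg/dL|0.6-1.2|H|||F",
])
```

Calling `lab_results(SAMPLE_ORU)` yields a single record for patient 12345 with a creatinine value of 1.8 mg/dL flagged as high, which is the kind of structured result the EHR's results review module would then store and display.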

Message exchange is an effective means 
of integrating disparate software applica- 
tions in healthcare when the users rely pri- 
marily on a single “workflow system” (e.g., 
a physician uses the inpatient EHR and a 
laboratory technician uses the LIS). 
Because message exchange is handled by a 
sophisticated “interface engine” (see
Chap. 8), little software development, in
the traditional sense, is typically required. 
When a user accesses multiple workflow 
systems to perform a task, message 
exchange may not be sufficient and a 
deeper level of integration may be required. 
For example, consider the following addi- 
tion to the previously described scenario: 

5. The physician reviews the patient’s blood 
work and notes that the patient may be 
suffering from renal insufficiency as evi- 
denced by his elevated creatinine level. 

6. The physician would like to review a trend 
of the patient’s creatinine over the past 
3 years. Because the hospital installed their 
commercial EHR less than a year ago, 
data from prior to that time are available 
in a legacy results review system that was 
developed locally. The physician logs into 
the legacy application (entering her user- 
name and password), searches for the cor- 
rect patient, and reviews the patient’s 
creatinine history. 


While it may seem preferable in this scenario 
to load all data from the legacy system into 
the new EHR, commercial applications may 
not support importing such data for various 
reasons. To simplify and improve the user 
experience for reviewing information from a 
legacy application within a commercial EHR, 
one group of informaticians created the custom
application shown in Fig. 6.5. The
application is accessed by clicking a link
within the commercial EHR and does not
require login or patient look-up.

Fig. 6.5 Example screen from a custom lab summary display application integrated into a commercial EHR. The application shows a longitudinal view of laboratory results that can span multiple patient encounters

Fig. 6.6 Example screen from a custom billing application integrated into a commercial EHR. This replaced a separate application that was not integrated into the clinicians’ workflow

An example of a more sophisticated
level of “workflow integration” is shown in
Fig. 6.6. In this example, informaticians
developed a custom billing application within
an inpatient commercial EHR. Users of the
application were part of a physician practice 
that used a different outpatient EHR with a 
professional billing module with which they 
were already familiar. When the physicians in 
the practice rounded on their patients who
were admitted to the hospital, they documented
their work by writing notes within the
inpatient EHR, and then used their outpa- 
tient EHR to submit their professional service 
charges. This practice not only required a 
separate login to submit a bill, but also 
required duplicate patient lists to be main- 
tained in each application, as well as a dupli- 
cate problem list for each patient to be 
managed in each application. The integrated 
charge application was accessed from the 
inpatient EHR but provided the same look- 
and-feel as the outpatient EHR billing mod- 
ule. Charges were submitted through the 
outpatient EHR infrastructure and would 
appear as normal charges in the outpatient 
system, with the substantial improvement of 
displaying the information (note name, 
author, and time) for the documentation that 
supported the charge. 


6.3.3.5 Development of Custom Applications that Supplement or Enhance Commercial Systems

Commercial EHRs frequently provide customers
with the ability to develop custom
software modules. Some EHRs provide a flex- 
ible clinical decision support infrastructure 
that allows customers to develop modules that 
execute medical logic to generate alerts, 
reminders, corollary orders, and so on. 
Vendors may also provide customers with 
tools to access the EHR database, which 
allows development of stand-alone applica- 
tions that make use of EHR data. Additionally, 
vendors may foster development of custom 
user interfaces within the EHR by providing 
an application programming interface 
through which developers can obtain infor- 
mation on user and patient context. 

The ability to provide patient-specific clin- 
ical decision support is one of the key benefits 
of EHRs. Many commercial EHRs either 
directly support or have been influenced by 
the Arden Syntax for Medical Logic Modules 
(Pryor and Hripcsak 1993). The Arden Syntax 
is part of the HL7 family of standards. It 
encodes medical knowledge as Medical Logic
Modules (MLMs), which can be triggered by
various events within the EHR (e.g., the plac- 
ing of a medication order) and execute serially 
as a sequence of instructions to access and 
manipulate data and generate output. MLMs 
have been used to generate clinical alerts and 
reminders, to screen for eligibility in clinical 
research studies, to perform quality assurance 
functions, and to provide administrative sup- 
port (Dupuits 1994; Jenders 2008; Jenders and 
Shah 2001; Ohno-Machado et al. 1999). 
Although one goal of the Arden Syntax was 
to make knowledge portable, MLMs developed
for one environment are not easily transferable
to another. Developers of clinical
decision support logic require skills in both
computer programming and medical
knowledge representation.
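The event-triggered, serial execution model of an MLM can be sketched in Python. This is not Arden Syntax itself: it is a loose analogue that borrows the names of an MLM's evoke, data, logic, and action slots. The record layout, the `renally_cleared` flag, and the creatinine threshold are hypothetical and are not clinical guidance.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LogicModule:
    """Loose Python analogue of an MLM's evoke, data, logic, and action slots."""
    name: str
    evoke: str                       # event that triggers the module
    data: Callable[[dict], dict]     # read the values the logic needs
    logic: Callable[[dict], bool]    # conclude True or False
    action: Callable[[dict], str]    # output produced when logic is True


def run_modules(modules, event, record):
    """Run, in order, every module evoked by `event` against a patient record."""
    messages = []
    for module in modules:
        if module.evoke != event:
            continue
        values = module.data(record)
        if module.logic(values):
            messages.append(module.action(values))
    return messages


# Illustrative rule only: the threshold and field names are hypothetical.
renal_alert = LogicModule(
    name="renal_dosing_reminder",
    evoke="medication_order",
    data=lambda rec: {
        "creatinine": rec["labs"].get("creatinine"),
        "renally_cleared": rec["order"].get("renally_cleared", False),
        "drug": rec["order"]["drug"],
    },
    logic=lambda v: bool(
        v["renally_cleared"]
        and v["creatinine"] is not None
        and v["creatinine"] > 1.5
    ),
    action=lambda v: (
        f"Review dose of {v['drug']}: latest creatinine is "
        f"{v['creatinine']} mg/dL."
    ),
)
```

The data step is where portability tends to break down: the logic and action can move between institutions largely unchanged, but the code that reads values from the local record must be rewritten for each environment.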

An example of a standalone, locally devel- 
oped software application that relies on EHR 
data is shown in Fig. 6.7. The Web-based
application, EpiPortal™, provides a compre- 
hensive, electronic hospital epidemiology 
decision support system. The application can 
be accessed from a Web browser or directly 
from within the EHR. It relies on EHR data 
such as microbiology results, clinician orders, 
and bed tracking information to provide users 
with timely information related to infection 
control and prevention. 

In some cases, it is desirable to develop 
custom applications to address specific clini- 
cal needs that are not met by a commercial 
EHR. For example, most commercial EHRs 
lack dedicated tools to support patient hand- 
off activities. For hospitalized patients, hand- 
offs between providers affect continuity of 
care and increase the risk of medical errors. 
Informaticians at one academic medical cen- 
ter developed a collaborative application sup- 
porting patient handoff that is fully integrated 
with a commercial EHR (Fred et al. 2009). An 
example screen from the application is shown 
in Fig. 6.8. The application creates
user-customizable printed reports with automatic
inclusion of patient allergies, active medica- 
tions, 24-hour vital signs, recent common lab- 
oratory test results, isolation requirements, 
code status, and other EHR data.

Fig. 6.7 Example screen from a standalone software application that relies on EHR data to provide a comprehensive, electronic hospital epidemiology decision support system. (Reused with permission. Copyright 2012 The New York and Presbyterian Hospital and Columbia University. All rights reserved. EpiPortal is a trademark of The New York and Presbyterian Hospital.)


6.4 Emerging Influences and Issues


Several trends in software engineering are
beginning to significantly influence biomedical
information systems. While many of the
trends may not be considered new to software
engineering in general, they are more novel to
the biomedical environment because of the
less rapid and less broad adoption of
information technology in this field. One area in
particular that has received growing attention is
service-oriented architectures (SOA).
Sometimes called “software as a service”,
SOA is a software design framework that
allows specific processing or information
functions (services) to run on an independent
computing platform that can be called by
simple messages from another computer
application. For example, an EHR application might
have native functionality to maintain a
patient’s medication list, but might call a
drug-drug interaction program running on a
third party system to check the patient’s
medications for potential interactions. This allows
the EHR provider to off-load developing this
functionality, while the drug-drug interaction
service provider can concentrate efforts on 
this focused task, and in particular on ensur- 
ing that the drug interaction database is kept 
up-to-date for all users of the service. Since 
the service is independent of any EHR appli- 
cation, many different EHR providers can call 
the same service, as can other applications 
such as personal health record (PHR) applications
that are focused on consumer functionality.
SOA might also be grouped with the
more recent computing phrase “cloud computing”,
which includes providing functional
services to other applications, but also encom- 
passes running entire applications and storing 
data in offsite or disconnected locations. A 
good example of SOA is the HL7 CDS Hooks 
standard, which specifies how EHR systems 
can interface with external clinical decision 
support services to provide point-of-care 
alerts and reminders to clinical end users. 
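The drug-drug interaction example can be sketched as a small external service that answers in the card format used by CDS Hooks. This is a simplification under stated assumptions: the interaction table and drug names are invented, and a flat list of drug names stands in for the FHIR resources that a real hook request would carry in its context. The card fields shown (summary, indicator, detail, source) follow the CDS Hooks response format.

```python
# Hypothetical interaction table; a real service would maintain a curated,
# regularly updated knowledge base.
INTERACTIONS = {
    frozenset({"drug_a", "drug_b"}): "drug_a may increase the effect of drug_b.",
}


def find_interactions(medications):
    """Return (drug, drug, note) for each known interacting pair."""
    meds = sorted({m.lower() for m in medications})
    found = []
    for i, first in enumerate(meds):
        for second in meds[i + 1:]:
            note = INTERACTIONS.get(frozenset({first, second}))
            if note:
                found.append((first, second, note))
    return found


def handle_request(request):
    """Answer a hook request with a CDS Hooks-style list of cards."""
    # Simplification: a flat list of drug names stands in for the FHIR
    # resources a real hook context would carry.
    medications = request.get("context", {}).get("medications", [])
    cards = [
        {
            "summary": f"Possible interaction: {first} + {second}",
            "indicator": "warning",   # one of: info, warning, critical
            "detail": note,
            "source": {"label": "Example interaction service"},
        }
        for first, second, note in find_interactions(medications)
    ]
    return {"cards": cards}
```

Because the service knows nothing about the calling application, the same function could serve several EHRs or a consumer-facing application, and the interaction table can be updated in one place for all of them.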
Another emerging trend, discussed earlier,
is the notion of the EHR as a platform for
third-party applications and services that
interface with, and add value to, the
EHR. Central to this approach is the HL7 FHIR
application programming interface (API),
which uses modern Internet technologies and
approaches for data exchange, as well as HL7
SMART for application integration and HL7
CDS Hooks for integrating decision support
services. While this notion of EHR as a
platform is still in its early stages and still
maturing, many EHR vendors are strongly
supportive of this type of an ecosystem, and
promising examples are emerging of how
these technologies can be used to deliver value
to health care organizations in an
EHR-agnostic manner. We anticipate that this
approach to health information systems,
wherein core EHR systems are augmented by
third-party applications and services, will play
an important role in the health IT ecosystem
in the years to come.

Fig. 6.8 Example screen from a custom patient handoff application integrated into a commercial EHR. The application creates user-customizable printed reports with automatic inclusion of patient allergies, active medications, 24-hour vital signs, recent common laboratory test results, isolation requirements, code status, and other EHR data

Another important consideration in clini- 
cal information systems is infrastructure to 
support data sharing, such as through a health 
information exchange (HIE). HIE infrastruc- 
ture allows organizations to share informa- 
tion about patients through a common 
electronic framework. Robust HIE capabili- 
ties, which are now being implemented in 
commercial EHR systems, make it much more 
efficient to share patient information between 
organizations versus creating point-to-point 
interfaces between all the clinical information 
systems a particular provider might need to
communicate with. Effective sharing of information
is predicated on the use of standard message
formats and terminologies (see Chap. 8),
or the use of a shared EHR vendor. Where
HIE functionality does not exist or is declin- 
ing (Adler-Milstein et al. 2016), sharing can 
be coordinated directly between organizations 
with incentives for sharing, such as in an 
ambulatory care network (ACN). As health 
care in the United States increasingly shifts to 
a payment model that rewards value over vol- 
ume, data sharing capabilities and the capac- 
ity for population-level analytics and health 
management will become increasingly critical. 
Moreover, there are also emerging efforts to 
scale health information exchange to a 
national scale, and to facilitate patient access 
to their information using APIs (ONC 2019b). 

Software engineering is an ever-evolving 
discipline, and new ideas are emerging rap- 
idly in this field. It is less than 30 years since 
the first graphical browser was used to access 
the World Wide Web, but today Web-based 
applications are the standard. Access to 
information through search engines has 
changed the way that people find and evalu- 
ate information. Social networking applica- 
tions have altered our views on privacy and 
personal interaction. All of these develop- 
ments have shaped the development of 
healthcare software, too. Today it is unimagi- 
nable that an EHR would not support a 
Web-based patient portal. Clinicians and 
consumers use the Web to search for health- 
related information in growing numbers and 
with growing expectations. It is not atypical 
for patients to discuss health issues in online 
forums and share intimate details on patient 
networking sites. 

Another development that is impacting 
virtually all industries, including health 
care, is advanced analytics. Coupled with 
the rapid increase in the adoption of EHR 
systems, health care represents a golden 
opportunity for leveraging powerful com- 
puting approaches with large data sets to 
identify and apply new insights. For exam- 
ple, deep learning techniques can be applied 
to EHR data to predict important outcomes 
such as in-hospital mortality. If coupled 
with user-centered, workflow-integrated
interventions, such insights have the poten- 
tial to improve clinical decision making and 
enhance patient care. 


6.5 Summary 

The goal of software engineering in health 
care is to create a system that facilitates deliv- 
ery of care. Much has changed in the past 
decade with EHRs, and today most institu- 
tions will purchase rather than build an 
EHR. But engineering these systems to facili- 
tate care is still challenging, and following 
appropriate software development practices is 
increasingly important. The success of a sys- 
tem depends on interaction among designers 
of healthcare software applications and those 
who use the systems. Communication among
the participants is very difficult when it comes 
to commercial applications. Informaticians 
have an important role to play in bridging the 
gaps among designers and users that result 
from the wide variety in background, educa- 
tion, experience, and styles of interaction. 
They can improve the process of software 
development by specifying accurately and realistically
the need for a system and by designing
workable solutions to satisfy those needs. 


Suggested Reading

Carter, J. H. (2008). Electronic health records (2nd 
ed.). Philadelphia: ACP Press. Written by a 
clinician and for clinicians, this is a practical 
guide for the planning, selection, and imple- 
mentation of an electronic health record. It 
first describes the basic infrastructure of an 
EHR, and then how they can be used effec- 
tively in health care. The second half of the 
book is written more as a workbook for some- 
one participating in the selection and imple- 
mentation of an EHR. 

KLAS Reports. http://www.klasresearch.com/reports. 
These reports are necessary tools for a project 
manager who needs to know the latest indus- 
try and customer information about vendor 
health information technology products. The 
reports include information on functionality 
available from vendors as well as customer 
opinions about how vendors are meeting the
needs of organizations and whose products 
are the best in a particular user environment. 


McConnell, S. (1996). Rapid development: Taming
wild software schedules. Redmond: Microsoft
Press. For those who would like a deeper under- 
standing of software development and project 
methodologies like Agile, this is an excellent 
source. It is targeted to code developers, system 
architects, and project managers. 


President’s Council of Advisors on Science and
Technology (December 2010). Report to the
President Realizing the Full Potential of
Health Information Technology to Improve
Healthcare for Americans: the Path Forward.
http://www.whitehouse.gov/sites/default/files/microsites/ostp/pcast-health-it-report.pdf.
This PCAST
report focuses on what changes could be made 
in the field of electronic health records to make 
them more useful and transformational in the 
future. It gives a good summary of the current 
state of EHRs in general, and compares the 
barriers to those faced in adopting information 
technology in other fields. Time will tell if the 
suggestions really become the solution. 


Stead, W. W., & Lin, H. S. (Eds.). (2009).
Computational technology for effective health
care: immediate steps and strategic directions. 
Washington, DC: National Academies Press. 
This is a recent National Research Council 
report about the current state of health infor- 
mation technology and the vision of the 
Institute of Medicine about how such technol- 
ogy could be used. It can help give a good 
understanding of how health IT could be used 
in health care, especially to technology profes- 
sionals without a health care background. 


Tang, P. C. (2003). Key capabilities of an electronic
health record system. Washington, DC:
National Academies Press. This is a short, let- 
ter report from an Institute of Medicine com- 
mittee that briefly describes the core 
functionalities of an electronic health record 
system. Much of the report is tables that list 
specific capabilities of EHRs in some core 
functional areas, and indicate their maturity in 
hospitals, ambulatory care, nursing homes, 
and personal health records. 


Wager, K. A., Lee, F. W., & Glaser, J. P. (2017).
Health care information systems: a practical
approach for health care management. John
Wiley & Sons. This is a textbook giving a good
overview of healthcare information systems, 
used in many academic courses on the subject. 
It reviews the different environmental factors 
and contexts that influence the health infor- 
mation landscape nationally, as well as giving 
guidance on implementation, management 
and evaluation of systems. 


Questions for Discussion


1. Reread the hypothetical case study in
Sect. 6.2.1.

(a) What are three primary benefits of 
the software used in James’s care? 

(b) How many different ways is James’s 
information used to help manage 
his care? 

(c) Without the software and informa- 
tion, how might his care be different? 

(d) How has health care that you have
experienced been similar to or different from
this example?

2. For what types of software 
development projects would an agile 
development approach be better than a 
waterfall approach? For what types of 
development would waterfall be 
preferred? 

3. What are reasons an institution would 
choose to develop software instead of 
purchase it from a vendor? 

4. How would various stages in the software
development life cycle be different
when developing software versus config- 
uring or adding enhancements to an 
existing software program? 

5. Reread the case studies in Sect.
6.3.2:

(a) What are the benefits and 
advantages of the different 
approaches to development and 
acquisition among the scenarios? 

(b) What were the initial costs for 
each institution for the software? 
Where will most of the long-term 
costs be? 

6. In what ways might new trends in 
software (small “apps” that accomplish 
focused tasks) change long-term 
strategies for electronic health record 
architectures? 

Learning Objectives

After reading this chapter, you should know
the answers to these questions:

- Why are standards important in
  biomedical informatics?
- What data standards are necessary to
  be able to exchange data seamlessly
  among systems?
- What organizations are active in
  standards development?
- What aspects of biomedical information
  management are supported today by
  standards?
- What is the process for creating
  consensus standards?
- What factors and organizations
  influence the creation of standards?
- How have standards development
  organizations applied modern Internet
  technologies such as application
  programming interfaces (APIs) into their
  interoperability standards?

7.1 The Idea of Standards 


Ever since Eli Whitney developed interchange- 
able parts for rifle assembly, standards have 
been created and used to make things or pro- 
cesses work more easily and economically—or, 
sometimes, to work at all. A standard can be 
defined in many physical forms, but essentially 
it comprises a set of rules and definitions that 
specify how to carry out a process or produce 
a product. Sometimes, a standard is useful 
because it provides a way to solve a problem 
that other people can use without having to 
start from scratch. Generally, though, a stan- 
dard is useful because it permits two or more 
disassociated people to work in some cooper- 
ative way. Every time you screw in a light bulb 
or play a media file, you are taking advantage 
of a standard. Some standards make things 
work more easily. Some standards evolve over 
time;¹ others are developed deliberately.


1 The current standard for railroad-track gauge origi- 
nated with Roman chariot builders, who set the axle 
length based on the width of two horses. This axle 
length became a standard as road ruts developed, 


The first computers were built without 
standards, but hardware and software stan- 
dards quickly became a necessity. Although 
computers work with values such as 1 or 0,
and with “words” such as 10101100, humans 
need a more readable language (see Chap.
5). Thus, standard character sets, such as
ASCII and EBCDIC, were developed. The 
first standard computer language, COBOL, 
was written originally to simplify program 
development but was soon adopted as a way 
to allow sharing of code and development 
of software components that could be inte- 
grated. As a result, COBOL was given official 
standard status by the American National 
Standards Institute (ANSI).² In like manner,
hardware components depend on standards 
for exchanging information to make them 
as interchangeable as were Whitney’s gun 
barrels. 
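
As a small illustration of why standard character sets matter, the sketch below interprets the binary "word" from the text as a number and then encodes the same letter under ASCII and under one EBCDIC variant. It assumes Python's built-in "cp037" codec as a representative EBCDIC code page; other EBCDIC code pages exist and differ in some positions.

```python
# The 8-bit "word" quoted in the text, interpreted as an unsigned integer.
word = "10101100"
value = int(word, 2)

# The same letter maps to different byte values under different standards.
letter = "A"
ascii_byte = letter.encode("ascii")   # ASCII encoding
ebcdic_byte = letter.encode("cp037")  # EBCDIC (code page 037) encoding

print(value, ascii_byte.hex(), ebcdic_byte.hex())

# Decoding with the wrong standard silently yields a different character,
# which is why both parties must agree on the character set in advance.
misread = ebcdic_byte.decode("latin-1")
print(repr(misread))
```

The two systems store the letter "A" as different bytes, so data moved between them is only readable if both sides know which standard was used.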

A 1987 technical report from the 
International Organization for Standardization (ISO)
states that “any meaningful exchange of 
utterances depends upon the prior exis- 
tence of an agreed upon set of semantic and 
syntactic rules” (International Standards 
Organization 1987). In biomedical infor- 
matics, where the emphasis is on collection, 
manipulation, and transmission of informa- 
tion, standards are essential and their impor- 
tance is widely recognized by clinicians and 
policy makers. Requirements for implemen- 
tation of interoperability standards have 
been written into laws and regulations. Over 
the past 2 years, the bipartisan 21st Century 
Cures Act³ (Hudson and Collins 2017) has
codified many standards into everyday use. 
At present, the standards scene is evolving 
so rapidly that any description is inevitably 
outdated within a few months. This chapter
emphasizes the need for standards in general,
standards development processes, current
active areas of standards development, and
key participating organizations that are making
progress in the development of usable
standards.


requiring that the wheels of chariots, and all subsequent
carriages, be the right distance apart to drive
in the ruts. When carriage makers were called on to
develop railway rolling stock, they continued to use
the same axle standard.

2 Interestingly, medical informaticians were responsible
for the second ANSI standard language:
MUMPS (now known as M).

3 https://en.wikipedia.org/wiki/21st_Century_Cures_Act
(accessed 12/2/19)


7.2 The Need for Health
Informatics Standards


Standards are generally required when 
excessive diversity creates inefficiencies or 
impedes effectiveness. The healthcare envi- 
ronment has traditionally consisted of a set 
of loosely connected, organizationally inde- 
pendent units. Patients receive care across 
primary, secondary, and tertiary care set- 
tings, with little bidirectional communica- 
tion and coordination among the services. 
Patients are cared for by one or more pri- 
mary physicians, as well as by specialists. 
There is little coordination and sharing 
of data between inpatient care and outpa- 
tient care. Both the system and patients, by 
choice, create this diversity in care. Within 
the inpatient setting, the clinical environ- 
ment is divided into clinical specialties that 
frequently treat the patient without regard 
to what other specialties have done. 

Ancillary departments function as 
detached units, performing their tasks as sep- 
arate service units, reporting results without 
follow-up about how those results are used 
or whether they are even seen by the ordering 
physician. Reimbursement requires patient 
information that is often derived through a 
totally separate process, based on the frag- 
mented data collected in the patient’s medical 
record and abstracted specifically for billing 
purposes. The resulting set of diagnosis and 
procedure codes often correlates poorly with 
the patient’s original information (Jollis et al. 
1993). With the transition of the US healthcare
system from a fee-for-service payment
model to a value-based care model, the need 
to share information between healthcare pro- 
viders and their IT systems in order to coordi- 
nate care becomes even more crucial. 

7.2.1 Early Standards to Support
the Use of IT in Health Care 


Early interest in the development of stan- 
dards was driven by the need to exchange 
data between clinical laboratories and clini- 
cal systems, and then between independent 
units within a hospital. Therefore, the first 
standards were for data exchange and were 
referred to as messaging standards. Early 
systems were developed within independent 
service units, functional applications such as 
ADT (admission-discharge-transfer) and bill- 
ing, and within primary care and specialty 
units. The first uses of computers in hospitals 
were for billing and accounting purposes and 
were developed on large, monolithic mainframe
computers. Initially, the cost of computers
restricted expansion into clinical areas.
But in the 1960s, hospital information sys- 
tems (HISs) were developed to support service 
operations within a hospital. These systems 
followed a pattern of diversity similar to that 
seen in the health care system itself. 

As new functions were added in the 1970s, 
they were implemented on mainframe com- 
puters and were managed by a data processing 
staff that usually was independent of the clin- 
ical and even of the administrative staff. The 
advent of the minicomputer supported the
development of departmental systems, such
as those for the clinical laboratory, radiology
department, or pharmacy, but connectivity
to other parts of the hospital was either by
paper or by independent electronic systems.
It was common
to see two terminals sitting side-by-side with 
an operator typing data from one system to 
another. Clinical systems, as they developed, 
continued to focus on dedicated departmental 
operations and clinical-specialty systems and 
thus did not permit the practicing physician to 
see a unified view of the patient. Most HISs 
were either supported entirely by a single ven- 
dor or were still functionally independent and 
unconnected. 

In the 1980s, driven by the need to move laboratory
data directly into developing electronic health
records systems (although this term was not
used then), early standards were created by
ASTM (formerly the American Society for
Testing and Materials; see Sect. 8.3.2) for
the transfer of laboratory data from local and
commercial laboratories (American Society for
Testing and Materials 1999). In the early 1980s,
Simborg and others developed an HIS by
interfacing independent systems using a “Best
of Breed” approach to create an integrated
HIS (Simborg et al. 1983). Unfortunately, the
cost of developing and maintaining those 
interfaces was prohibitive, and the need for 
a broader set of standards was realized. This 
effort resulted in the creation of the standards 
developing organization (SDO) Health Level 
Seven (HL7) in 1987. Other SDOs were cre- 
ated in this same time frame: EDIFACT by the 
United Nations and ASC X12N by ANSI to 
address standards for claims and billing, IEEE 
for device standards, ACR/NEMA (later 
DICOM) for imaging standards, and NCPDP 
for prescription standards. Internationally,
the 1990s saw the creation of Technical
Committee 251 of the European Committee for
Standardization (CEN/TC 251) and ISO Technical
Committee 215 for Health Informatics
(ISO/TC 215). These organizations are
described in more detail in Sect. 8.3.2.


7.2.2 Transitioning Standards 
to Meet Present Needs 


Early standards were usually applied within a 
single unit or department in which the stan- 
dards addressed mainly local requirements. 
Even then, data acquired locally came from 
another source, introducing the need for additional
standards. These many pressures caused
health care information systems to change 
the status quo such that data collected for a 
primary purpose could be reused in a multi- 
tude of ways. Newer models for health care 
delivery, such as integrated delivery networks, 
health maintenance organizations (HMOs), 
preferred provider organizations (PPOs), and 
now accountable care organizations (ACOs) 
have increased the need for coordinated, 
integrated, and consolidated information 
(see Chap. 15), even though the information
comes from disparate departments and


institutions. Various management techniques, 
such as continuous quality improvement and 
case management, require up-to-date, accu- 
rate abstracts of patient data. Post hoc analy- 
ses for clinical and outcomes research require 
comprehensive summaries across patient 
populations. Advanced tools, such as clinical
workstations (Chap. 5) and decision-support
systems (Chap. 3), require ways to
translate raw patient data into generic forms 
for tasks as simple as summary reporting and 
as complex as automated medical diagnosis. 
All these needs must be met in the existing 
setting of diverse, interconnected information 
systems—an environment that cries out for 
implementation of standards. 

One obvious need is for standardized 
identifiers for individuals, health care provid- 
ers, health plans, and employers so that such 
participants can be recognized across systems. 
Choosing such an identifier is much more 
complicated than simply deciding how many 
digits the identifier should have. Ideal attri- 
butes for these sets of identifiers have been 
described in a publication from the ASTM 
(1999). The identifier must include a check 
digit to ensure accuracy when the identifier 
is entered by a human being into a system. 
A standardized solution must also determine 
mechanisms for issuing identifiers to individu- 
als, facilities, and organizations; for maintain- 
ing databases of identifying information; and 
for authorizing access to such information 
(also see Chap. 5).
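
The role of a check digit can be sketched in a few lines. The example below uses the Luhn algorithm, one widely used check-digit scheme for numeric identifiers; it is offered as an assumption for illustration, not as the scheme prescribed by the ASTM guidance cited above, and the identifier values are made up.

```python
def luhn_check_digit(base: str) -> int:
    """Compute the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk the base digits right to left, doubling every other digit
    # starting with the rightmost one.
    for position, char in enumerate(reversed(base)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return (10 - total % 10) % 10


def is_valid(identifier: str) -> bool:
    """Return True if the final digit matches the Luhn check digit."""
    if not identifier.isdigit() or len(identifier) < 2:
        return False
    return luhn_check_digit(identifier[:-1]) == int(identifier[-1])


base = "7992739871"  # hypothetical base identifier
full = base + str(luhn_check_digit(base))
print(full, is_valid(full))

# A single mistyped digit is caught by the check.
typo = full[:-2] + "9" + full[-1]
print(typo, is_valid(typo))
```

A check digit does not prove that an identifier belongs to the right person; it only lets a system reject many keying errors before the identifier is used for a lookup.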

The Centers for Medicare and Medicaid
Services (CMS) has defined a National
Provider Identifier (NPI) as a national standard.
As adopted, the NPI is a 10-position,
all-numeric identifier whose final position is
a check digit. No meaning is built into the
number, each number is unique, and it is never
reissued. Using only digits also avoids the
confusion that can arise in alphanumeric
identifiers between numerals and similar-looking
letters (e.g., 0, 1, 2, 4, and 5 can be confused with O,
I or L, Z, Y, and S). CMS was tasked to define
a Payer ID for identifying health care plans. 
The Internal Revenue Service’s employer 
identification number has been adopted as the 
Employer Identifier. 

The most controversial issue is identify- 
ing each individual or patient. Many people 
consider assignment and use of such a number
to be an invasion of privacy and are concerned
that it could be easily linked to other
databases. Public Law 104-191, passed in
August 1996 (see Sect. 8.3.2), required that
Congress formally define suitable identifiers.
Pushback by privacy advocates and negative
publicity in the media resulted in Congress
declaring that this issue would not be moved
forward until privacy legislation was in place
and implemented (see Chap. 14). The
Department of Health and Human Services
has recommended the identifiers discussed
above, except for the person identifier. This
problem has still not been resolved, although
the momentum for creating such a unique
personal identifier seems to be increasing.
A work-around, though not a solution, is algorithmic
patient matching based on electronic health
record (EHR) data. The United States is one
of the few developed countries without such
an identifier (Stead et al. 2005).


7.2.3 Settings Where Standards 
Are Needed 


The patient care process, which can be varied 
and complicated, also includes numerous processes
that can be improved with standardization.
A hospital admissions system records
that a patient has the diagnosis of diabetes 
mellitus, a pharmacy system records that the 
patient has been given gentamicin, a labora- 
tory system records that the patient had cer- 
tain results on kidney function tests, and a 
radiology system records that a doctor has 
ordered an X-ray examination for the patient 
that requires intravenous iodine dye. Other 
systems need ways to store these data, to pres- 
ent the data to clinical users, to send warn- 
ings about possible drug-drug interactions, 
to recommend dosage changes, and to follow 
the patient’s outcome. A standard for coding 
patient data is nontrivial when one consid- 
ers the need for agreed-on definitions, use of 
qualifiers, differing (application-specific) lev- 
els of granularity in the data, and synonymy, 
not to mention the breadth and depth that 
such a standard would need to have. 
The inclusion of medical knowledge in
clinical systems is becoming increasingly 
important and commonplace. Sometimes, the 
knowledge is in the form of simple facts such 
as the maximum safe dose of a medication or 
the normal range of results for a laboratory 
test. Much medical knowledge is more com- 
plex, however. It is challenging to encode such 
knowledge in ways that computer systems can 
use (see Chap. 26), especially if one needs
to avoid ambiguity and to express logical rela- 
tions consistently. Thus the encoding of clini- 
cal knowledge using an accepted standard 
would allow many people and institutions to 
share the work done by others. Standards
designed for this purpose include the Arden Syntax,
discussed in Chap. 3, and the HL7
standard Clinical Quality Language (Odigie
et al. 2019).

Because the tasks we have described 
require coordination of systems, methods 
are needed for transferring information from 
one system to another. Such transfers were 
traditionally accomplished through custom- 
tailored point-to-point interfaces, but this 
technique has become unworkable as the 
number of systems and the resulting permu- 
tations of necessary connections have grown. 
A current approach to solving the multiple- 
interface problem is through the development 
of messaging standards. Such messages must 
depend on the preexistence of standards for 
patient identification and data encoding. 
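
The growth in connections described above can be made concrete with a short calculation. The sketch assumes that every pair of systems needs its own custom interface in the point-to-point case, and that each system needs only one implementation of the shared message format in the standards-based case; real deployments will fall somewhere between these extremes.

```python
def point_to_point_interfaces(n_systems: int) -> int:
    """Custom interfaces needed if every pair of systems is linked directly."""
    return n_systems * (n_systems - 1) // 2


def standard_interfaces(n_systems: int) -> int:
    """Implementations needed if every system speaks one shared standard."""
    return n_systems


for n in (3, 5, 10, 20):
    print(f"{n:>2} systems: "
          f"{point_to_point_interfaces(n):>3} point-to-point interfaces, "
          f"{standard_interfaces(n):>2} standard implementations")
```

Under these assumptions the point-to-point count grows with the square of the number of systems, while the standards-based count grows only linearly.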

Over the past decade, non-healthcare 
domains such as travel, package delivery and 
e-commerce have adopted, implemented and 
published standard application programming
interfaces (APIs) in order to streamline their
business processes and improve efficiency. In
health care, the adoption of open APIs, especially
HL7 Fast Healthcare Interoperability Resources
(FHIR®), has increased dramatically and is
cited in proposed regulations as an enabler of
improved data sharing (Braunstein 2018).
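
As a rough illustration of what a FHIR API exchange looks like, the sketch below builds the URL for reading a single resource and extracts a display name from a Patient resource expressed as JSON. The server address, the patient identifier, and the sample data are hypothetical; the field names (resourceType, name, family, given, birthDate) are those of the FHIR Patient resource. No network call is made, so the example stays self-contained.

```python
import json

# Hypothetical FHIR server base URL; a real deployment would supply its own.
FHIR_BASE = "https://ehr.example.org/fhir"


def read_url(resource_type: str, resource_id: str) -> str:
    """Build the URL for a FHIR 'read' interaction: [base]/[type]/[id]."""
    return f"{FHIR_BASE}/{resource_type}/{resource_id}"


def display_name(patient: dict) -> str:
    """Join the given and family names from the first name entry."""
    name = patient["name"][0]
    parts = name.get("given", []) + [name.get("family", "")]
    return " ".join(part for part in parts if part)


# Made-up Patient resource, shaped like a response to the read above.
sample_patient_json = """{
  "resourceType": "Patient",
  "id": "example-123",
  "name": [{"family": "Doe", "given": ["James"]}],
  "birthDate": "1970-01-01"
}"""

patient = json.loads(sample_patient_json)
print(read_url("Patient", patient["id"]))
print(display_name(patient), patient["birthDate"])
```

Because the resource structure is standardized, the same parsing code could be pointed at any conformant server without knowing which vendor built it.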

Data sharing has become an expected 
functionality for any health IT system. Many 
of the new initiatives in health require data 
sharing. Data sharing is essential not only for 
patient care, but for aggregating data across 
multiple sites for research. Security must also 
be addressed before such exchanges can be
allowed to take place. Before a system can 
divulge patient information, it must ensure 
that requesters are who they say they are and 
that they are permitted access to the requested 
information (see Chap. 5). Standards exist
for this functionality. Although each clinical 
system can have its own security features, sys- 
tem builders would rather draw on available 
standards and avoid reinventing the wheel. 
Besides, the secure exchange of information 
requires that interacting systems use standard
technologies. Electronic Health Record
systems (EHRs) are increasingly adopting
standard authorization (OAuth2) and
identification (OpenID Connect) by implementing
Substitutable Medical Applications Reusable
Technology (SMART) on FHIR, which allows
platform-independent applications to be
launched within the EHR workflow and to utilize
EHR data via FHIR (Payne et al. 2015).
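
The first step of a SMART on FHIR launch can be sketched as the construction of an OAuth2 authorization request. The parameter names below (response_type, client_id, redirect_uri, launch, scope, state, aud) follow the SMART App Launch and OAuth2 conventions; every value, including the endpoint URLs, the client identifier, and the launch token, is hypothetical, and the later steps (exchanging the returned code for an access token) are omitted.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorize_url(authorize_endpoint, client_id, redirect_uri,
                        fhir_base, launch_token, scopes):
    """Return (url, state) for the OAuth2 authorization-code request."""
    # Random value the app checks on return to guard against forged redirects.
    state = secrets.token_urlsafe(16)
    params = {
        "response_type": "code",       # OAuth2 authorization-code flow
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "launch": launch_token,        # opaque token passed by the EHR at launch
        "scope": " ".join(scopes),
        "state": state,
        "aud": fhir_base,              # FHIR server the app wants to access
    }
    return f"{authorize_endpoint}?{urlencode(params)}", state


url, state = build_authorize_url(
    authorize_endpoint="https://ehr.example.org/oauth2/authorize",
    client_id="handoff-app",
    redirect_uri="https://app.example.org/callback",
    fhir_base="https://ehr.example.org/fhir",
    launch_token="abc123",
    scopes=["launch", "openid", "fhirUser", "patient/Observation.read"],
)
print(url)

# Parse the URL back to confirm which parameters the EHR would receive.
sent = parse_qs(urlparse(url).query)
print(sent["scope"][0])
```

The application never handles the user's EHR password; it receives only a scoped, time-limited token, which is what lets third-party applications run inside the EHR workflow.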


7.3 Standards Undertakings
and Organizations


It is helpful to separate our discussion of the 
general process by which standards are cre- 
ated from our discussion of the specific orga- 
nizations and the standards that they produce. 
The process is relatively constant, whereas 
the organizations form, evolve, merge, and 
are disbanded. This section discusses how
standards are created, then identifies the many
SDOs and gives an overview of the types of
standards they create. It also identifies
other groups and organizations that
contribute or relate to standards activities.


7.3.1 The Standards Development 
Process 


The process of creating standards is biased 
and highly competitive. Most standards are 
created by volunteers who represent multiple, 
disparate stakeholders. They are influenced 
by direct or indirect self-interest rather than 
judgment about what is best or required. The 
process is generally slow and inefficient; multiple
international groups create competitive
standards; and new groups continue to be
formed as they become aware of the need for
standards and do not look to see what standards
exist. Yet, the process of creating standards
largely works, and effective standards
are created.

There are four ways in which a standard 
can be produced: 

1. Ad hoc method: A group of interested 
people and organizations (e.g., laboratory- 
system and hospital-system vendors) agree 
on a standard specification. These specifi- 
cations are informal and are accepted as 
standards through mutual agreement of 
the participating groups. An example of a
standard produced by this method is the
DICOM standard for medical imaging.

2. De facto method: A single vendor controls 
a large enough portion of the market to 
make its product the market standard. An 
example is Microsoft’s Windows. A more 
recent example is the set of Argonaut
Implementation Guides.⁴ In this case, a
collaborative of vendors and academic 
health systems are creating consensus 
standards for their requirements. 

3. Government-mandate method: A government
agency, such as CMS or the National
Institute of Standards and Technology
(NIST), creates a standard and legislates its
use. An example is the HIPAA standard.
Another example is the Consolidated
Clinical Document Architecture (C-CDA),⁵ a
standard that resulted from the US
Government’s creating a set of requirements
and driving a standard to meet
those requirements.

4. Consensus method: A group of volunteers 
representing interested parties works in an 
open process to create a standard. Most 
health care standards are produced by this 
method. An example is the Health Level 7 
(HL7) standard for clinical-data interchange
(Fig. 7.1).


4 http://fhir.org/guides/argonaut/ (accessed
12/2/19)

5 https://www.healthit.gov/topic/standards-technology/consolidated-cda-overview
(accessed 12/2/19)

Fig. 7.1 Standards development meetings. The
development of effective standards often requires the
efforts of dedicated volunteers, working over many
years. Work is often done in small committee meetings
and then presented to a large group to achieve consensus.
Here we see meetings of the HL7 Vocabulary Technical
Committee (top) and an HL7 plenary meeting
(bottom). See Sect. 8.5.2 for a discussion on HL7;
Photo courtesy of Ken Rubin Photography


The process of creating a standard proceeds 
through several stages (Libicki 1995). It begins 
with an identification stage, during which 
someone becomes aware that there exists a 
need for a standard in some area and that tech- 
nology has reached a level that can support 
such a standard. For example, suppose there 
are several laboratory systems sending data to 
several central hospital systems—a standard 
message format would allow each laboratory 
system to talk to all the hospital systems with- 
out specific point-to-point interface programs 
being developed for each possible laboratory- 
to-laboratory or laboratory-to-hospital com- 
bination. If the time for a standard is ripe, 
then several individuals can be identified and 
organized to help with the conceptualization 
stage, in which the characteristics of the stan- 
dard are defined. What must the standard do? 
What is the scope of the standard? What will 
be its format? In the early years of standards
development, this approach led the develop- 
ment of standards, and the process was sup- 
ported by vendors and providers. 

As those early standards have become 
successful, the need for “gap-standards” has 
arisen. These gap-standards have no cham- 
pion but are necessary for completeness of 
an interoperable data exchange network. Because
the need for these standards is not as obvious
as for the primary standards, people are less
likely to volunteer to do the work, which puts
stress on the voluntary approach. In such cases,
the need for such standards must be sold to the
volunteers, or the standards must be developed
by paid professionals.

Let us consider, for purposes of illustra- 
tion, how a standard might be developed for 
sending laboratory data in electronic form 
from one computer system to another in the 
form of a message. The volunteers for the 
development might include laboratory system 
vendors, clinical users, and consultants. One 
key discussion would be on the scope of the 
standard. Should the standard deal only with 
the exchange of laboratory data, or should 
the scope be expanded to include other types 
of data exchange? Should the data elements 
being exchanged be sent with an XML tag
identifying the data element, or should the 
data be defined positionally? In the ensuing 
discussion stage, the participants will begin to 
create an outline that defines content, identi- 
fies critical issues, and produces a timeline. In 
the discussion, the pros and cons of the vari- 
ous concepts are discussed. What will be the 
specific form for the standard? For example, 
will it be message based or document based? 
Will the data exchange be based on a query 
or on a trigger event? Will the standard define 
the message content, the message syntax, the 
terminology, the network protocol, or a sub- 
set of these issues? How will a data model or 
information model be incorporated? 
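
The choice between tagged and positional data can be illustrated with a small sketch. The field names, their order, and the sample result below are assumptions made up for this example and do not come from any published standard. In the positional form the meaning of a value depends on where it sits in the message; in the tagged form each value carries its own label, so order does not matter.

```python
import xml.etree.ElementTree as ET

# Hypothetical field order for the positional form; a real standard
# would fix this order in its specification.
FIELD_ORDER = ["patient_id", "test_name", "value", "units"]


def parse_positional(message: str, delimiter: str = "|") -> dict:
    """Meaning comes from position: the Nth field is always the Nth element."""
    parts = message.split(delimiter)
    if len(parts) != len(FIELD_ORDER):
        raise ValueError("unexpected number of fields")
    return dict(zip(FIELD_ORDER, parts))


def parse_tagged(message: str) -> dict:
    """Meaning comes from the tag, so the elements may arrive in any order."""
    root = ET.fromstring(message)
    return {child.tag: (child.text or "") for child in root}


# The same (invented) result in both encodings.
positional = "12345|glucose|5.4|mmol/L"
tagged = (
    "<result>"
    "<units>mmol/L</units>"
    "<value>5.4</value>"
    "<test_name>glucose</test_name>"
    "<patient_id>12345</patient_id>"
    "</result>"
)
```

Both parsers return the same dictionary for these two messages. The positional form is more compact, but both sides must agree on the exact field order; the tagged form is more verbose but easier to extend.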

The participants are generally well 
informed in the domain of the standard, so 
they appreciate the needs and problems that 
the standard must address. Basic concepts 
are usually topics for heated discussion; sub- 
sequent details may follow at an accelerated 
pace. Many of the participants will have expe- 
rience in solving problems to be addressed 
by the standard and will protect their own
approaches. The meanings of words are often 
debated. Compromises and loosely defined 
terms are often accepted to permit the pro- 
cess to move forward. For example, the likely 
participants would be vendors of competing 
laboratory systems and vendors of competing 
HISs. All participants would be familiar with 
the general problems but would have their 
own proprietary approach to solving them. 
Definitions of basic concepts normally taken 
for granted, such as what constitutes a test or 
a result, would need to be clearly stated and 
agreed on. 

The writing of the draft standard is usu- 
ally the work of a few dedicated individuals— 
typically people who represent the vendors in 
the field. Other people then review that draft; 
controversial points are discussed in detail and 
solutions are proposed and finally accepted. 
Writing and refining the standard is further 
complicated by the introduction of people 
new to the process who have not been privy 
to the original discussions and who want to 
revisit points that have been resolved earlier. 

The balance between moving forward and 
being open is a delicate one. Most standards- 
writing groups have adopted an open standards 
development policy: Anyone can join the process
and can be heard. Most standards development
organizations, certainly the
accredited ones, support an open balloting
process. A draft standard is made available
to all interested parties, inviting comments 
and recommendations. All comments are con- 
sidered. Negative ballots must be addressed 
specifically. If the negative comments are per- 
suasive, the standard is modified. If they are 
not, the issues are discussed with the submit- 
ter in an attempt to convince the person to 
remove the negative ballot. If neither of these 
efforts is successful, the comments are sent to 
the entire balloting group to see whether the 
group is persuaded to change its vote. The 
resulting vote then determines the content of 
the standard. Issues might be general, such 
as deciding what types of laboratory data to 
include (pathology? blood bank?), or specific, 
such as deciding the specific meanings of specific
fields (do we include the time the test was
ordered? specimen drawn? test performed?).

A standard will generally go through sev- 
eral versions on its path to maturity. The first 
attempts at implementation are frequently 
met with frustration as participating ven- 
dors interpret the standard differently and 
as areas not addressed by the standard are 
encountered. These problems may be dealt 
with in subsequent versions of the standard. 
Backward compatibility is a major concern as 
the standard evolves. How can the standard 
evolve, over time, and still be economically 
responsible to both vendors and users? An 
implementation guide is usually produced to 
help new vendors profit from the experience 
of the early implementers. 

Connectathons have become increasingly
important in standards development. In
the past, standards development and standards
implementation were generally
separated. Today, standards are tested during
one- or two-day connectathons, where implementers
bring client and server applications to
test against one another. Successes reinforce
the validity of the standard, while failures
identify gaps and errors that need to be
addressed.

A critical stage in the life of a standard is 
early implementation, when acceptance and 
rate of implementation are important to suc- 
cess. This process is influenced by accredited 
standards bodies, by the federal government, by 
major vendors, and by the marketplace. The maintenance
and promulgation of the standard are
also important to ensure widespread availabil- 
ity and continued value of the standard. Some 
form of conformance testing is ultimately nec- 
essary to ensure that vendors adhere to the 
standard and to protect its integrity. 

Producing a standard is an expensive pro- 
cess in terms of both time and money. Vendors 
and users must be willing to support the many 
hours of work, usually on company time; the 
travel expense; and the costs of documenta- 
tion and distribution. In the United States, 
the production of a consensus standard is 
voluntary, in contrast to Europe and elsewhere,
where most standards development
is funded by governments. In the US, a new
model for funding standards development 
work has emerged. The Da Vinci Project⁶ and
the CARIN Alliance⁷ are two collaboratives
funded by payers and technology vendors
to address interoperability needs among
patients, providers, and payers. These groups
share the cost of standards development and
benefit from the accelerated pace.

An important aspect of standards is con- 
formance, a concept that covers compliance 
with the standard and also usually includes 
specific agreements among users of the standard
who affirm that specific rules will be followed.
A conformance document identifies
specifically what data elements will be sent, 
when, and in what form. Even with a perfect 
standard, a conformance document is neces- 
sary to define business relationships between 
two or more partners. Unlike past standards, 
recent standards have built conformance testing
directly into the standard artifacts, which
are not only human-readable documents but
also machine-readable.
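
As a rough illustration of what a machine-readable conformance artifact makes possible, the sketch below encodes a small profile as data and checks a message against it. The profile contents, element names, and function name are hypothetical and are not drawn from any actual standard; the point is only that agreed rules (which elements are sent, and in what form) can be tested automatically once they are expressed as data.

```python
# Hypothetical conformance profile agreed between two trading partners.
PROFILE = {
    "patient_id": {"required": True, "type": str},
    "test_name": {"required": True, "type": str},
    "value": {"required": True, "type": float},
    "units": {"required": True, "type": str},
    "collected_at": {"required": False, "type": str},
}


def check_conformance(message: dict, profile: dict = PROFILE) -> list:
    """Return a list of violations; an empty list means the message conforms."""
    problems = []
    for name, rule in profile.items():
        if name not in message:
            if rule["required"]:
                problems.append(f"missing required element: {name}")
            continue
        if not isinstance(message[name], rule["type"]):
            problems.append(f"wrong type for element: {name}")
    # Elements that the profile does not mention are also reported.
    for name in message:
        if name not in profile:
            problems.append(f"element not in profile: {name}")
    return problems
```

A message containing all four required elements with the expected types yields an empty list, while a message that omits `test_name` or sends `value` as text yields one entry per violation.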

The creation of the standard is only the 
first step. Ideally the first standard would be 
a Standard for Trial Use (STU), and two or 
more vendors would implement and test the 
standard to identify problems and issues. 
Those items would be corrected, and in a 
short period of time (usually 1 year), the 
standard would be advanced to a normative
stage. A normative standard specifies to what 
implementers must conform. Even then, the 
process is only beginning. Implementation 
that conforms to the standard is essential if 
the true value of the standard is to be realized. 
The use of most standards is enhanced by a 
certification process in which a neutral body 
certifies that a vendor’s product, in fact, does 
comply and conform to the standard. 

There is currently no body that certifies
conformance of specific standards from a vendor.
There is, however, the certification of an
application that uses standards. In 2010, the
Office of the National Coordinator (ONC)⁸
engaged with the CCHIT to certify EHR
products. That certification process evolved in
2011 to include eight groups that could certify
EHR products, and to date over 500 EHR
products have been certified. The certification
process is still undergoing change.

6 » http://www.hl7.org/about/davinci/ (accessed 12/2/19)

7 » https://www.carinalliance.com/ (accessed 12/2/19)


7.3.2 Data Standards Organizations 


Sometimes, standards are developed by orga- 
nizations that need the standard to carry 
out their principal functions; in other cases, 
coalitions are formed for the express purpose 
of developing a particular standard. The lat- 
ter organizations are discussed later, when 
we examine the particular standards devel- 
oped in this way. There are also standards 
organizations that exist for the sole purpose 
of fostering and promulgating standards. In 
some cases, they include a membership with 
expertise in the area where the standard is 
needed. In other cases, the organization pro- 
vides the rules and framework for standard 
development but does not offer the expertise 
needed to make specific decisions for specific 
standards, relying instead on participation by 
knowledgeable experts when a new standard 
is being studied. 

This section describes several of the best 
known and most influential health-related 
SDOs. Since most standards continue to 
evolve to accommodate changes in technol- 
ogy, policy, regulations, and requirements, 
links are provided to selected standards and 
SDO information. For a detailed understand- 
ing of an organization or the standards it has 
developed, you will need to refer to current 
primary resources. Many of the organizations 
maintain Web sites with excellent current 
information on their status. 


8 » http://www.healthit.gov/newsroom/about-onc (accessed 12/2/19)


7.3.2.1 American National Standards
Institute

ANSI is a private, nonprofit membership 
organization founded in 1918. It originally 
served to coordinate the U.S. voluntary consensus
standards systems. Today, it is responsible
for approving official American National 
Standards. ANSI membership includes over 
1100 companies; 30 government agencies; and 
250 professional, technical, trade, labor, and 
consumer organizations. 

ANSI does not write standards; rather, it 
assists standards developers and users from 
the private sector and from government to 
reach consensus on the need for standards. It 
helps them to avoid duplication of work, and 
it provides a forum for resolution of differ- 
ences. ANSI administers the only government- 
recognized system for establishing American 
National Standards. ANSI also represents 
U.S. interests in international standardization.
ANSI is the U.S. voting representative in the 
ISO and the International Electrotechnical 
Commission (IEC). There are three routes 
for a standards development body to become 
ANSI approved so as to produce an American 
National Standard: Accredited Organization; 
Accredited Standards Committee (ASC);
and Accredited Canvass. 

An organization that has existing organi- 
zational structure and procedures for stan- 
dards development may be directly accredited 
by ANSI to publish American National 
Standards, provided that it can meet the 
requirements for due process, openness, 
and consensus. HL7 (discussed in » Sect. 
8.5.2) is an example of an ANSI Accredited 
Organization. 


7.3.2.2 ASC X12


ANSI may also create internal ASCs to meet 
a need not filled by an existing Accredited 
Organization. ASC X12 is an example of such 
a committee. 

The final route, Accredited Canvass, 
is available when an organization does 
not have the formal structure required by 
ANSI. Through a canvass method that meets 
the criterion of balanced representation 
of all interested parties, a standard may be
approved as an American National Standard.
X12 develops transaction sets that span
a broad range of business domains.

Link: » www.x12.org

Link to transaction sets: » http://www.x12.org/x12-work-products/x12-transaction-sets.cfm

Link to EDI standards: » http://www.x12.org/x12-work-products/x12-edi-standards.cfm


7.3.2.3 ASTM International 


ASTM (formerly known as the American 
Society for Testing and Materials) develops 
standard test methods for materials, products,
systems, and services. ASTM is the
largest non-government source of standards in the
US. ASTM Committee E31 on Computerized
Systems is responsible for the development 
of the medical information standards. The 
scope of this committee is the promotion of 
knowledge and development of standard clas- 
sifications, guides, specifications, practices, 
and terminology for the architecture, content, 
storage and communication of information 
used within healthcare, including patient- 
specific information and medical knowledge. 
Standards also address policies for integrity
and confidentiality and computer procedures 
that support the uses of data and healthcare 
decision making. 

Link to E31 Standards: » https://www.astm.org/COMMITTEE/E31.htm


7.3.2.4 Clinical Data Interchange
Standards Consortium (CDISC)


CDISC creates standards in support of the 
clinical research community. Its member- 
ship includes pharma, academic researchers, 
vendors and others. CDISC was created in 
2000 in order to facilitate electronic regula- 
tory submission of clinical trial data. The cur- 
rent standards include a study data model, a 
data analysis model, a lab data model and an 
operational data model that supports audit 
trails and metadata. In 2007, CDISC began a 
collaborative project with HL7, the National 
Institutes of Health and the US FDA to 
link research data with data derived from 
clinical care. This modeling effort, BRIDG 
(Biomedical Research Integrated Domain
Group; Becnel et al. 2017), has created a 
domain analysis model of clinical research. 

Link: » www.cdisc.org

Link to standards: » www.cdisc.org/models/sds/v2.0/index.html

Link to BRIDG: » http://www.hl7.org/Special/committees/bridg/index.cfm


7.3.2.5 Digital Imaging 
and Communications 
in Medicine (DICOM) 
DICOM was created through a joint effort 
by the American College of Radiology 
and National Electrical Manufacturers 
Association (NEMA) to develop standards 
for imaging and waveforms. The DICOM 
standard has been developed with an empha- 
sis on diagnostic medical imaging as practiced 
in radiology, cardiology, pathology, dentistry, 
ophthalmology and related disciplines, and 
image-based therapies such as interventional 
radiology, radiotherapy and surgery. 
Link: » www.dicomstandard.org
Link to standard: » www.dicomstandard.org/current
Link to history: » www.dicomstandard.org/history/


7.3.2.6 European Committee
for Standardization Technical
Committee 251
The European Committee for Standardization 
(CEN) established, in 1991, Technical 
Committee 251 (TC 251—not to be con- 
fused with ISO TC 215 described below) for 
the development of standards for health care 
informatics. The major goal of TC 251 is to 
develop standards for communication among 
independent medical information systems 
so that clinical and management data pro- 
duced by one system could be transmitted to 
another system. The organization of TC 251 
parallels efforts in the United States through 
various working groups. These groups similarly
deal with data interchange standards,
medical record standards, code and terminol- 
ogy standards, imaging standards, and secu- 
rity, privacy and confidentiality. 

CEN has made major contributions to
data standards in health care. One important
CEN pre-standard, ENV 13606 on the electronic
health record (EHR), is being advanced
by CEN with significant input from
Australia and the OpenEHR Foundation.
There is increasing cooperation among
the CEN participants and several of the U.S.
standards bodies.

Link: » http://www.ehealth-standards.eu/

Link to projects: » http://www.ehealth-standards.eu/en/projects/


7.3.2.7 GS1


GS1 (for “Global Standard 1”) is a global
standards organization that develops and 
maintains global standards for business com- 
munication. With over 1.5 million members 
worldwide, it has a presence in over 100 countries.
Its primary standards relate to the supply
chain, including the assignment of object identifiers
and standards for barcodes. GS1 standards
are designed to improve the efficiency, safety 
and visibility of supply chains across physical 
and digital channels in 25 sectors. They form a 
business language that identifies, captures and 
shares key information about products, loca- 
tions, assets and more. 
Link: » www.gs1.org


7.3.2.8 Health Level Seven 
International (HL7) 
Health Level 7 was founded as an ad hoc 
standards group in March 1987 to create stan- 
dards for the exchange of clinical data, adopt- 
ing the name “HL7” to reflect the application 
(seventh) level of the OSI reference model. 
The primary motivation was the creation of 
a Hospital Information System from “Best 
of Breed” components. The HL7 data interchange
standard (version 2.n series) reduced
the cost of interfacing between disparate
systems to an affordable level. Today HL7 is
one of the premier SDOs in the world. It has 
become an international standards body with 
approximately 40 Affiliates, over 500 organi- 
zational members and over 2200 individual 
members. HL7 is ANSI accredited, and many 
of the HL7 standards are required by the 
U.S. government as part of the certification
requirements of Meaningful Use. 

Link: » www.hl7.org 

Link to HL7 FHIR: » www.hl7.org/FHIR


7.3.2.9 Institute of Electrical
and Electronics Engineers
(IEEE)
IEEE is an international SDO
that is a member of both ANSI and
ISO. Through IEEE, many of the world’s 
standards in telecommunications, electronics, 
electrical applications, and computers have 
been developed. 

IEEE 1073, Standard for Medical Device 
Communications, has produced a family of 
documents that defines the entire seven-layer 
communications requirements for the Medical 
Information Bus (MIB; Gottschalk 1991). The 
MIB is a robust, reliable communication ser- 
vice designed for bedside devices in the inten- 
sive care unit, operating room, and emergency 
room. These standards have been harmonized 
with work in CEN, and the results are released 
as ISO standards. IEEE and HL7 have col- 
laborated on several key standards, including 
those for mobile medical devices. 

Link: » www.ieee.org

Link to standards: » https://standards.ieee.org/standard/index.html


7.3.2.10 ISO Technical Committee
215—Health Informatics 

In 1998, interests in the European Committee
for Standardization (CEN) and the United
States led to the creation of Technical
Committee (TC) 215 for Health Informatics
within ISO.

TC 215 meets once a year as a TC and
once as a Joint Working Group. TC 215 fol- 
lows rather rigid procedures to create ISO 
standards. Thirty-five countries are active par- 
ticipants in the TC with another 23 countries 
acting as observers. While the actual work 
is done in the working groups, the balloting 
process is very formalized—one vote for each 
participating country. For most work, there is
a defined series of steps: a New
Work Item Proposal, which requires five countries
to agree to participate; a Working Document; a
Committee Document; a Draft International
Standard; a Final Draft International Standard
(FDIS); and finally an International Standard.
This process, if fully followed, takes several 
years to produce an International Standard. 
Under certain conditions, a fast track to FDIS 
is permitted. Technical Reports and Technical 
Specifications are also permitted. 

The United States has been assigned the 
duties of Secretariat, and that function is car- 
ried out, at this time, by ANSI. ANSI also 
serves as the U.S. Technical Advisory Group 
Administrator, which represents the U.S. posi- 
tion in ISO. 


Link: » https://www.iso.org/standards.html


7.3.2.11 Integrating the Healthcare
Enterprise

The goal of the Integrating the Healthcare 
Enterprise (IHE) initiative is to stimulate inte- 
gration of healthcare information resources. 
IHE is sponsored jointly by the Radiological 
Society of North America (RSNA) and the 
HIMSS. Using established standards and 
working with direction from medical and 
information technology professionals, indus- 
try leaders in healthcare information and 
imaging systems cooperate under IHE to 
agree upon implementation profiles for the 
transactions used to communicate images 
and patient data within the enterprise. Their 
incentive for participation is the opportunity 
to demonstrate that their systems can operate 
efficiently in standards-based, multi-vendor 
environments with the functionality of real 
HISs. Moreover, IHE enables vendors to 
direct product development resources toward 
building increased functionality rather than 
redundant interfaces. 

Link: » https://www.ihe.net/ 

Link to IHE domains: » https://www.ihe.net/ihe_domains/


7.3.2.12 National Council
for Prescription Drug
Programs (NCPDP)
NCPDP is a not-for-profit, multi-stakeholder 
forum for developing and promoting industry 
standards and business solutions that improve 
patient safety and health outcomes, while 
also decreasing costs. NCPDP is an ANSI 
accredited SDO and uses a consensus-build- 
ing process to create national standards for 
real-time, electronic exchange of healthcare 
information. Their primary focus is on infor- 
mation exchange for prescribing, dispensing, 
monitoring, managing and paying for medica- 
tions and pharmacy services crucial to quality 
healthcare. 
Link: » www.ncpdp.org/home
Link to standards: » www.ncpdp.org/standards-Development


7.3.2.13 OpenEHR 
OpenEHR is the name of a technology for 
e-health, consisting of open specifications, 
clinical models and software that can be used 
to create standards and build information and 
interoperability solutions for healthcare. The 
various artefacts of openEHR are produced 
by the openEHR community and managed by 
the openEHR Foundation, an international 
non-profit organization established in the year 
2003. 

Link: » openehr.org/

Link to Clinical Models: » www.openehr.org/clinicalmodels

Link to Clinical Knowledge Manager: » www.openehr.org/ckm

Link to Software Programs: » www.openehr.org/programs/software

Link to Specification Program: » openehr.org/programs/specification


7.3.2.14 Personal Connected Health 
Alliance (PCHA) 

The PCHA publishes the Continua Design 
Guidelines to enable a flexible implementa- 
tion framework for end-to-end interoperabil- 
ity of personal connected health devices and 
systems. These Guidelines are recognized by 
the International Telecommunications Union 
(ITU) as the international standard for safe, 
secure and reliable exchange of data to and
from personal health devices. 

Link to guidelines: » www.pchalliance.org/continua-design-guidelines


7.3.2.15 SNOMED International 


SNOMED International (previously called 
IHTSDO) was founded in 2007 with nine 
charter members. Currently 19 countries, 
including the United States, are members. 
The primary purpose of SNOMED International is the continued
development and maintenance of the
Systematized Nomenclature of Medicine — 
Clinical Terms (SNOMED-CT; see » Sect. 
7.4.4.4, below). Member countries make 
SNOMED-CT freely available to their citizens.
SNOMED has a number of Special
Interest Groups: Anesthesia, Concept Model, 
Education, Implementation, International 
Pathology & Laboratory Medicine, Mapping, 
Nursing, Pharmacy, and Translation. 

Link: » http://www.snomed.org/ 


7.3.2.16 OHDSI 


The Observational Health Data Sciences and 
Informatics (OHDSI) program is a multi- 
stakeholder, interdisciplinary open-source 
collaborative to leverage the value of health 
data through large-scale analytics. OHDSI 
has established an international network of 
researchers and observational health databases.
OHDSI enables active engagement
across multiple disciplines, including clinical
medicine, biostatistics, computer science, epidemiology,
and life sciences (Fig. 7.2).
Link: » https://www.ohdsi.org/


7.4 Detailed Clinical Models,
Coded Terminologies,
Nomenclatures,
and Ontologies


The capture, storage, and use of clinical data 
in computer systems is complicated by lack of 
agreement on terms and meanings. In recent 
years there has also been a growing recogni- 
tion that just standardizing the terms and 
codes used in medicine is not sufficient to 
enable interoperability. The structure or form
of medical data provides important context
for computable understanding of the data.
Terms and codes need to be interpreted in
the context of clinical information models.
The many terminologies and detailed clinical
modeling activities discussed in this section
have been developed to ease the communication
of coded medical information.


Fig. 7.2 Extensive vocabularies: breakdown of OHDSI concepts by domain,
standard class, and vocabulary. A critical challenge
to data analytics across international domains
continues to be the requirement for integrating data
sources among multiple vocabularies. (Photo courtesy
of George Hripcsak, MD, with permission)


7.4.1 Motivation for Structured
and Coded Data


The structuring and encoding of medical
information is a basic function of most clinical
systems. Standards for such structuring
and encoding can serve two purposes. First,
they can save system developers from reinventing
the wheel. For example, if an application
allows caregivers to compile problem lists
about their patients, using a standard structure
and terminology saves developers from having
to create their own. Second, using commonly
accepted standards can facilitate exchange of 
data, applications, and clinical decision sup- 
port logic among systems. For example, if a 
central database is accepting clinical data from 
many sources, the task is greatly simplified if 
each source is using the same logical data struc- 
ture and coding scheme to represent the data. 
System developers often ignore available stan- 
dards and continue to develop their own solu- 
tions. It is easy to believe that the developers 
have resisted adoption of standards because it 
is too much work to understand and adapt to 
any system that was “not invented here.” The 
reality, however, is that the available standards 
are often inadequate for the needs of the users 
(in this case, system developers). As a result, no 
standard terminology enjoys the wide accep- 
tance sufficient to facilitate the second func- 
tion: exchange of coded clinical information. 

The need for detailed clinical models
is directly related to the second goal discussed
above, that of creating interoperability
between systems. The subtle relationship
between terminologies and models is best 
understood using a couple of examples. If 
a physician wants to record the idea that a 
patient had “chest pain that radiates to the 
back”, the following coded terms could be 
used from SNOMED-CT. 


7.4.2 Detailed Clinical Models 


The creation of unambiguous data represen- 
tation is a combination of creating appropri- 
ate structures (models) for representing the 
form of the data and then linking or “bind- 
ing” specific sets of codes to the coded ele- 
ments in the structures. Several modeling 
languages or formalisms have been found to 
be useful in describing the structure of the 
data. They include: 

• UML - the Unified Modeling Language, Object Management Group
• ADL - Archetype Definition Language, OpenEHR Foundation
• CDL - Constraint Definition Language, General Electric and
  Intermountain Healthcare
• MIF - Model Interchange Format, Health Level Seven International Inc.
• OWL - Web Ontology Language, World Wide Web Consortium


All languages used for clinical modeling need 
to accomplish at least two major things: they 
need to show the “logical” structure of the 
data, and they need to show how sets of codes 
from standard terminologies participate in 
the logical structure. Defining the logical 
structure is simply showing how the named 
parts of a model relate to one another. Model 
elements can be contained in other elements, 
creating hierarchies of elements. It is also 
important to specify which elements of the 
model can occur more than once (cardinal- 
ity), which elements are required, and which 
are optional. Terminology binding is the act 
of creating connections between the elements 
in a model and concepts in a coded terminology.
For each coded element in a model, the
set of allowed values for the coded element
is specified. The HL7 Vocabulary Working
Group has created a comprehensive discus- 
sion of how value sets can be defined and used 
with information models. 
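
The two requirements above, logical structure and terminology binding, can be sketched in a few lines. This is an illustration only and is not written in any of the modeling languages listed: the class, the element names, and the codes (all prefixed `EX-`) are invented and do not come from any real terminology. Each element declares its cardinality and, where it is a coded element, the value set bound to it. Containment hierarchies are left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Element:
    name: str
    min_occurs: int                        # 0 = optional, 1 or more = required
    max_occurs: Optional[int]              # None = may repeat without limit
    value_set: Optional[frozenset] = None  # allowed codes, if the element is coded


# Hypothetical value sets; the codes are placeholders, not real terminology codes.
FINDING_CODES = frozenset({"EX-FINDING-PAIN"})
BODY_SITE_CODES = frozenset({"EX-SITE-CHEST", "EX-SITE-BACK"})

# A flat model for a pain observation with bound value sets.
PAIN_OBSERVATION = [
    Element("finding", 1, 1, FINDING_CODES),
    Element("site", 1, 1, BODY_SITE_CODES),
    Element("radiates_to", 0, None, BODY_SITE_CODES),
    Element("comment", 0, 1),
]


def validate(instance: dict, model: list) -> list:
    """Check an instance (element name -> list of values) against a model."""
    errors = []
    for el in model:
        values = instance.get(el.name, [])
        if len(values) < el.min_occurs:
            errors.append(f"{el.name}: too few values")
        if el.max_occurs is not None and len(values) > el.max_occurs:
            errors.append(f"{el.name}: too many values")
        if el.value_set is not None:
            for v in values:
                if v not in el.value_set:
                    errors.append(f"{el.name}: code {v} not in value set")
    known = {el.name for el in model}
    for name in instance:
        if name not in known:
            errors.append(f"{name}: not part of the model")
    return errors
```

An instance that records one finding, one site, and one `radiates_to` code from the bound value set passes, whereas a second `site` value or a `radiates_to` code outside the value set is reported. The same code can therefore mean different things depending on which element of the model it fills, which is why terms need to be interpreted in the context of a model.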

There are many clinical information mod- 
eling activities worldwide. Some of the most 
important activities are briefly listed below. 
= HL7 Activities 

— HL7 Detailed Clinical Models — This 
group has developed a method for spec- 
ifying clinical models based on the HL7 
Reference Information Model (RIM) 
that guarantees that data that conforms 
to the model could be sent in HL7 
Version 3 messages. 

— HL7 Clinical Document Architecture 
(CDA) Templates — This group has 
defined a standard way of specifying 
the structure of data to be sent in XML 
documents that conform to the CDA 
standard. 

— HL7 TermInfo — This Workgroup of 
HL7 has specified a set of guidelines for 
how SNOMED-CT codes and concepts 
should be used in conjunction with the 
HL7 RIM to represent data sent in HL7 
Version 3 messages. 

— HL7 Clinical Information Modeling 
Initiative — A formerly independent 
group now organized as and HL7 
Workgroup is chartered to develop 
implementable clinical information 
models. 

= The openEHR Foundation is developing 
models based on a core reference model 
and the Archetype Definition Language. 
This approach has been adopted by several 
national health information programs. 
= EN 13606 is developing models based on 
the ISO/CEN 13606 standard and core 
reference model. 

= The US Veterans Administration (VA) is 
creating models for integrating data across 
all VA facilities and for integration with 
military hospitals that are part of the US 

Department of Defense. The modeling is 

done primarily using Unified Modeling 

Language. 
= US Department of Defense is creating 

models for integrating data across all DoD 

facilities and for integration with VA 
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facilities. The modeling is done primarily 
using Unified Modeling Language. 

= The National Health Service in the United 
Kingdom (UK) is developing the Logical 
Record Architecture to provide models for 
interoperability across all health care 
facilities in the UK. The modeling is done 
primarily using UML. 

= Clinical Element Models — Intermountain 
Health care and General Electric have 
created a set of detailed clinical models 
using a core reference model and 
Constraint Definition Language. The 
models are free-for-use and are available 
for download from the Internet. 

= SHARE Models — CDISC is creating 
models to integrate data collected as part 
of clinical trials. 

= SMART Team — This group at Boston 
Children’s Hospital is defining standard 
application programming interfaces 
(APIs) for securely connecting with EHRs. 

= Clinical Information Modeling Initiative 
(CIMI) - This is an international 
consortium that has the goal of establishing 
a free-for-use repository of detailed 
clinical models, where the models are 
expressed in a single common modeling 
language with explicit bindings to standard 
terminologies. 

= OMOP - The OMOP Common Data 
Model enables the systematic analysis of 
disparate observational claims-based data. 
The initial approach was to transform data 
contained within claims databases into a 
common format (data model) as well as a 
common representation (terminologies, 
vocabularies, coding schemes). OMOP has 
been primarily used by the OHDSI 
community (see Sect. 7.3.2.16), but is 
being examined by other consortia seeking 
to share patient-level clinical information. 


7.4.3 Vocabularies, Terminologies, 
and Nomenclatures 


In discussing coding systems, the first step is to 
clarify the differences among a terminology, a 
vocabulary, and a nomenclature. These terms 
are often used interchangeably by creators 
of coding systems and by authors discussing 
the subject. Fortunately, although there are 
few accepted standard terminologies, there 
is a generally accepted standard about termi- 
nology: ISO Standard 1087 (Terminology— 
Vocabulary). 

Finally, we should consider the methods by 
which the terminology is maintained. Every 
standard terminology must have an ongoing 
maintenance process, or it will rapidly become 
obsolete. The process must be timely and must 
not be too disruptive to people using an older 
version of the terminology. For example, if the 
creators of the terminology choose to rename 
a code, what happens to the data previously 
recorded with that code? 


7.4.4 Specific Terminologies 


With these considerations in mind, let us survey 
some of the available controlled terminologies. 
The following are introductory descriptions of a 
few current and common terminologies. New 
terminologies appear annually, and existing 
proprietary terminologies often become pub- 
licly available. When reviewing the following 
descriptions, try to keep in mind the back- 
ground motivation for a development effort. 
All these standards are evolving rapidly, and 
one should consult the Web sites or other pri- 
mary sources for the most recent information. 


7.4.4.1 International Classification 
of Diseases and Its Clinical 
Modifications 

One of the most recognized terminologies is 
the International Classification of Diseases 
(ICD). First published in 1893, it has been 
revised at roughly 10-year intervals, first by 
the International Statistical Institute and later 
by the World Health Organization (WHO). 
The Ninth Edition (ICD-9) was published in 
1977 (World Health Organization 1977) and 
the Tenth Edition (ICD-10) in 1992 (World 
Health Organization 1992). The ICD-9 cod- 
ing system consists of a core classification 
of three-digit codes that are the minimum 
required for reporting mortality statistics 
to WHO. A fourth digit (in the first decimal 
place) provides an additional level of detail; 
usually .0 to .7 are used for more specific forms 
of the core term, .8 is usually “other,” and .9 
is “unspecified.” Codes in the ICD-10 coding 
system start with an alpha character and con- 
sist of three to seven characters. In both sys- 
tems, terms are arranged in a strict hierarchy, 
based on the digits in the code. In addition to 
diseases, ICD includes several “families” of 
terms for medical-specialty diagnoses, health 
status, disease-related events, procedures, and 
reasons for contact with health care providers. 
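
The ICD-9 code layout described above can be sketched in a few lines of Python. This is a minimal illustration rather than an official parser: the function name `parse_icd9` and the returned field names are assumptions made for this example, only purely numeric three- and four-digit codes are handled, and the reading of the fourth digit follows the "usually" conventions stated in the text, so individual codes may deviate from it.

```python
import re


def parse_icd9(code):
    """Split a numeric ICD-9 code into its three-digit core and optional fourth digit."""
    match = re.fullmatch(r"([0-9]{3})(?:\.([0-9]))?", code)
    if match is None:
        raise ValueError(f"not a three- or four-digit numeric ICD-9 code: {code!r}")
    core, fourth = match.groups()
    if fourth is None:
        detail = "core term only"
    elif fourth == "8":
        detail = "other (by convention)"
    elif fourth == "9":
        detail = "unspecified"
    else:
        # .0 to .7 usually denote more specific forms of the core term
        detail = "more specific form of the core term (by convention)"
    return {"core": core, "fourth_digit": fourth, "detail": detail}
```

For example, `parse_icd9("482.9")` reports the core category `482` with an "unspecified" fourth digit, while `parse_icd9("481")` has no fourth digit at all. Because the hierarchy is carried by the digits themselves, the parent of any four-digit code is found simply by truncating it to the core.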

In June of 2018, the World Health 
Organization (WHO) released ICD-11 to 
replace ICD-10, with the intention to make a more 
comprehensive and computable version that 
could be maintained without the need for 
major version changes in the future. ICD-11 
includes many new features, such as a semantic 
network, a polyhierarchy, a formal information 
model, and a set of “linearizations” that constrain 
the terms in strict hierarchies, designed to 
support various functions (Chute 2018). The 
US National Committee on Vital and Health 
Statistics (NCVHS) is currently considering the 
timing and method for transition to ICD-11, so 
soon after the 2015 US adoption of ICD-10. 


7.4.4.2 Current Procedural 
Terminology 

The American Medical Association developed 
the Current Procedural Terminology (CPT) in 
1966 (American Medical Association, updated 
annually) to provide a precoordinated coding 
scheme for diagnostic and therapeutic 
procedures that has since been adopted in the 
United States for billing and reimbursement. 
Like the DRG codes, CPT codes specify infor- 
mation that differentiates the codes based on 
cost. For example, there are different codes for 
pacemaker insertions, depending on whether 
the leads are “epicardial, by thoracotomy” 
(33200), “epicardial, by xiphoid approach” 
(33201), “transvenous, atrial” (33206), “trans- 
venous, ventricular” (33207), or “transvenous, 
atrioventricular (AV) sequential” (33208). CPT 
also provides information about the reasons 
for a procedure. For example, there are codes 
for arterial punctures for “withdrawal of blood 
for diagnosis” (36600), “monitoring” (36620), 
“infusion therapy” (36640), and “occlusion 
therapy” (75894). Although limited in scope 
and depth (despite containing over 8000 terms), 
CPT-4 is the most widely accepted nomencla- 
ture in the United States for reporting physi- 
cian procedures and services for federal and 
private insurance third-party reimbursement. 


7.4.4.3 Diagnostic and Statistical 
Manual of Mental Disorders 

The American Psychiatric Association pub- 
lished the Fifth Edition of the Diagnostic 
and Statistical Manual of Mental Disorders 
(DSM-5) in May 2013 (American Psychiatric 
Association Committee on Nomenclature 
and Statistics 2013). DSM-5 is the standard 
classification of mental disorders used by 
mental health professionals in the U.S. and 
contains a listing of diagnostic criteria for 
every psychiatric disorder recognized by the 
U.S. healthcare system.⁹ The previous edition, 
DSM-IV, was originally published in 1994 
and revised in 2000 as DSM-IV-TR. DSM-5 
is coordinated with ICD-10. 


7.4.4.4 SNOMED Clinical Terms 
and Its Predecessors 

Drawing from the New York Academy 
of Medicine’s Standard Nomenclature of 
Diseases and Operations (SNDO) (Plunkett 
1952; Thompson and Hayden 1961; New York 
Academy of Medicine 1961), the College 
of American Pathologists (CAP) developed 
the Standard Nomenclature of Pathology 
(SNOP) as a multiaxial system for describ- 
ing pathologic findings (College of American 
Pathologists 1971) through post-coordination 
of topographic (anatomic), morphologic, 
etiologic, and functional terms. SNOP has 
been used widely in pathology systems in the 
United States; its successor, the Systematized 
Nomenclature of Medicine (SNOMED), has 
evolved beyond an abstracting scheme to 
become a comprehensive coding system. 

Largely the work of Roger Côté and David 
Rothwell, SNOMED was first published in 
1975, was revised as SNOMED II in 1979, 
and then greatly expanded in 1993 as the 
Systematized Nomenclature of Human and 
Veterinary Medicine, SNOMED 
International (Côté and Rothwell 1993). 
Each of these versions was multi-axial; coding 
of patient information was accomplished 
through the post-coordination of terms from 
multiple axes to represent complex terms that 
did not exist as single codes in SNOMED. In 
1996, SNOMED changed from a multi-axial 
structure to a more logic-based structure 
called a Reference Terminology (Campbell 
et al. 1998; Spackman et al. 1997a, 1997b), 
intended to support more sophisticated data 
encoding processes and resolve some of the 
problems with earlier versions of SNOMED 
(see Fig. 7.2). In 1999, CAP and the NHS 
announced an agreement to merge their 
products into a single terminology called 
SNOMED Clinical Terms (SNOMED-CT) 
(Spackman 2000), containing terms for 
over 344,000 concepts (see Fig. 7.3). 
SNOMED-CT is currently maintained by 
a not-for-profit association once called the 
International Health Terminology Standards 
Development Organization (IHTSDO), but 
now simply SNOMED International. 


9 https://www.psychiatry.org/psychiatrists/practice/dsm 
(last accessed 12/2/2019) 

Despite the broad coverage of 
SNOMED-CT, it continues to allow users to 
create new, ad hoc terms through post-coordination 
of existing terms. While this increases 
expressivity, users must be careful not to 
be too expressive: because there are few rules 
about how post-coordination coding 
should be done, the same expression might 
end up being represented differently by different 
coders. For example, “acute appendicitis” 
can be coded as a single disease term, as 
a combination of a modifier (“acute”) and a 
disease term (“appendicitis”), or as a combination 
of a modifier (“acute”), a morphology 
term (“inflammation”) and a topography term 
(“vermiform appendix”). Users must therefore 
be careful, when post-coordinating terms, 
not to recreate a meaning that is satisfied by 
an already existing single code. SNOMED-CT’s 
description logic, such as the example in 
Fig. 7.2, can help guide users when selecting 
modifiers. 
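
The “acute appendicitis” example can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the plain-text labels stand in for real SNOMED-CT identifiers, the attribute names (`modifier`, `morphology`, `topography`) are taken from the wording of the example rather than from SNOMED-CT's concept model, and the tiny `DEFINITIONS` table is hypothetical. The point is only that the three codings reduce to the same set of attribute-value pairs once each precoordinated term is expanded.

```python
# Hypothetical definitions for illustration only. Plain-text labels stand in
# for real SNOMED-CT identifiers, and the attribute names are assumptions.
DEFINITIONS = {
    "appendicitis": {
        ("morphology", "inflammation"),
        ("topography", "vermiform appendix"),
    },
    "acute appendicitis": {
        ("modifier", "acute"),
        ("morphology", "inflammation"),
        ("topography", "vermiform appendix"),
    },
}


def normalize(focus=None, refinements=()):
    """Expand an optional precoordinated term and merge in explicit refinements."""
    expanded = set(DEFINITIONS[focus]) if focus is not None else set()
    return frozenset(expanded | set(refinements))


def existing_term_for(expression):
    """Return a precoordinated term whose definition equals the expression, if any."""
    for term, definition in DEFINITIONS.items():
        if frozenset(definition) == expression:
            return term
    return None


# The three codings described in the text.
single_code = normalize("acute appendicitis")
modifier_plus_disease = normalize("appendicitis", [("modifier", "acute")])
fully_decomposed = normalize(refinements=[
    ("modifier", "acute"),
    ("morphology", "inflammation"),
    ("topography", "vermiform appendix"),
])
```

All three values compare equal, and `existing_term_for(fully_decomposed)` returns `"acute appendicitis"`, which is the kind of check a data-entry tool could run to warn a coder that a post-coordinated expression merely recreates an existing single code.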


Concept: Bacterial pneumonia 
Concept Status: Current 
Fully defined by ... 
  Is a: 
    Infectious disease of lung 
    Inflammatory disorder of lower respiratory tract 
    Infective pneumonia 
    Inflammation of specific body organs 
    Inflammation of specific body systems 
    Bacterial infectious disease 
  Causative agent: 
    Bacterium 
  Pathological process: 
    Infectious disease 
  Associated morphology: 
    Inflammation 
  Finding site: 
    Lung structure 
  Onset: 
    Subacute onset 
    Acute onset 
    Insidious onset 
    Sudden onset 
  Severity: 
    Severities 
  Episodicity: 
    Episodicities 
  Course: 
    Courses 
Descriptions: 
  Bacterial pneumonia (disorder) 
  Bacterial pneumonia 
Legacy codes: 
  SNOMED: DE-10100 
  CTV3ID: X100H 


Fig. 7.3 Description-logic representation of the 
SNOMED-CT term “Bacterial Pneumonia.” The “Is a” 
attributes define bacterial pneumonia’s position in 
SNOMED-CT’s multiple hierarchy, while attributes 
such as “Causative Agent” and “Finding Site” provide 
definitional information. Other attributes such as 
“Onset” and “Severities” indicate ways in which bacterial 
pneumonia can be postcoordinated with other 
terms, such as “Acute Onset” or any of the descendants 
of the term “Severities.” “Descriptions” refers to various 
text strings that serve as names for the term, while “Legacy 
Codes” provide backward compatibility to 
SNOMED and Read Clinical Terms (NHS Centre for 
Coding and Classification 1994) 


7.4.4.5 GALEN 


In Europe, a consortium of universities, agen- 
cies, and vendors, with funding from the 
Advanced Informatics in Medicine initia- 
tive (AIM), has formed the GALEN project 
to develop standards for representing coded 
patient information (Rector et al. 1995). 
GALEN developed a reference model for 
medical concepts using a formalism called 
Structured Meta Knowledge (SMK) and 
a formal representation language called 
GALEN Representation and Integration 
Language (GRAIL). Using GRAIL, terms 
are defined through relationships to other 
terms, and grammars are provided to allow 
combinations of terms into sensible phrases. 
The reference model is intended to allow 
representation of patient information in 
a way that is independent of the language 
being recorded and of the data model used 
by an electronic medical record system. The 
GALEN developers are working closely with 
CEN TC 251 (see Sect. 8.3.2) to develop 
the content that will populate the reference 
model with actual terms. 


7.4.4.6 Logical Observation 
Identifiers Names and Codes 

An independent consortium, led by Clement 
J. McDonald and Stanley M. Huff, has created 
a naming system for tests and observations. 
The system is called Logical 
Observation Identifiers Names and Codes 
(LOINC).¹⁰ The coding system contains 
names and codes for laboratory tests, patient 
measurements, assessment instruments, 
document and section names, and radiology 
exams. Figure 7.4 shows some typical 
fully specified names for common laboratory 
tests. The standard specifies structured coded 
semantic information about each test, such 
as the substance measured and the analytical 
method used. 
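
Because a fully specified name packs its semantic parts into one colon-delimited string, it can be taken apart mechanically. The sketch below assumes the six-part, colon-separated form used in this chapter's LOINC figure (for example `GLUCOSE:MCNC:PT:BLD:QN:`), in which a trailing colon marks an empty method. The class name `LoincName`, its field names, and the function `parse_fully_specified_name` are choices made for this illustration, not part of any LOINC software.

```python
from typing import NamedTuple, Optional


class LoincName(NamedTuple):
    component: str          # the substance or entity measured (the analyte)
    property_measured: str  # e.g., MCNC for mass concentration
    timing: str             # e.g., PT for a point in time
    system: str             # the specimen, e.g., BLD, SER, UR
    scale: str              # e.g., QN for quantitative
    method: Optional[str]   # None when the name does not state a method


def parse_fully_specified_name(name):
    """Split a colon-delimited, six-part fully specified name into its parts."""
    parts = [part.strip() for part in name.split(":")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 colon-separated parts, got {len(parts)}: {name!r}")
    component, prop, timing, system, scale, method = parts
    return LoincName(component, prop, timing, system, scale, method or None)


blood_glucose = parse_fully_specified_name("GLUCOSE:MCNC:PT:BLD:QN:")
dipstick_glucose = parse_fully_specified_name("GLUCOSE:MCNC:PT:UR:SQ:TEST STRIP")
```

Here `blood_glucose.system` is `"BLD"` and `blood_glucose.method` is `None`, whereas `dipstick_glucose.method` is `"TEST STRIP"`. Comparing the parsed fields shows at a glance which part distinguishes two otherwise similar tests.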


7.4.4.7 Nursing Terminologies 

Nursing organizations and research teams 
have been extremely active in the development 
of standard coding systems for documenting 
and evaluating nursing care. One 
review counted a total of 12 separate projects 
active worldwide (Coenen et al. 2001), 
including coordination with SNOMED and 
LOINC. These projects have arisen because 
general medical terminologies fail to represent 
the kind of clinical concepts needed in nursing 
care. For example, the kinds of problems 
that appear in a physician’s problem list (such 
as “myocardial infarction” and “diabetes mellitus”) 
are relatively well represented in many 
of the terminologies that we have described, 
but the kinds of problems that appear in a 
nurse’s assessment (such as “activity intolerance” 
and “knowledge deficit related to 
myocardial infarction”) are not. Preeminent 
nursing terminologies include the North 
American Nursing Diagnosis Association 
(NANDA) codes, the Nursing Interventions 
Classification (NIC), the Nursing Outcomes 
Classification (NOC), the Georgetown Home 
Health Care Classification (HHCC), and the 
Omaha System (which covers problems, interventions, 
and outcomes). 


10 loinc.org (accessed 5/30/19) 

Despite the proliferation of standards 
for nursing terminologies, gaps remain in 
the coverage of this domain (Park and Cho 
2009). The International Council of Nurses 
and the International Medical Informatics 
Association Nursing Informatics Special 
Interest Group have worked together to 
produce the International Classification for 
Nursing Practice (ICNP®). This system uses 
a post-coordinated approach for describing 
nursing diagnoses, actions, and outcomes. 


7.4.4.8 Drug Codes 

A variety of public and commercial terminol- 
ogies have been developed to represent terms 
used for prescribing, dispensing and adminis- 
tering drugs. The WHO Drug Dictionary is an 
international classification of drugs that pro- 
vides proprietary drug names used in differ- 
ent countries, as well as all active ingredients 
and the chemical substances, with Chemical 
Abstract numbers. Drugs are classified accord- 
ing to the Anatomical-Therapeutic-Chemical 
(ATC) classification, with cross-references to 
manufacturers and reference sources. The cur- 
rent dictionary contains 25,000 proprietary 
drug names, 15,000 single ingredient drugs, 
10,000 multiple ingredient drugs, and 7000 

Fig. 7.4 Examples of codes 
in SNOMED-CT, showing some 
of the hierarchical relationships 
among bacterial pneumonia 
terms. Tuberculosis terms and 
certain terms that are included in 
SNOMED-CT for compatibility 
with other terminologies are not 
shown. Note that some terms 
such as “Congenital group A 
hemolytic streptococcal pneumonia” 
appear under multiple 
parent terms, while other terms, 
such as “Congenital 
staphylococcal pneumonia” are 
not listed under all possible 
parent terms (e.g., it is under 
“Congenital pneumonia” but not 
under “Staphylococcal 
pneumonia”). Some terms, such 
as “Pneumonic plague” and 
“Mycoplasma pneumonia” are 
not classified under Bacterial 
Pneumonia, although the 
causative agents in their 
descriptions (“Yersinia pestis” 
and “Mycoplasma 
pneumoniae”, respectively) are 
classified under “Bacterium”, the 
causative agent of Bacterial 
pneumonia 


Pneumonia 


Bacterial pneumonia 
Proteus pneumonia 
Legionella pneumonia 
Anthrax pneumonia 
Actinomycotic pneumonia 
Nocardial pneumonia 
Meningococcal pneumonia 
Chlamydial pneumonia 
Neonatal chlamydial pneumonia 
Ornithosis 
Ornithosis with complication 
Ornithosis with pneumonia 
Congenital bacterial pneumonia 
Congenital staphylococcal pneumonia 
Congenital group A hemolytic streptococcal pneumonia 
Congenital group B hemolytic streptococcal pneumonia 
Congenital Escherichia coli pneumonia 
Congenital pseudomonal pneumonia 
Chlamydial pneumonitis in all species except pig 
Feline pneumonitis 
Staphylococcal pneumonia 
Pulmonary actinobacillosis 
Pneumonia in Q fever 
Pneumonia due to Streptococcus 
Group B streptococcal pneumonia 
Congenital group A hemolytic streptococcal pneumonia 
Congenital group B hemolytic streptococcal pneumonia 
Pneumococcal pneumonia 
Pneumococcal lobar pneumonia 
AIDS with pneumococcal pneumonia 
Pneumonia due to Pseudomonas 
Congenital pseudomonal pneumonia 
Pulmonary tularemia 
Enzootic pneumonia of calves 
Pneumonia in pertussis 
AIDS with bacterial pneumonia 
Enzootic pneumonia of sheep 
Pneumonia due to Klebsiella pneumoniae 
Hemophilus influenzae pneumonia 
Porcine contagious pleuropneumonia 
Pneumonia due to pleuropneumonia-like organism 
Secondary bacterial pneumonia 
Pneumonic plague 
Primary pneumonic plague 
Secondary pneumonic plague 
Salmonella pneumonia 
Pneumonia in typhoid fever 
Infective pneumonia 
Mycoplasma pneumonia 
Enzootic mycoplasmal pneumonia of swine 
Achromobacter pneumonia 
Bovine pneumonic pasteurellosis 
Corynebacterial pneumonia of foals 
Pneumonia due to Escherichia coli 
Pneumonia due to Proteus mirabilis 

chemical substances. The dictionary now covers 
drugs from 34 countries and grows at a 
rate of about 2000 new entries per year. 

The National Drug Codes (NDC), 
produced by the U.S. Food and Drug 
Administration (FDA), is applied to all drug 
packages. It is widely used in the United 
States, but it is not as comprehensive as the 
WHO codes. The FDA designates part of the 
code based on drug manufacturer, and each 
manufacturer defines the specific codes for 
their own products. As a result, there is no 
uniform class hierarchy for the codes, and 
codes may be reused at the manufacturer’s dis- 
cretion. Due in part to the inadequacies of the 
NDC codes, pharmacy information systems 
typically purchase proprietary terminologies 
from knowledge base vendors. These termi- 
nologies map to NDC, but provide additional 
information about therapeutic classes, aller- 
gies, ingredients, and forms. 

The need for standards for drug termi- 
nologies led to a collaboration between the 
FDA, the U.S. National Library of Medicine 
(NLM), the Veterans Administration (VA), 
and the pharmacy knowledge base vendors 
that has produced a representational model 
for drug terms called RxNorm. The NLM 
provides RxNorm to the public as part of the 
Unified Medical Language System (UMLS) 
(see below) to support mapping between NDC 
codes, the VA’s National Drug File (VANDF) 
and various proprietary drug terminologies 
(Nelson et al. 2002). RxNorm currently con- 
tains 14,000 terms. 


7.4.4.9 Medical Subject Headings 

The Medical Subject Headings (MeSH), 
maintained by the NLM (updated annu- 
ally), is the terminology by which the world 
medical literature is indexed. MeSH arranges 
terms in a structure that breaks from the strict 
hierarchy used by most other coding schemes. 
Terms are organized into hierarchies and may 
appear in multiple places in the hierarchy 
(Fig. 7.5). Although it is not generally used 
as a direct coding scheme for patient informa- 
tion, it plays a central role in the UMLS. 
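
One way to picture a term that appears in multiple places is as a node with more than one parent. The sketch below uses a handful of the pneumonia headings from this chapter's MeSH figure. The `PARENTS` table and the function `paths_to_root` are assumptions made for this illustration, and the model is deliberately simplified: it treats each heading as a single node, whereas the figure shows that a heading need not have the same children at every location.

```python
# child -> list of parents; a heading may have more than one parent.
# Only a few headings are included, chosen for illustration.
PARENTS = {
    "Lung Diseases": ["Respiratory Tract Diseases"],
    "Respiratory Tract Infections": ["Respiratory Tract Diseases"],
    "Pneumonia": ["Lung Diseases", "Respiratory Tract Infections"],
    "Pneumonia, Lobar": ["Pneumonia"],
}


def paths_to_root(term):
    """Return every path from a top-level heading down to the given term."""
    parents = PARENTS.get(term, [])
    if not parents:
        return [[term]]  # a heading with no parents is a root
    return [
        path + [term]
        for parent in parents
        for path in paths_to_root(parent)
    ]


lobar_paths = paths_to_root("Pneumonia, Lobar")
```

`lobar_paths` contains two paths, one through “Lung Diseases” and one through “Respiratory Tract Infections”. A strict hierarchy would allow only one.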


7.4.4.10 RadLex 


RadLex is a terminology produced by the 
Radiological Society of North America (RSNA). 
With more than 30,000 terms, RadLex is 
intended to be a unified language of radiology 
terms for standardized indexing and retrieval 
of radiology information resources. RadLex 
includes the names of anatomic parts, radiology 
devices, imaging exams and procedure 
steps performed in radiology. Given the scope 
of the radiology domain, many RadLex terms 
overlap with SNOMED-CT and LOINC. 


7.4.4.11 Bioinformatics 
Terminologies 

For the most part, the terminologies discussed 
above fail to represent the levels of 
detail needed by biomolecular researchers. 
This has become a more acute problem with 
the advent of bioinformatics and the sequencing 
of organism genomes (see Chap. 11). 
As in other domains, researchers have been 
forced to develop their own terminologies. 
As these researchers have begun to exchange 
information, they have recognized the need 
for standard naming conventions as well as 
standard ways of representing their data with 
terminologies. Prominent efforts to unify 
naming systems include the Gene Ontology 
(GO) (Harris et al. 2004) from the Gene 
Ontology Consortium and the gene naming 
database of the HUGO Gene Nomenclature 
Committee (HGNC). A related resource is 
the RefSeq database of the National Center 
for Biotechnology Information (NCBI), which 
contains identifiers for reference sequences. 


7.4.4.12 Unified Medical Language 
System 

In 1986, Donald Lindberg and Betsy 
Humphreys, at the NLM, began working with 
several academic centers to identify ways to 
construct a resource that would bring together 
and disseminate controlled medical terminologies. 
An experimental version of the UMLS 
was first published in 1989 (Humphreys 
1990); the UMLS has been updated annually 
since then. Its principal component is the 
Metathesaurus, which contains over 8.9 million 
terms collected from over 160 different 
sources (including many of those that we have 
discussed), and attempts to relate synonymous 
and similar terms from across the different 
sources into over 2.6 million concepts 
(Fig. 7.6). Figure 7.7 lists the preferred 
names for many of the pneumonia concepts 
in the Metathesaurus; Fig. 7.8 shows how 
like terms are grouped into concepts and are 
tied to other concepts through semantic relationships. 
Figure 7.9 shows some of the 
information available in the Unified Medical 
Language System about selected pneumonia 
concepts. 


Fig. 7.5 Examples of 
common laboratory test terms as 
they are encoded in LOINC. The 
major components of the fully 
specified name are in separate 
columns and consist of the 
analyte, the property (e.g., Mcnc 
mass concentration, Scnc 
substance concentration, Acnc 
arbitrary concentration, Vfr 
volume fraction, EntMass entitic 
mass, EntVol entitic volume, Vel 
velocity, and Ncnc number 
concentration), the timing (Pt 
point in time), the system 
(specimen), and the method 
(Ord ordinal, Qn quantitative) 


Test name                           Fully specified name 
Blood glucose                       GLUCOSE:MCNC:PT:BLD:QN: 
Plasma glucose                      GLUCOSE:MCNC:PT:PLAS:QN: 
Serum glucose                       GLUCOSE:MCNC:PT:SER:QN: 
Urine glucose concentration         GLUCOSE:MCNC:PT:UR:QN: 
Urine glucose by dip stick          GLUCOSE:MCNC:PT:UR:SQ:TEST STRIP 
Glucose tolerance test at 2 hours   GLUCOSE.2H POST 100 G GLUCOSE PO:MCNC:PT:PLAS:QN: 
Ionized whole blood calcium         CALCIUM.FREE:SCNC:PT:BLD:QN: 
Serum or plasma ionized calcium     CALCIUM.FREE:SCNC:PT:SER/PLAS:QN: 
24-hour calcium excretion           CALCIUM.TOTAL:MRAT:24H:UR:QN: 
Whole blood total calcium           CALCIUM.TOTAL:SCNC:PT:BLD:QN: 
Serum or plasma total calcium       CALCIUM.TOTAL:SCNC:PT:SER/PLAS:QN: 
Automated hematocrit                HEMATOCRIT:NFR:PT:BLD:QN:AUTOMATED COUNT 
Manual spun hematocrit              HEMATOCRIT:NFR:PT:BLD:QN:SPUN 
Urine erythrocyte casts             ERYTHROCYTE CASTS:ACNC:PT:URNS:SQ:MICROSCOPY.LIGHT 
Erythrocyte MCHC                    ERYTHROCYTE MEAN CORPUSCULAR HEMOGLOBIN CONCENTRATION:MCNC:PT:RBC:QN:AUTOMATED COUNT 
Erythrocyte MCH                     ERYTHROCYTE MEAN CORPUSCULAR HEMOGLOBIN:MCNC:PT:RBC:QN:AUTOMATED COUNT 
Erythrocyte MCV                     ERYTHROCYTE MEAN CORPUSCULAR VOLUME:ENTVOL:PT:RBC:QN:AUTOMATED COUNT 
Automated blood RBC                 ERYTHROCYTES:NCNC:PT:BLD:QN:AUTOMATED COUNT 
Manual blood RBC                    ERYTHROCYTES:NCNC:PT:BLD:QN:MANUAL COUNT 
ESR by Westergren method            ERYTHROCYTE SEDIMENTATION RATE:VEL:PT:BLD:QN:WESTERGREN 
ESR by Wintrobe method              ERYTHROCYTE SEDIMENTATION RATE:VEL:PT:BLD:QN:WINTROBE 


7.5 Data Interchange Standards 

The recognition of the need to interconnect 
health care applications led to the develop- 
ment and enforcement of data interchange 
standards. The conceptualization stage began 
in 1980 with discussions among individu- 
als in an organization called the American 
Association for Medical Systems and 
Informatics (AAMSI). In 1983, an AAMSI 
task force was established to pursue those 
interests in developing standards. 

The development phase was multifaceted. 
The AAMSI task force became subcommittee 
E31.11 of the ASTM and developed and published 
ASTM standard E1238 for the exchange 
of clinical-laboratory data. Two other groups 
were formed to develop standards, each with a 
slightly different emphasis: HL7 and the Institute 
of Electrical and Electronics Engineers 
(IEEE) Medical Data Interchange (“Medix”) 
Standard. The American College of Radiology 
(ACR) joined with the National Electrical 
Manufacturers Association (NEMA) to 
develop a standard for the transfer of image 
data. Two other groups developed related 
standards independent of the biomedical 
informatics community: (1) ANSI X12 for 
the transmission of commonly used business 
transactions, including health care claims and 
benefit data, and (2) National Council for 
Prescription Drug Programs (NCPDP) for 
the transmission of third-party drug claims. 


Respiratory Tract Diseases 
  Lung Diseases 
    Pneumonia 
      Bronchopneumonia 
      Pneumonia, Aspiration 
      Pneumonia, Lipid 
      Pneumonia, Lobar 
      Pneumonia, Mycoplasma 
      Pneumonia, Pneumocystis carinii 
      Pneumonia, Rickettsial 
      Pneumonia, Staphylococcal 
      Pneumonia, Viral 
    Lung Diseases, Fungal 
      Pneumonia, Pneumocystis carinii 
  Respiratory Tract Infections 
    Pneumonia 
      Pneumonia, Lobar 
      Pneumonia, Mycoplasma 
      Pneumonia, Pneumocystis carinii 
      Pneumonia, Rickettsial 
      Pneumonia, Staphylococcal 
      Pneumonia, Viral 
    Lung Diseases, Fungal 
      Pneumonia, Pneumocystis carinii 


Fig. 7.6 Partial tree structure for the Medical Subject 
Headings showing pneumonia terms. Note that 
terms can appear in multiple locations, although they 
may not always have the same children, implying that 
they have somewhat different meanings in different contexts. 
For example, Pneumonia means “lung inflammation” 
in one context (line 3) and “lung infection” in 
another (line 16) 


Fig. 7.7 Growth of the UMLS. The UMLS Metathesaurus 
contains 3.85 million concepts and 14.6 million 
unique concept names from 210 source vocabularies. 
The content continues to grow dynamically in response 
to user needs (Source: U.S. National Library of 
Medicine) 


7.5.1 General Concepts 
and Requirements 


The purpose of a data-interchange standard 
is to permit one system, the sender, to trans- 
mit to another system, the receiver, all the 
data required to accomplish a specific com- 
munication, or transaction set, in a precise, 
unambiguous fashion. To complete this task 
successfully, both systems must know what 
format and content is being sent and must 
understand the words or terminology, as well 
as the delivery mode. 

A communications model, called the 
Open Systems Interconnection (OSI) refer- 
ence model (ISO 7498-1), has been defined 
by the ISO (see > Chap. 5 and the discussion 
of software for network communications). It 
describes seven levels of requirements or spec- 
ifications for a communications exchange: 
physical, data link, network, transport, ses- 
sion, presentation, and application (Rose 
1989; Stallings 1987; Tanenbaum 1987). Level 
7, the application level, deals primarily with 
the semantics or data-content specification 
of the transaction set or message. For the 
data-interchange standard, HL7 requires the 
definition of all the data elements to be sent in 

Fig. 7.8 Some of the 
bacterial pneumonia concepts 
in the Unified Medical 
Language System 
Metathesaurus 


C0004626: Pneumonia, Bacterial 
C0023241: Legionnaires’ Disease 
C0032286: Pneumonia due to other specified bacteria 
C0032308: Pneumonia, Staphylococcal 
C0152489: Salmonella pneumonia 
C0155858: Other bacterial pneumonia 
C0155859: Pneumonia due to Klebsiella pneumoniae 
C0155860: Pneumonia due to Pseudomonas 
C0155862: Pneumonia due to Streptococcus 
C0155865: Pneumonia in pertussis 
C0155866: Pneumonia in anthrax 
C0238380: PNEUMONIA, KLEBSIELLA AND OTHER GRAM NEGATIVE BACILLI 
C0238381: PNEUMONIA, TULAREMIC 
C0242056: PNEUMONIA, CLASSIC PNEUMOCOCCAL LOBAR 
C0242057: PNEUMONIA, FRIEDLAENDER BACILLUS 
C0275977: Pneumonia in typhoid fever 
C0276026: Hemophilus influenzae pneumonia 
C0276039: Pittsburgh pneumonia 
C0276071: Achromobacter pneumonia 
C0276080: Pneumonia due to Proteus mirabilis 
C0276089: Pneumonia due to Escherichia coli 
C0276523: AIDS with bacterial pneumonia 
C0276524: AIDS with pneumococcal pneumonia 
C0339946: Pneumonia with tularemia 
C0339947: Pneumonia with anthrax 
C0339952: Secondary bacterial pneumonia 
C0339953: Pneumonia due to Escherichia coli 
C0339954: Pneumonia due to proteus 
C0339956: Typhoid pneumonia 
C0339957: Meningococcal pneumonia 
C0343320: Congenital pneumonia due to staphylococcus 
C0343321: Congenital pneumonia due to group A hemolytic streptococcus 
C0343322: Congenital pneumonia due to group B hemolytic streptococcus 
C0343323: Congenital pneumonia due to Escherichia coli 
C0343324: Congenital pneumonia due to pseudomonas 
C0348678: Pneumonia due to other aerobic Gram-negative bacteria 
C0348680: Pneumonia in bacterial diseases classified elsewhere 
C0348801: Pneumonia due to streptococcus, group B 
C0349495: Congenital bacterial pneumonia 
C0349692: Lobar (pneumococcal) pneumonia 
C0375322: Pneumococcal pneumonia {Streptococcus pneumoniae pneumonia} 
C0375323: Pneumonia due to Streptococcus, unspecified 
C0375324: Pneumonia due to Streptococcus Group A 
C0375326: Pneumonia due to other Streptococcus 
C0375327: Pneumonia due to anaerobes 
C0375328: Pneumonia due to Escherichia coli 
C0375329: Pneumonia due to other Gram-negative bacteria 
C0375330: Bacterial pneumonia, unspecified 


response to a specific task, such as the admis- 
sion of a patient to a hospital. In many cases, 
the data content requires a specific terminol- 
ogy that can be understood by both sender 
and receiver. 

Presentation, the sixth OSI level, 
addresses the syntax of 
the message, or how the data are formatted. 
There are both similarities and differences at 
this level across the various standards bodies. 
Two philosophies are used for defining syntax: 
one proposes a position-dependent format; 
the other uses a tagged-field format. In the 
position-dependent format, the data content is 
specified and defined by position. In the 
tagged-field format, each data element carries an 
identifying tag, so its meaning does not depend 
on where it appears in the message. 
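
The difference between the two syntax philosophies can be sketched in Python. The message layouts, delimiters, and field names below are hypothetical and are not drawn from HL7, ASTM, or any other standard; they exist only to contrast a format whose meaning comes from position with one whose meaning comes from tags.

```python
# Hypothetical layout: the meaning of each value is fixed by its position.
FIELD_ORDER = ("patient_id", "family_name", "birth_date")


def parse_positional(message, delimiter="|"):
    """Interpret each value by where it appears in the message."""
    values = message.split(delimiter)
    if len(values) != len(FIELD_ORDER):
        raise ValueError(f"expected {len(FIELD_ORDER)} fields, got {len(values)}")
    return dict(zip(FIELD_ORDER, values))


def parse_tagged(message, delimiter=";"):
    """Interpret each value by the tag attached to it; order is irrelevant."""
    fields = {}
    for item in message.split(delimiter):
        tag, _, value = item.partition("=")
        fields[tag] = value
    return fields


positional = parse_positional("12345|DOE|19600101")
tagged = parse_tagged("birth_date=19600101;patient_id=12345;family_name=DOE")
```

Both calls yield the same dictionary. The trade-off is visible in the inputs: the positional message is compact, but an empty value must still occupy its slot (as in `12345||19600101`), while the tagged message is longer but tolerates reordering and omission.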

The remaining OSI levels—session, trans- 
port, network, data link, and physical—gov- 
ern the communications and networking 

Fig. 7.9 Some of the 
information available in the 
Unified Medical Language 
System about selected 
pneumonia concepts. Concepts’ 
preferred names are shown in 
italics. Sources are identifiers for 
the concept in other 
terminologies. Synonyms are 
names other than the preferred 
name. ATX is an associated 
Medical Subject Heading 
expression that can be used for 
Medline searches. The remaining 
fields (Parent, Child, Broader, 
Narrower, Other, and Semantic) 
show relationships among 
concepts in the Metathesaurus. 
Note that concepts may or may 
not have hierarchical relations 
to each other through 
Parent-Child, Broader-Narrower, 
and Semantic 
(is-a and inverse is-a) relations. 
Note also that Pneumonia, 
Streptococcal and Pneumonia 
due to Streptococcus are treated 
as separate concepts, as are 
Pneumonia in Anthrax and 
Pneumonia, Anthrax 

Bacterial pneumonia
  Source: CSP93/PT/259S.5280; DOR27/DT/U000523;
    ICD91/PT/482.9; ICD91/IT/482.9
  Parent: Bacterial Infections; Pneumonia; Influenza with Pneumonia
  Child: Pneumonia, Mycoplasma
  Narrower: Pneumonia, Lobar; Pneumonia, Rickettsial; Pneumonia,
    Staphylococcal; Pneumonia due to Klebsiella pneumoniae;
    Pneumonia due to Pseudomonas; Pneumonia due to Hemophilus
    influenzae
  Other: Klebsiella pneumoniae, Streptococcus pneumoniae

Pneumonia, Lobar
  Source: ICD91/IT/481; MSH94/PM/D011018; MSH94/MH/D011018;
    SNM2/RT/M-40000; ICD91/PT/481; SNM2/PT/D-0164;
    DXP92/PT/U000473; MSH94/EP/D011018;
    INS94/MH/D011018; INS94/SY/D011018
  Synonym: Pneumonia, diplococcal
  Parent: Bacterial Infections; Influenza with Pneumonia
  Broader: Bacterial Pneumonia; Inflammation
  Other: Streptococcus pneumoniae
  Semantic: inverse-is-a: Pneumonia
    has-result: Pneumococcal Infections

Pneumonia, Staphylococcal
  Source: ICD91/PT/482.4; ICD91/IT/482.4; MSH94/MH/D011023;
    MSH94/PM/D011023; MSH94/EP/D011023; SNM2/PT/D-017X;
    INS94/MH/D011023; INS94/SY/D011023
  Parent: Bacterial Infections; Influenza with Pneumonia
  Broader: Bacterial Pneumonia
  Semantic: inverse-is-a: Pneumonia; Staphylococcal Infections

Pneumonia, Streptococcal
  Source: ICD91/IT/482.3
  Other: Streptococcus pneumoniae

Pneumonia due to Streptococcus
  Source: ICD91/PT/482.3
  ATX: Pneumonia AND Streptococcal Infections AND NOT Pneumonia, Lobar
  Parent: Influenza with Pneumonia

Pneumonia in Anthrax
  Source: ICD91/PT/484.5; ICD91/IT/022.1; ICD91/IT/484.5
  Parent: Influenza with Pneumonia
  Broader: Pneumonia in other infectious diseases classified elsewhere
  Other: Pneumonia, Anthrax

Pneumonia, Anthrax
  Source: ICD91/IT/022.1; ICD91/IT/484.5
  Other: Pneumonia in Anthrax


protocols and the physical connections made 
to the system. Obviously, some understanding 
at these lower levels is necessary before a link- 
age between two systems can be successful. 
Increasingly, standards groups are defining 
scenarios and rules for using various proto- 
cols at these levels, such as TCP/IP. Much of 
the labor in making existing standards work 
lies in these lower levels. 

Typically, a transaction set or message is 
defined for a particular event, called a trig- 
ger event. This trigger event, such as a hos- 
pital admission, then initiates an exchange of 
messages. The message is composed of several 


data segments; each data segment consists of 
one or more data fields. Data fields, in turn, 
consist of data elements that may be one of 
several data types. The message must identify 
the sender and the receiver, the message num- 
ber for subsequent referral, the type of mes- 
sage, special rules or flags, and any security 
requirements. If a patient is involved, a data 
segment must identify the patient, the circum- 
stances of the encounter, and additional infor- 
mation as required. A reply from the receiving 
system to the sending system is mandatory in 
most circumstances and completes the com- 
munications set. 
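
The message anatomy described above (a trigger event, a header identifying sender and receiver, data segments made of fields, and a mandatory reply) can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, segment names, and identifiers are hypothetical and do not reproduce the syntax of HL7 or any other standard.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    name: str          # hypothetical segment label, e.g. patient identification
    fields: list[str]  # each data field carries one or more data elements


@dataclass
class Message:
    sender: str
    receiver: str
    control_id: str    # message number kept for subsequent referral
    message_type: str
    trigger_event: str
    segments: list[Segment] = field(default_factory=list)


def build_reply(msg: Message, reply_id: str) -> Message:
    """Build the acknowledgment that completes the communications set."""
    ack = Segment("ACKNOWLEDGMENT", ["accepted", msg.control_id])
    return Message(sender=msg.receiver, receiver=msg.sender,
                   control_id=reply_id, message_type="ACK",
                   trigger_event=msg.trigger_event, segments=[ack])


# A hospital admission (the trigger event) initiates the exchange.
admission = Message(
    "ADT_SYSTEM", "LAB_SYSTEM", "MSG0001", "ADT", "hospital-admission",
    [Segment("PATIENT", ["12345", "SMITH^JOHN"]),
     Segment("ENCOUNTER", ["inpatient", "ward-4"])],
)
reply = build_reply(admission, "MSG0002")
```

The reply swaps sender and receiver and echoes the original message number, which is how the sending system can match an acknowledgment to the message it sent.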

It is important to understand that the sole
purpose of the data-interchange standard is to 
allow data to be sent from the sending system 
to the receiving system; the standard is not 
intended to constrain the application system 
that uses those data. Application indepen- 
dence permits the data-interchange standard 
to be used for a wide variety of applications. 
However, the standard must ensure that it 
accommodates all data elements required by 
the complete application set. 


7.5.2 Specific Data Interchange Standards


As health care increasingly depends on 
the connectivity within an institution, an 
enterprise, an integrated delivery system, a 
geographic system, or even a national inte- 
grated system, the ability to interchange 
data in a seamless manner becomes criti- 
cally important. The economic benefits of 
data-interchange standards are immediate 
and obvious. Consequently, it is in this area 
of healthcare standards that most effort has 
been expended. All of the SDOs in health 
care have some development activity in data- 
interchange standards. 

The following sections summarize many 
of the current standards for data-interchange. 
Examples are provided to give you a sense 
of the technical issues that arise in defining 
a data-exchange standard, but details are 
beyond the scope of this effort. In fact, the 
pace of change is so great that many of the 
referenced standards will have been improved 
at the time of publication. Rather than pro- 
viding an exhaustive list of standards, links 
to the standards and standards platforms will 
provide access to the most recent technical 
information and its implementation. 


7.5.2.1 HL7 Standards 

HL7 has provided standards that have been 
adopted world-wide. In the United States and 
in many other countries, these standards are 
codified in legislation and in regulation. The 
changes in these standards most often reflect
new governmental policies, new science (both
clinical and pre-clinical), new care paradigms,
and new payment models.


Compendium of HL7 Standards 
= Introduction to HL7 Standards 
HL7 provides a framework, as well as related 
standards, for the exchange, integration, 
sharing, and retrieval of electronic health 
information. These standards define how 
information is packaged and communicated 
from one party to another, setting the lan- 
guage, structure and data types required for 
seamless integration between systems. HL7 
standards support clinical practice and the 
management, delivery, and evaluation of 
health services, and are recognized as the most 
commonly used in the world. 

> http://www.hl7.org/implement/standards/index.cfm?ref=nav


= HL7 Primary Standards 
Primary standards are the most widely
implemented standards and are fundamental
for system integrations, interoperability,
and compliance. The most frequently used
standards are defined in this category.

> http://www.hl7.org/implement/standards/product_section.cfm?section=1&ref=nav


= HL7 Foundational Standards 
Foundational standards define the fundamen- 
tal tools and building blocks used to create the 
standards, and the technology infrastructure 
that implementers of HL7 standards must 
manage. 

> http://www.hl7.org/implement/standards/product_section.cfm?section=2&ref=nav


= HL7 Clinical & Administrative Domains 
Messaging and document standards for clini- 
cal specialties and groups are found in this 
section. These standards are usually imple- 
mented once primary standards for the orga- 
nization are operational. 

> http://www.hl7.org/implement/standards/product_section.cfm?section=3


= HL7 EHR Profiles
These standards provide functional models
and profiles that enable the constructs for
management of electronic health records.
The EHR System Functional Model (EHR-S FM)
outlines important features and functions that
should be contained in an EHR system.

> http://www.hl7.org/implement/standards/product_section.cfm?section=4


= HL7 Implementation Guides 
Implementation guides and their supporting 
documents are intended to be used in conjunc- 
tion with an existing standard. The support- 
ing documents serve as supplemental material 
for a parent standard. Implementation guides 
provide the road map for transforming the 
technical standard into an effective working 
solution. 

> http://www.hl7.org/implement/standards/product_section.cfm?section=5


= HL7 Standards Rules & References 
These references provide the technical speci- 
fications, programming structures and guide- 
lines for software and standards development. 
They are not stand-alone solutions, but rather
provide support for a standard or for a family
of standards.

> http://www.hl7.org/implement/standards/product_section.cfm?section=6


= HL7 Current Projects & Education 
This is a resource for Standards for Trial Use 
(STUs) and for ongoing projects and stan- 
dards. The link also provides helpful resources 
and tools to further supplement understand- 
ing and adoption of HL7 standards. 

> http://www.hl7.org/implement/standards/product_section.cfm?section=7


= HL7 Standards Master Grid 

This is a convenient navigation tool for all
HL7 standards. HL7 encompasses
the complete life cycle of a standards specification,
including the development, adoption,
market recognition, utilization, and adherence.
There is also an explanation of the IP
Policy that provides more information about
how members and non-members can use the
standard.

> http://www.hl7.org/implement/standards/product_matrix.cfm?ref=nav


Clinical Document Architecture 

Since its initial development in 2001, the 
Clinical Document Architecture (CDA) 
standard has become globally adopted for a 
broad range of use (Ferranti et al. 2006). Now 
an ISO standard and advanced to Release 
2, CDA is a document markup standard for 
the structure and semantics of an exchanged 
“clinical document.” CDA is built upon the 
RIM and relies upon reusable templates for 
its ease of implementation. A CDA document 
is a defined and complete information object 
that can exist outside of a message and can 
include text, images, sounds, and other mul- 
timedia content. CDA supports the following 
features: persistence, stewardship, potential 
for authentication, context, wholeness, and 
human-readability. In the US, CDA is one 
of the core components of data exchange for 
Meaningful Use. The competing implementa- 
tion processes for CCD profile development 
were successfully harmonized into a broadly 
adopted Consolidated Continuity of Care 
Document (CCCD). 
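
Because a CDA document is an XML object that carries human-readable narrative alongside its structure, even a generic XML parser can recover the readable text. The sketch below is a heavily abbreviated fragment written for illustration; it omits header elements that a real CDA document carries (identifiers, author, custodian, and so on) and should not be read as a conformant instance. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# CDA documents use the HL7 Version 3 XML namespace.
NS = {"hl7": "urn:hl7-org:v3"}

# Abbreviated, non-conformant fragment for illustration only.
DOC = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <title>Discharge Summary</title>
  <component><structuredBody><component><section>
    <title>Problems</title>
    <text>Bacterial pneumonia, resolving.</text>
  </section></component></structuredBody></component>
</ClinicalDocument>"""


def section_narratives(xml_text: str) -> dict:
    """Map each section title to its human-readable narrative text."""
    root = ET.fromstring(xml_text)
    result = {}
    for section in root.iterfind(".//hl7:section", NS):
        title = section.findtext("hl7:title", namespaces=NS)
        result[title] = section.findtext("hl7:text", namespaces=NS)
    return result


narratives = section_narratives(DOC)
```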

In order to ease the path to implementa- 
tion of CDA, HL7 has developed a more nar- 
rowly defined specification called greenCDA, 
which limits the requirements of the RIM, 
provides greater ease of template composi- 
tion, and consumes much less bandwidth for 
transmission. An additional effort to promote 
CDA adoption was achieved with the release 
of the CDA Trifolia repository, which, in 
addition to offering a library of templates, 
includes tooling for template modification as 
well as a template-authoring language. This 
has enabled the adoption of native CDA for
exchange of laboratory data, clinical summaries,
and electronic prescriptions, as well as
for clinical decision support.

HL7 Fast Healthcare Interoperability Resources

FHIR (see > Sect. 7.2.3) is a new highly inno- 
vative approach to standards development, 
first introduced by HL7 in 2011. FHIR was 
created in order to overcome the complexity of 
development based upon the HL7 Reference 
Information Model (RIM), without losing the 
successful interoperability that model-driven 
data interchange demands. At the same time, 
FHIR delivers greater ease of implementation 
than other high-level development processes. 
It is designed to be compatible with legacy 
systems that conform to V2 and/or V3 mes- 
saging, and it supports system-development 
utilizing broadly deployed Clinical Document 
Architecture (CDA) platforms and ubiqui- 
tous templated CDA implementations, such 
as Consolidated CDA. 

Although FHIR is built upon more than 
a decade of the development and refinement 
of the RIM, FHIR utilizes unique meth- 
odologies, artifacts, tooling, and publishing 
approach. While FHIR is based upon the 
RIM, it does not require implementers to 
know the RIM or know the modeling lan- 
guage upon which it was built. FHIR defines 
a limited set of data models (or resources) as 
XML or JSON objects, but provides exten- 
sion mechanisms for creating any elements 
which are incomplete or missing. The result- 
ing structures are native XML/JSON objects 
which do not require knowledge of the RIM 
abstraction in order to be implemented. 
Fundamentally, each clinical concept is cre- 
ated as a single resource, which need not 
change over time. The resources remain as 
the smallest unit of abstraction, and the cre- 
ation of each resource is based upon RESTful 
design principles. Base FHIR resources can be 
further refined by creating profiles which con- 
strain existing data elements or add elements 
as extensions. 
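
As a rough illustration of a resource expressed as a native JSON object, the sketch below builds a small Patient resource that carries one extension and checks it against a simplified stand-in for a profile. The extension URL and the required-element list are assumptions made for this example; real FHIR profiles are expressed as formal structure definitions, not as a Python list.

```python
import json

# Illustrative Patient resource; the extension URL is a made-up example.
PHARMACY_URL = "http://example.org/fhir/StructureDefinition/preferred-pharmacy"
PATIENT_JSON = """
{
  "resourceType": "Patient",
  "id": "example",
  "name": [{"family": "Smith", "given": ["John"]}],
  "birthDate": "1970-01-01",
  "extension": [
    {"url": "http://example.org/fhir/StructureDefinition/preferred-pharmacy",
     "valueString": "Main Street Pharmacy"}
  ]
}
"""

# Hypothetical profile, reduced to a list of elements it makes mandatory.
PROFILE_REQUIRED = ["name", "birthDate", "gender"]


def extension_value(resource: dict, url: str):
    """Return the value of the extension identified by url, or None."""
    for ext in resource.get("extension", []):
        if ext.get("url") == url:
            # An extension holds its value under a key beginning with 'value'.
            return next((v for k, v in ext.items() if k.startswith("value")), None)
    return None


def missing_elements(resource: dict, required: list) -> list:
    """List the profile-required elements that the resource lacks."""
    return [name for name in required if name not in resource]


patient = json.loads(PATIENT_JSON)
```

No knowledge of the RIM is needed to read or produce this object; the extension adds an element the base resource lacks, and the profile check narrows what counts as acceptable.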

Inherently, development can proceed
around a service-oriented architecture (SOA)
model, which will support cloud-based
applications. While a RESTful framework is
enabled, it is not required. In addition, a
well-defined ontology persists in the background,
but knowledge of the terminology is not
necessary for implementation. Fundamental to
FHIR, all resources, as well as all resource
attributes, have a free-text expression, an encoded
expression, or both. Thus, FHIR supports a
human-readable format, which is so valuable
to the implementations supported by CDA.

Finally, FHIR is built with new data types, 
conformant with the familiar ISO 21090 for- 
mat. As such, these data types are far simpler 
to use, with much of the complexity captured 
in the extensions. This allows mapping to 
other models, including those developed using 
archetypes, upon which the CEN format for 
electronic medical records is predicated. This 
allows an inherently much smaller library of 
resources, all mapped to the HL7 RIM, and 
which can be maintained in perpetuity. FHIR 
developers have estimated that fewer than 150 
such resources will define all of health care. 
Other concepts can be described as extensions. 

This provides a unique opportunity for 
creation of both new applications in mature 
computing environments and for low and 
medium resource countries without legacy 
implementations. Nonetheless, migrations 
from V2 or V3 environments to FHIR imple- 
mentations are achievable through native 
tooling. 

FHIR APIs and resources can be implemented
in SMART applications, thereby
extending the utility of EHR data to support
externalized clinical decision support, data
visualization, and the combination of EHR data
with data from remote monitoring devices. SMART
Health IT is an open, standards-based technology
platform that enables innovators to create apps
that seamlessly and securely run across the
healthcare system (> https://smarthealthit.org/).
SMART was created at the Boston
Children’s Hospital through a grant from
the Office of the National Coordinator for
Healthcare IT. SMART on FHIR enables
cross-platform and intersystem exchange of
data by enabling ISO standards-based solutions
for security and authentication.

With Clinical Decision Support Hooks 
(CDS-Hooks; Spineth et al. 2018), triggers can 
be built into the EHR workflow and trigger 
external CDS services. One example application
developed with support from the Centers
for Disease Control is an opioid medication
management tool which can be automatically 
launched when a physician orders an opioid 
medication. This tool provides guidance to 
the provider based on the patient’s history of 
opioid prescriptions as well as the current pre- 
scriber’s intended order. 
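
The flow described above can be sketched as a small external service that receives a hook request from the EHR and answers with guidance cards. The request and card fields follow the general shape of the CDS Hooks specification (a hook name, a context, optional prefetched data, and cards with a summary, an indicator, and a source), but the drug list, the prefetch key `priorOpioids`, and the service logic are assumptions made for illustration; this is not the CDC tool.

```python
# Hypothetical, deliberately short list used only for this illustration.
OPIOID_NAMES = {"oxycodone", "hydrocodone", "morphine"}


def opioid_service(request: dict) -> dict:
    """Return guidance cards when a draft order appears to be an opioid."""
    if request.get("hook") != "order-select":
        return {"cards": []}
    entries = request.get("context", {}).get("draftOrders", {}).get("entry", [])
    drugs = [
        e.get("resource", {}).get("medicationCodeableConcept", {}).get("text", "").lower()
        for e in entries
    ]
    if not any(name in drug for drug in drugs for name in OPIOID_NAMES):
        return {"cards": []}
    # Assumed prefetch key holding a bundle of the patient's earlier opioid orders.
    prior = request.get("prefetch", {}).get("priorOpioids", {}).get("total", 0)
    return {"cards": [{
        "summary": f"Opioid order: {prior} prior opioid prescription(s) on record",
        "indicator": "warning" if prior else "info",
        "source": {"label": "Example opioid guidance service"},
    }]}


SAMPLE_REQUEST = {
    "hook": "order-select",
    "hookInstance": "d1577c69-dfbe-44ad-ba6d-3e05e953b2ea",
    "context": {
        "userId": "Practitioner/example",
        "patientId": "1288992",
        "selections": ["MedicationRequest/draft-1"],
        "draftOrders": {"resourceType": "Bundle", "entry": [{"resource": {
            "resourceType": "MedicationRequest", "id": "draft-1",
            "medicationCodeableConcept": {"text": "Oxycodone 5 mg tablet"}}}]},
    },
    "prefetch": {"priorOpioids": {"resourceType": "Bundle", "total": 2}},
}
cards = opioid_service(SAMPLE_REQUEST)["cards"]
```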

Link to CDSHooks: » https://cds-hooks.hl7.org/

Most often, HL7 is recognized for its mes- 
saging standards, but there is a large contri- 
bution to technical specifications that support 
the development and implementation of these 
messaging standards. 


7.5.2.2 American Dental Association Standards

In 1983, the American Dental Association 
(ADA) committee MD 156 became an ANSI- 
accredited committee responsible for all spec- 
ifications for dental materials, instruments, 
and equipment. In 1992, a Task Group of the 
ASC MD 156 was established to initiate the 
development of technical reports, guidelines, 
and standards on electronic technologies 
used in dental practice. These include digital 
radiography, digital intraoral video cameras, 
digital voice-text-image transfer, periodontal 
probing devices, and CAD/CAM. Proposed 
standards include Digital Image Capture 
in Dentistry, Infection Control in Dental 
Informatics, Digital Data Formats for 
Dentistry, Construction and Safety for Dental 
Informatics, Periodontal Probe Standard 
Interface, Computer Oral Health Record, and 
Specification for the Structure and Content of 
electronic medical record integration. 


7.5.2.3 Health Industry Business Communications Council Standards
The Health Industry Business Communica- 
tions Council (HIBCC) has developed the 
Health Industry Bar Code (HIBC) Standard, 
composed of two parts. The HIBC Supplier 
Labeling Standard describes the data struc- 
tures and bar code symbols for bar coding of 
health care products. The HIBCC Provider 
Applications Standard describes data structures
and bar code symbols for bar coding of
identification data in a health care provider setting.
HIBCC also issues and maintains Labeler
Identification Codes that identify individual 
manufacturers. The HIBCC administers the 
Health Industry Number System, which pro- 
vides a unique identifier number and location 
information for every health care facility and 
provider in the United States. The HIBCC also 
administers the Universal Product Number 
Repository, which identifies specific products 
and is recognized internationally. 
Link: > https://www.hibcc.org/ 


7.5.2.4 The Electronic Data Interchange for Administration, Commerce, and Transport Standard
The EDI for Administration, Commerce, and 
Transport (EDIFACT) is a set of interna- 
tional standards, projects, and guidelines for 
the electronic interchange of structured data 
related to trade in goods and services between 
independent computer-based information 
systems (National Council for Prescription 
Drug Programs Data Dictionary 1994). The 
standard includes application-level syntax 
rules, message design guidelines, syntax imple- 
mentation guidelines, data element dictionary, 
code list, composite data-elements dictionary, 
standard message dictionary, uniform rules of 
conduct for the interchange of trade data by 
transmission, and explanatory material. 

The basic EDIFACT (ISO 9735) syntax 
standard was formally adopted in September 
1987 and has undergone several updates. In 
addition to the common syntax, EDIFACT 
specifies standard messages (identified and 
structured sets of statements covering the 
requirements of specific transactions), seg- 
ments (the groupings of functionally related 
data elements), data elements (the smallest 
items in a message that can convey data), and 
code sets (lists of codes for data elements). 
The ANSI ASC X12 standard is similar in 
purpose to EDIFACT, and work is underway 
to coordinate and merge the two standards. 
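
The hierarchy of message, segment, and data element can be illustrated with a short parser. This sketch assumes the default EDIFACT service characters (`'` ends a segment, `+` separates data elements, `:` separates components, and `?` is the release character) and ignores the optional service string advice that can redefine them. The sample interchange is invented for illustration.

```python
def split_escaped(text: str, sep: str, release: str = "?") -> list:
    """Split on sep, leaving released (escaped) characters untouched."""
    parts, buf, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == release and i + 1 < len(text):
            buf.append(text[i:i + 2])  # keep the escape pair for later levels
            i += 2
        elif ch == sep:
            parts.append("".join(buf))
            buf = []
            i += 1
        else:
            buf.append(ch)
            i += 1
    parts.append("".join(buf))
    return parts


def unescape(text: str, release: str = "?") -> str:
    """Drop release characters, keeping the character each one protects."""
    out, i = [], 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def parse_message(raw: str) -> list:
    """Return (segment tag, data elements) pairs; each element is a component list."""
    segments = []
    for seg in split_escaped(raw, "'"):
        if not seg.strip():
            continue
        elements = split_escaped(seg.strip(), "+")
        data = [[unescape(c) for c in split_escaped(el, ":")] for el in elements[1:]]
        segments.append((elements[0], data))
    return segments


RAW = "UNH+1+ORDERS:D:96A:UN'FTX+AAI+++Rush order?: call on arrival'UNT+3+1'"
parsed = parse_message(RAW)
```

Note that the free-text element contains a released colon, so it is kept as one component rather than being split in two.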

EDIFACT is concerned not with the 
actual communications protocol but rather 
with the structuring of the data that are sent.
EDIFACT is independent of the machine,
media, system, and application and can be 
used with any communications protocol or 
with physical magnetic tape. 

Link: > https://www.edistaffing.com/resources/unedifact-standards/


7.6 Today’s Reality and Tomorrow’s Directions


In the current environment, the seamless 
exchange of data that can be used for any 
purpose remains a challenge. As we move
closer to semantic interoperability
(> https://en.wikipedia.org/wiki/Semantic_interoperability)
for healthcare and biomedical data,
true plug-and-play interoperability
remains elusive. The meaning of many
concepts and terms remains ambiguous, con- 
troversial, disputed, or poorly understood. 
Instead, the standards community has built 
a system of interchange that often requires 
mapping between terminologies and stan- 
dards to overcome these issues. 
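
A minimal sketch of such a mapping step follows. The crosswalk pairs free-text local terms with the ICD-9 codes shown for the corresponding concepts in Fig. 7.9; the local terms, the normalization rule, and the function name are assumptions made for illustration.

```python
# Hypothetical local terms mapped to ICD-9 codes listed in Fig. 7.9.
CROSSWALK = {
    "bacterial pneumonia": "482.9",
    "lobar pneumonia": "481",
    "staphylococcal pneumonia": "482.4",
}


def map_terms(local_terms: list) -> tuple:
    """Return (mapped, unmapped): codes for known terms, leftovers for review."""
    mapped, unmapped = {}, []
    for term in local_terms:
        key = " ".join(term.lower().split())  # normalize case and spacing
        if key in CROSSWALK:
            mapped[term] = CROSSWALK[key]
        else:
            unmapped.append(term)
    return mapped, unmapped


mapped, unmapped = map_terms(["Lobar  Pneumonia", "Pneumonia, streptococcal"])
```

The unmapped list is the practical cost of this approach: every term without an agreed target must be resolved by a person or left out of the exchange.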


7.6.1 The Interface: Standards and Systems


Historically, interchange standards evolved 
to support sharing of information over com- 
plex networks of distributed systems. This 
served a simple business model in which data 
was pushed from disparate repositories with 
inconsistent architectures and data structures. 
This permitted the exchange of data for both 
business needs and patient care. 

In today’s medical environment, there are 
several competing forces that place a burden 
on standards requirements. The traditional 
scope of data sources included business level 
information, principally for payment needs. 
These were developed utilizing coding meth- 
odologies and business architecture that did 
not rely upon inclusion of primary clinical 
data into the reimbursement decision. With 
the advent of statutory requirements that 
demand justification of insurance claims 
and reimbursement, additional data forms 


and formats became essential. This led to the 
development of claims attachment standards 
(see X12, above) that enabled more complex 
adjudication, comparative effectiveness, and 
accountable care. These standards will most 
certainly require structured, coded data rather 
than free-text and unstructured narrative. 

Complexity of data requirements is con- 
stantly growing to better support evidence- 
based medicine, clinical decision support, 
personalized medicine, and accountable care. 
Each of these has overlapping, but fundamen- 
tally unique data streams. Moreover, the data 
provided at the point of care, if unfiltered, is 
likely to overwhelm the clinical decision mak- 
ing process. Elements of clinical data, such 
as events in pediatric years, must not com- 
pete for the attention of the caregiver. To an 
extent, this was solved with specifications, 
such as FHIRcast, which were developed to 
provide context aware data to that process. 
There are growing demands for increasing the 
depth and breadth of data delivered to that 
clinical environment. In addition, these stan- 
dards must support the implicit policy deci- 
sions about the nature of this data. 

To date, clinical and preclinical informa- 
tion populates many of the alerts that clini- 
cians receive at the point of care. Typically, 
these range from information supporting 
complex decision trees to the selection of test- 
ing and interventions. This has been abetted 
by increasing knowledge of genomic data 
and implication for therapeutic decisions. 
Although this has had its greatest impact on 
the chemotherapy of cancer, the importance 
in many other clinical domains, for more com- 
mon conditions (including the treatment of 
diabetes, hypertension, and arthritis) is now 
recognized. Current architectural systems are 
ill-prepared to manage this process. Moreover, 
data formats for genomic and genetic infor- 
mation are disparate and often incompatible. 

Data privacy requirements, and the vari- 
ability of these requirements among legal 
entities, currently pose a different set of 
demands for information access technologies. 
For example, some states permit line-item 
exclusion of clinical data that is transferred 
between providers, based on the primacy of
the information and the role of the caregiver.
Other jurisdictions limit participation in
health information exchanges to those individuals
who agree to dissemination of
data from complete sources.

Existing data architectures enable a 
constant stream of data to be passed in an 
untended and unmonitored fashion. In evolv- 
ing models, data request and acknowledge- 
ment require a more complex query and 
response logic. In fact, most inquiries demand 
the validation of the provider system and 
privileges that are afforded to both the care- 
giver and the primary data repository. This 
places another component of interface design 
between the respective systems and necessi- 
tated the development of analogous provider 
indices and provider repositories. Concerns 
of both privacy and security must be met by 
these specifications. The system effectively 
asks not only who you are but why you want 
the information. 
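
The "who you are and why you want it" check can be sketched as follows. The provider index, roles, and permitted purposes are hypothetical placeholders; an actual exchange would consult provider repositories and policy rules that are far richer than two dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    requester_id: str
    role: str
    purpose: str      # why the data are wanted
    patient_id: str


# Hypothetical provider index and policy table.
PROVIDER_INDEX = {"dr-lee": "physician", "clerk-7": "billing"}
ALLOWED_PURPOSES = {"physician": {"treatment"}, "billing": {"payment"}}


def authorize(req: AccessRequest) -> tuple:
    """Validate the requester's identity first, then the stated purpose."""
    registered_role = PROVIDER_INDEX.get(req.requester_id)
    if registered_role is None or registered_role != req.role:
        return False, "unknown requester or role mismatch"
    if req.purpose not in ALLOWED_PURPOSES.get(registered_role, set()):
        return False, "purpose not permitted for role"
    return True, "granted"


granted = authorize(AccessRequest("dr-lee", "physician", "treatment", "12345"))
denied = authorize(AccessRequest("clerk-7", "billing", "treatment", "12345"))
```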

Much of this process overhead has been 
addressed by the design and architecture of 
health information exchanges. Often the busi- 
ness case supersedes the demand for clinical 
knowledge. At the same time, these exchanges 
are designed to behave in an entirely agnos- 
tic fashion, placing no demand on either the 
sender or recipient for data quality, other than 
source identification. In fact, the metadata, so 
responsible for the value of the information, 
is often capable of specifying only its origin 
and value sets. 

In today’s clinical environment, there has 
been very little attention paid to the cap- 
ture and validation of patient-initiated data. 
While so very critical to diagnosis and ongo- 
ing management, only scant standards exist 
for embedding patient derived information 
into the clinical record without intermediate 
human interaction and adjudication. When 
allowed by current systems, data provided 
by patients often lies within the audit trail, 
as a comment, rather than in the record as 
source data. Steps are sorely needed to define 
and attribute such data since it is so critical 
to many aspects of accountable care. Data 
obtained directly from patient sources is often 
attributed to “subjective” status, but it is no 
less objective than many clinician observations.
Perhaps justification for that lies in the
fact that this patient derived data is neither 
quantifiable nor codeable. This is supported 
by valid concerns about the patient’s health 
care literacy, or lack thereof, but is no less 
required than validated decision support for 
caregivers. 

Data obtained from clinical research 
and clinical data provided to inform clinical 
studies suffer from other concerns of failed 
interoperability. This is attributed, and right- 
fully so, to disparities of terminologies uti- 
lized for patient care and those used in clinical 
research. This is most dramatically highlighted 
in the terminology deployed by regulation for 
adverse event reporting (MedDRA; Medical
Dictionary for Regulatory Activities). Mapping
between the MedDRA dictionary and other
clinical terminologies (SNOMED-CT,
LOINC, ICD, and CPT) has not proven successful.
Moreover, many aspects pertaining
to study subject inquiries in clinical research
are often designed to elicit yes-no responses
(“Have you smoked in the last 5 years?”), rather
than data that many caregivers deem relevant.
Yet, today, it is more critical than ever to
enable clinical research to inform patient care
and care-derived data to enable clinical trials.
The business model of developing drugs for
billion-dollar markets (“blockbuster drugs”)
has proven itself to be unsustainable, as the 
cost of developing a new drug entity has now 
exceeded a billion dollars. From the clinical 
perspective, current estimates suggest that 
information from basic science research expe- 
riences delays of nearly 17 years before that 
knowledge can be incorporated into clinical 
care (Balas and Boren 2000). 

Semantic interoperability of clinical data 
inherently requires data reuse. It is not suffi- 
cient for systems to unambiguously exchange 
machine readable data. Data, once required 
only for third party payment, must be shared 
by other partners in the wellness and health 
care delivery ecosystems. Certainly, these 
data must be presented to research systems, 
as noted above. The data must also be avail- 
able for public health reporting and analy- 
sis, for comparative effectiveness research, 
for accountable care measurement and for 
enhancement of decision support systems
(including those for patients and their families).
The immediate beneficiaries have been
systems developed to support biosurveillance 
and pharmacovigilance. In practical terms, 
the business practices that govern our delivery 
systems (and the government policies and reg- 
ulations that enable them) must enable these 
data streams to both enhance care and control 
costs. 


7.6.2 Future Directions 


The new models for health care require a 
very different approach. The concept of a 
patient-centric EHR (» Chap. 19) requires 
the aggregation of any and all data created 
from, for, and about a patient into a single 
real or virtual record that provides access 
to the required data for effective care at the 
place and time of care. Health information 
exchange (HIE; see » Chap. 20) at regional, 
state, national, and potentially at a global 
level is now the goal. This goal can only be 
reached through the effective use of infor- 
mation technology, and that use can only be 
accomplished through the use of common 
global standards that are ubiquitously imple- 
mented across all sites of care. 

Three other future trends influence the 
need for new and different standards. The first 
is secondary use of data by multiple stake- 
holders. This requirement can only be met 
through semantic interoperability — a univer- 
sal ontology that covers all aspects of health, 
health care, clinical research, management, 
and evaluation. Standards for expressing 
what is to be exchanged and under what cir- 
cumstances are important as well as standards 
for the exchange of data. Included in multiple 
uses of data is reporting to other organiza- 
tions such as immunization and infectious 
disease reports to the Centers for Disease 
Control and Prevention (CDC), performance 
reports to the National Quality Forum (NQF) 
and audit reports to The Joint Commission 
(TJC). Such systems as described also enable 
population health studies and health surveil- 
lance for natural and bioterrorism outbreaks. 

The second trend area is the expansion of 
the types of data that are to be included in 


the EHR. The new emphasis on translational 
informatics will require new standards for the 
transport, inclusion into the EHR, and use of 
genetic information including genes, biomark- 
ers, and phenotypic data. Imaging, videos, 
waveforms, audio, and consumer-generated 
data will require new types of standards. 
Effective use of these new types of data as 
well as exponential increases in the volume of 
data will require standards for decision sup- 
port, standards for creating effective filters 
for presentation and data exchange, and new 
forms of presentation including visualization. 
New sources of data will include geospatial 
coding, health environmental data, social 
and community data, financial data, and cul- 
tural data. Queries and navigation of very 
large databases will require new standards. 
Establishing quality measures and trust will 
require new standards. Ensuring integrity and 
trust as data is shared and used by other than 
the source of data will require new standards 
addressing provenance and responsibility. 

The third trend area is the use of mobile 
devices, smart devices, and personal health 
devices. How, when and where such devices 
should be used is still being explored. 
Standards will be required for safe design, 
presentation, interface, integrity, and protec- 
tion from interference. 

True global interoperability will require 
a suite of standards starting with the plan- 
ning of systems, the definition and packaging 
of the data, collection of the data including 
usability standards, the exchange of data, 
the storage and use of data, and a wealth of 
applications that enable the EHR for bet- 
ter care. IT systems must turn data collected 
into information for use — a process that will 
require the use of knowledge in real time with 
data to produce information for patient care. 
Selecting the correct knowledge from litera- 
ture, clinical trials, and other forms of docu- 
mentation will require standards. Knowledge 
representation, indexing, and linkages will 
require standards. 

A major, and challenging, requirement to 
address these new types and uses of data will
be effective standards for privacy and security.
These standards must protect, but not restrict,
the use of data and access to that data for
determining and giving the best care possible.
The aggregation of data requires an error-free 
way for patient identification that will per- 
mit merger of data across disparate sources. 
Sharing data also requires standards for the 
de-identification of patient data. 

The effective management of all of these 
resources will require additional output from 
the standards communities. Standards for 
defining the required functionality of systems 
and ways for certifying adherence to required 
functionality are essential for connecting a
seamless network of heterogeneous EHRs 
from multiple vendors. Testing of standards, 
including IHE Connectathons, before widespread
dissemination and perhaps mandated
use of standards is critical to use and accep- 
tance. Standards for registries, standards for 
the rules that govern the sharing of research 
data, standards for patient consent, and stan- 
dards for identification of people, clinical 
trials, collaboration, and other similar areas 
are necessary. Profiles for use and applica- 
tion from the suite of standards are a neces- 
sity. Detailed implementation guides are key 
to use and implementation of standards. 
Tools that enable content population and use 
of standards are mandatory for easy use of 
standards. 

Standards for these new and evolving busi- 
ness and social needs must be supported by 
changes in standards development methodol- 
ogies and harmonization. Legacy systems are 
not easily discarded. Recommendations for 
complete replacement of existing standards 
are neither politically expedient nor fiscally 
supportable. Currently, there is increasing 
attention to new approaches to standards 
development that speed the creation process
and improve the quality of standards that
are developed. These evolving development 
platforms pay appropriate homage to existing 
standards and leverage previously developed 
models of development and analysis. 

The use of FHIR may provide a much-needed
solution: while relying upon historically
developed and refined interoperability specifications,
it hides the complexity of authoring
messages within the FHIR development process.
This leads to more usable specifications,
created in a dramatically abbreviated time
frame. Other approaches to standards development,
such as those focused upon services,
are rapidly evolving. These services-aware 
architectures are governed by strict develop- 
ment principles that help ensure both interop- 
erability and the ability of components to be 
reused. 

Increasingly large data stores (“big data”) 
have demanded some of these changes. These 
data have emanated from a highly diverse 
universe of scientific development. In fact, 
some of the new bio-analytic platforms for in 
vitro cellular research are generating data at a
rate that, by some estimates, is faster than
the data can be analyzed. Medical images, 
for which storage requirements are growing, 
must now be principally evaluated by human 
inspection. Newly evolving algorithms and 
the technologies to support them, initially 
developed for “star wars” type image analysis,
are replacing radiologists and pathologists
for the establishment of diagnoses. These 
machines have proven to be faster and more 
accurate than their human counterparts. In 
the very near future, such instrumentation 
will supplant medical scientists the same way 
that comparable technologies replaced human 
inspection in the estimation of cell differen- 
tials for blood counts. These new technologies 
are demanding the development of specifica- 
tions and the terminologies to support them. 

Tomorrow’s technologies will transition 
from early vision through prototyping to com- 
mercial products in a more compressed life 
cycle. A model for this process in biomedical 
science was established with the emergence of 
the Human Genome Project (see Chap. 11).
Within the next decade, routine genome deter- 
mination and archiving, as well as their appli- 
cation to disease management, will require 
greatly enhanced solutions for data manage- 
ment and analysis. Innovative strategies for 
recognizing and validating biomarkers will 
grow exponentially from the current stable of 
imaging and cell surface determinants. These 
data streams will require adaptation of existing
decision support systems and comparative
effectiveness paradigms. Lastly, scientific
evidence supporting diagnosis and management
within the field of behavioral medicine
will change the entire clinical spectrum
and approach to evaluation and care. As we
emerge from the dark ages of behavioral medicine,
we will certainly require new systems for
recognizing, diagnosing, naming, and intervening
on behalf of our patients.

In some sense, the development of stan- 
dards is just beginning. The immediate future 
years will be important to create effective 
organizations that include the right experts 
in the right setting to produce standards that 
are in themselves interoperable. That goal still 
remains in the future. 


(e) Suggested Reading 

Abbey, L. M., & Zimmerman, J. (Eds.). (1991). 
Dental informatics: Integrating technology into 
the dental environment. New York: Springer. 
This text demonstrates that the issues of stan- 
dards extend throughout the areas of applica- 
tion of biomedical informatics. The standards 
issues discussed in this chapter for clinical 
medicine are shown to be equally pertinent for 


dentistry. 
American Psychiatric Association Committee on 
Nomenclature and Statistics. (1994). 


Diagnostic and statistical manual of mental 
disorders (4th ed.). Washington, DC: The 
American Psychiatric Association. Argonaut 
Project - HL7 FHIR, https://argonautwiki. 
hl7.org/Main_Page. 

Benson, T. (2012). Principles of health interopera- 
bility HL7 and SNOMED (2nd ed.). London: 
Springer. This book presents a detail discus- 
sion of the HL7 version 2 messaging standard 
and a detail presentation of SNOMED-CT. 

Braunstein, M. (2018). Health informatics on 
FHIR: How HL7’s new API is transforming 
healthcare. Springer. 

Boone, K. W. (2012). The CDA™book. London: 
Springer. This book provides an excellent pre- 
sentation of the HL7 Clinical Document 
Architecture and related topics. 

Chute, C. G. (2000). Clinical classification and 
terminology: some history and current obser- 
vations. Journal of the American Medical 
Informatics Association, 7(3), 298-303. This 
article reviews the history and current status 
of controlled terminologies in health care. 

Cimino, J. J. (1998). Desiderata for controlled 
medical vocabularies in the twenty-first cen- 
tury. Methods of Information in Medicine, 


37(4-5), 394-403. This article enumerates a set 
of desirable characteristics for controlled ter- 
minologies in health care. 

Executive Office of the President; President’s 
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Technology (2010). Report to the President 
realizing the full potential of health information 
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Holland, C., & Shostak, J. (2016). Implementing 
CDISC using SAS (2nd ed.). SAS Institute. 
ICD-10-CM 2020: The Complete Official 

Codebook, AMA, 2019 

Institute of Medicine. (2003). Patient safety: 
Achieving a new standard for care. Washington, 
DC: National Academy Press. Discusses 
approaches to the standardization of collec- 
tion and reporting of patient data. 

Jamie, G. (2019). Starting SNOMED: A begin- 
ner’s guide to the Snomed CT healthcare termi- 
nology. SNOMED. 

Kahn, A. N., et al. (2006). Standardizing labora- 
tory data by mapping to LOINC. JAMIA, 
13(3), 353-355. 

NCPDP Standards-based Facilitator Model for 
PDMP White Paper, NCPDP, 2019 ISO 
Technical Committee 215 Healthcare 
Informatics, https://en.wikipedia.org/wiki/ 
ISO/TC_215 

New York Academy of Medicine. (1961). Standard 
nomenclature of diseases and operations (Sth 
ed.). New York: McGraw-Hill. 

Pianykh, O. (2011). Digital imaging and communi- 
cations in medicine (DICOM): A practical 
introduction and survival guide. Springer. 

Richesson, R. L., & Andrews, J. E. (2012). Clinical 
research informatics. London: Springer. This 
book includes a discussion of standards and 
applications from the clinical research per- 
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spectives. 
Stallings, W. (1987). Handbook of computer- 
communications standard. New York: 


Macmillan. This text provides excellent details 
on the Open Systems Interconnection model 
of the International Standards Organization. 
Stallings, W. (1997). Data and computer communi- 
cations. Englewood Cliffs: Prentice-Hall. This 


text provides details on communications 
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architecture and protocols and on local and 
wide area networks. 


Questions for Discussion


1. Who should be interested in 
interoperability and health data 
standards? 


2. What are the five possible approaches to 
accelerating the creation of standards? 

3. Define five health care standards, not
mentioned in the chapter, which might
also be needed.

4. What role should the government play in 
the creation of standards? 

5. At what level might a standard interfere 
with a vendor’s ability to produce a 
unique product? 

6. Define a hypothetical standard for one 
of the areas mentioned in the text for 
which no current standard exists. 
Include the conceptualization and 
discussion points. Specifically state the 
scope of the standard. 
Natural Language Processing for Health-Related Texts


Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• What are the potential uses for NLP in
  the biomedical, clinical, and health
  domains?
• What are the principal computational
  tasks of NLP for health-related texts?
• What are the different knowledge
  resources and linguistic representations
  that can support the development of
  NLP techniques?
• What are the near-future directions for
  health-related NLP research and applications?


8.1 Motivation

Language is the primary means of human 
communication, and it is no surprise that clinicians,
biomedical researchers, patients, and
health consumers alike rely on language 
extensively. Clinicians document the care of 
their patients in the electronic health record 
and use patient record notes to determine next 
steps of care. Biomedical scientists write and 
read articles to keep abreast of research progress
from their peers. Patients rely on online
platforms to learn from their peers and to exchange
informational and emotional support with them,
and health consumers view health content
online as a primary source of information
to manage their health and increase their
health literacy. In fact, there are continuously 
growing, unprecedented amounts of these 
biomedical and health-related texts available. 

The field of health and biomedical natu- 
ral language processing is concerned with 
the theories, principles, and computational 
approaches to building tools that exploit these 
textual data and support these stakeholders — 
patients and health consumers, clinicians, and 
biomedical researchers — in their information 
needs. 

While there is valuable information con- 
veyed in biomedical, clinical, and health texts, 
it is not in a format directly amenable to further
processing. These texts are difficult to process 
reliably because of the inherent characteristics
and variability of language. While in most
automated applications, structured, standard- 
ized data are readily available for process- 
ing, there is a significant amount of manual 
work currently devoted to mapping textual 
information to coded representations in bio- 
medicine and health care: Professional coders 
assign billing codes corresponding to diag- 
noses and procedures to hospital admissions 
based on discharge summaries and admission 
information; indexers at the National Library 
of Medicine assign MeSH (Medical Subject 
Headings) terms to represent the main top- 
ics of scientific articles; and database curators 
extract genomic and phenotypic information 
on organisms from the literature. Because of 
the overwhelmingly large amount of textual 
information in health domains, manual work 
is costly, time-consuming, and impossible to 
keep up to date. One aim of Natural Language 
Processing (NLP) is to facilitate these tasks by 
enabling use of automated methods with high 
validity and reliability. 

Another aim of NLP is to help advance 
many of the fundamental aims of biomedical 
informatics, which are the discovery and vali- 
dation of scientific knowledge, improvement 
in the quality and cost of health care, and 
support to patients and health consumers. 
The considerable amounts of texts amassed 
through clinical care, published in the scien- 
tific biomedical literature, and discussed by 
patients and caregivers online can help acquire 
new knowledge and promote discovery of new 
phenomena. For instance, the information in 
patient notes, while not originally entered for 
discovery purposes, but rather for the care of 
individual patients, can be processed, aggre- 
gated and mined to discover patterns across 
patients. This process of leveraging observational
health data has produced many success
stories when applied to health data that
are highly structured (OHDSIPNAS), and 
there is promise that incorporating learned 
representations of language can help as well 
(Ghassemi et al. 2014). 

For clinicians interacting with an electronic
health record and treating a particular
patient, NLP can support several points in a
clinician workflow: when reviewing the patient
chart, NLP can be leveraged to aggregate and
consolidate information spread across many
notes and reports, and to highlight relevant
facts about the patient. During the decision-making
and actual care phase, information
extracted through NLP from the notes can
contribute to the decision support systems in
the EHR as shown in Fig. 8.1 (Demner-Fushman
et al. 2009). Finally, when health
care professionals are documenting patient
information, higher quality notes can be generated
with the help of NLP-based methods.

Fig. 8.1 Active (initiated by the application) and passive (initiated by the users) decision support applications to which NLP tools have contributed and have potential to contribute in the future

For quality and administrative purposes, 
NLP can signal potential errors, conflicting 
information, or missing documentation in the 
chart. For public health administrators, EHR 
patient information can be monitored for syn- 
dromic surveillance through the analysis of 
ambulatory notes or chief complaints in the 
emergency room (Hripcsak et al. 2009). 

The expectations and requirements for 
NLP support evolve and grow due to suc- 
cesses and demonstrated potential, such as 
a tool to identify tests for EGFR (epidermal 
growth factor receptor) mutations deployed
in VA EHR clinical notes (Lynch et al. 2019)
or the Medical Text Indexer that supports indexers
in assigning MeSH terms at the National
Library of Medicine (Mork et al. 2017). New
requirements arise in the domain of consumer
language, due to the changes in consumers’
behavior, which, in turn, are changing the
dynamics of healthcare interaction (Chap.
11). NLP is crucial in enabling consumers to
get to the right information, whether through 
access to clinical information or to informa- 
tion generated by their peers. NLP can sup- 
port health consumers and patients looking 
for information about a particular disease or 
treatment, by providing better access to rel- 
evant information, targeted to their informa- 
tion needs, and to their health literacy levels 
through the analysis of the topics conveyed in 
a document as well as the vocabulary used in 
the document. 

When approaching the above health- 
related tasks or use cases for NLP, it is 
beneficial to align the human tasks and end- 
goals with the tasks that NLP has to per- 
form. For example, giving a clinician a full 
and concise description of a patient’s course
relevant to the patient’s current state is the 
task of summarization. The high-level NLP 
tasks or applications described later in this 
chapter include, but are not limited to, ques- 
tion answering (QA), summarization, text 
labeling and text generation. These tasks are 
bringing the field closer to the ultimate goal 
of text understanding, which encompasses 
not only direct understanding of textual 
information, but also the author’s attitude, 
such as sentiment polarity and modality at 
the document level, and temporal reasoning 
at the document and document-sequence levels
(Chap. 23).

Across all these use cases of health NLP 
(hNLP), the techniques of natural language 
processing provide a means to bridge the 
gap between unstructured text and data by 
transforming the text to data in a computable 
format, allowing humans to interact using 
familiar natural language, while enabling 
computer applications to process data effec- 
tively and to provide users with easy access 
and synthesis of the raw textual information. 

This chapter is organized with two types 
of readers in mind: students and researchers 
looking for a broad introduction to health 
NLP prior to delving into this active field 
of research, and informatics practitioners 
looking to use hNLP technology for specific 
tasks or types of text. Section 8.2 presents
a more in-depth description of hNLP applications
and emphasizes the critical role that
the context in which these applications are
deployed plays when developing hNLP solutions.
In Sect. 8.3 we establish the basic
computational tasks involved in most hNLP
applications. Section 8.4 is concerned with
the different linguistic knowledge resources
and types of linguistic representations that
can enable and facilitate these basic NLP
tasks. Section 8.5 provides further practical
considerations for users of hNLP technology,
while Sect. 8.6 provides further
research considerations for hNLP research,
including evaluation methodology. Finally,
Section 8.7 briefly outlines future directions
of hNLP.
8.2 NLP Applications and Their Context


In this section, we describe specific hNLP
applications that have proven, and continue to
prove, useful to biomedical, clinical, and
health stakeholders in their information needs
(Sect. 8.2.1). We then abstract away from
them and emphasize the essential role played by
the context in which these applications exist
(Sect. 8.2.2).


8.2.1 NLP Applications 


Natural language processing has a wide range
of potential applications in the biomedical
and health domains. The following are important
applications of NLP technology for biomedicine
and health:

• Information extraction locates and structures
specific information in text, sometimes
by performing a complete linguistic
analysis of the text, but more frequently, 
by looking for patterns in the text. This is 
the most common application in biomedi- 
cine. It is also one of the earliest: In the 
1970s, the Linguistic String Project (LSP) 
under the leadership of Dr. Naomi Sager, 
a pioneer in NLP, developed a comprehen- 
sive computer grammar and parser of 
English (Grishman et al. 1973; Sager 
1981), and also began work in NLP of 
clinical reports (Sager 1972, 1978; Sager 
et al. 1987) that continued into the 1990s. 
Other clinical and biomedical NLP systems
followed. For example, MedLEE (Medical
Language Extraction and Encoding
System) has been used successfully, primarily
for clinical information extraction,
but has also been adapted to literature processing
(Friedman et al. 1994; Hripcsak et al. 
1995; Friedman 2000). Other systems that 
have been successfully used for extraction 
of information from clinical notes and the 
literature include MetaMap (Aronson and 
Lang 2010), MetaMap Lite (Demner- 
Fushman et al. 2017) and SemRep 
(Kilicoglu et al. 2012). These three systems,
developed at the National Library of
Medicine, rely on the Unified Medical
Language System (UMLS) (Lindberg et al.
1993a) described in Sect. 8.4. Other systems
that rely on the UMLS include
cTAKES (clinical Text Analysis and 
Knowledge Extraction System) (Savova 
et al. 2010) and CLAMP (Clinical 
Language Annotation, Modeling, and 
Processing) (Soysal et al. 2017). cTAKES
combines machine learning and rule-based
methods to perform clinical information
extraction tasks, whereas CLAMP provides
a graphical user interface to build customized
NLP pipelines for clinical
applications. 

Once textual information is extracted 
and structured, it can be used for a number 
of different tasks. In inferring social deter- 
minants of health, for instance, one can 
extract social risk factors from clinical 
notes (Conway et al. 2019) or from social 
media (De Choudhury et al. 2013). The 
extracted data, when collected across 
many patients, can help understand the 
prevalence as well as the progression of a 
particular disease at the community level 
(Eichstaedt et al. 2015). Notably, the above 
system for detection of social determi- 
nants of health in clinical notes continues 
the lineage of clinical NLP systems at the 
University of Utah, which started with 
SPRUS (which evolved into Symtext and 
then MPLUS, ONYX, and TOPAZ) 
(Haug et al. 1990, 1994; Christensen et al. 
2002; Dublin et al. 2013; Ye et al. 2014). In 
biology, biomolecular interactions 
extracted from one article or from differ- 
ent articles can be merged to construct 
biomolecular pathways. Figure 8.2
shows a pathway in the form of a graph,
which was created by extracting interactions
from one article published in the
journal Cell (Maroto et al. 1997). The
DARPA Big Mechanism program, which
aimed to assemble automatically the
causal fragments found in individual scientific
papers, such as in Fig. 8.2, into
pathways, demonstrated the successes and
the remaining challenges in the first step of
the process: machine reading of the literature
(Cohen 2015).

Fig. 8.2 A graph showing interactions that were
extracted from an article. A vertex represents a gene or
protein, and an edge represents the interaction. The
arrow represents the direction of the interaction so that
the agent is represented by the outgoing end of the
arrow and the target by the incoming end
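As a rough illustration of how interactions extracted from several articles might be merged into one pathway graph, the following minimal sketch stores each interaction as a directed edge from agent to target and records which sources support it. The function name, the source identifiers, and the gene names are hypothetical placeholders; they are not taken from the article or the program cited above.

```python
from collections import defaultdict


def merge_interactions(extractions):
    """Merge (source_id, agent, target) triples into one directed graph.

    Returns two plain dicts:
      graph:    agent -> set of targets (outgoing edges)
      evidence: (agent, target) -> set of source ids supporting that edge
    """
    graph = defaultdict(set)
    evidence = defaultdict(set)
    for source_id, agent, target in extractions:
        graph[agent].add(target)
        evidence[(agent, target)].add(source_id)
    return dict(graph), dict(evidence)


# Hypothetical extractions used only for illustration.
extractions = [
    ("article-1", "GeneA", "GeneB"),
    ("article-1", "GeneB", "GeneC"),
    ("article-2", "GeneA", "GeneB"),  # same edge reported by a second article
    ("article-2", "GeneC", "GeneD"),
]
graph, evidence = merge_interactions(extractions)
```

Keeping the supporting sources per edge makes it possible to see which interactions are corroborated by more than one article, which a real system would need before trusting a merged pathway.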

The techniques for information extrac- 
tion may be limited to the identification of 
names of people or places, dates, and 
numerical expressions, or to certain types 
of terms in text (e.g. mentions of medica- 
tions or proteins), which can then be 
mapped to canonical or standardized 
forms. These tasks are referred to as named-entity recognition
and named-entity normalization,
respectively. More sophisticated tech- 
niques identify and represent the modifiers 
attached to a named entity. Such advanced 
methods are necessary for reliable retrieval 
of information because the correct inter- 
pretation of a biomedical term typically 
depends on its relation with other terms in 
a given sentence. For example, the term 
fever has different interpretations in no 
fever, high fever, fever lasted 2 days, and 
check for fever. Defining the types of modifiers
of interest (e.g. no is a negation modifier,
while lasted 2 days is a temporal
modifier), as well as techniques to recog- 
nize them in text, is an active topic of 
research that was in part stimulated by 
public release of tools and algorithms, 
such as NegEx (Chapman et al. 2001). 
Identifying relations among named enti- 
ties is another important information 
extraction method. For example, when 
extracting adverse events associated with a 
medication, the sentences “the patient 
developed a rash from amoxicillin” and
“the patient came in with a rash and was 
given benadryl” must be distinguished. In 
both sentences, there is a relation between 
a rash and a drug, but the first sentence 
conveys a potential adverse drug event 
whereas the second sentence conveys a 
treatment for an adverse event. As entities 
are extracted within one document or 
across documents, one important step 
consists of reference resolution, that is, 
recognizing that two mentions in two dif- 
ferent textual locations refer to the same 
entity (Kilicoglu and Demner-Fushman 
2016). In some cases, resolving the refer- 
ences is very challenging. For instance, 
mentions of stroke in two different notes 
associated with the same patient can refer 
to the same stroke or two different strokes; 
additional contextual information and 
domain knowledge are often needed to
resolve this problem. 
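The ideas of named-entity recognition, normalization, and negation modifiers can be sketched with a small dictionary lookup. This is only a toy illustration: the lexicon, the trigger list, and the three-token window are assumptions made for the example, and the negation check is a simplified heuristic loosely inspired by trigger-based approaches, not the published NegEx algorithm. It does not handle overlapping mentions or scope termination.

```python
import re

# Hypothetical lexicon mapping surface forms to canonical concept names.
LEXICON = {
    "fever": "Fever",
    "pyrexia": "Fever",
    "rash": "Rash",
    "high blood pressure": "Hypertension",
    "hypertension": "Hypertension",
}
# A few illustrative negation triggers; real trigger lists are much longer.
NEGATION_TRIGGERS = ("no", "denies", "without")
WINDOW = 3  # number of tokens before a mention inspected for a trigger


def extract_entities(sentence):
    """Return (surface form, canonical concept, negated?) in text order."""
    text = sentence.lower()
    results = []
    # Try longer surface forms first so multi-word terms are listed too.
    for surface in sorted(LEXICON, key=len, reverse=True):
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, text):
            preceding = re.findall(r"[a-z]+", text[:match.start()])[-WINDOW:]
            negated = any(token in NEGATION_TRIGGERS for token in preceding)
            results.append((match.start(), surface, LEXICON[surface], negated))
    return [entry[1:] for entry in sorted(results)]


mentions = extract_entities("Patient denies fever but has a rash.")
```

A fixed window is deliberately crude: a trigger such as no can wrongly negate a later, unrelated mention, which is one reason modifier detection remains an active research topic.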

• Information retrieval (IR) and NLP overlap
in some of the methods that are used.
IR is discussed in Chap. 23, but here we
discuss the basic differences between IR 
and NLP. IR methods are generally geared 
to help users access documents in large 
collections, such as electronic health 
records, the scientific literature, or the 
Web. This is a crucial application in bio- 
medicine and health, due to the explosion 
of information available in electronic 
form. The essential goal of information 
retrieval is to match a user’s query against 
a document collection (usually using an 
index) and return a ranked list of relevant 
documents or the best matching snippets 
of text. The most basic form of indexing 
isolates simple words and terms, and there- 
fore, uses minimal linguistic knowledge. 
More advanced approaches use NLP- 
based methods similar to those employed 
in information extraction, identifying 
complex named entities and determining 
their relationships in order to improve the 
accuracy of retrieval. For instance, one 
can search for hypertension and have the 
search operate at the concept level, return- 
ing documents that mention the phrase 
high blood pressure in addition to the ones 
mentioning hypertension only. In addition,
one can search for hypertension in a spe- 
cific context, such as in the context of 
treatment or context of etiology. 
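Concept-level search, as in the hypertension example above, can be sketched with a small inverted index keyed by concept rather than by word. The synonym table, the concept identifiers, and the sample documents below are hypothetical, and plain substring counting stands in for the far more careful term mapping a real system would use.

```python
from collections import Counter, defaultdict

# Hypothetical synonym table: surface phrase -> concept identifier.
SYNONYMS = {
    "hypertension": "C_HYPERTENSION",
    "high blood pressure": "C_HYPERTENSION",
    "fever": "C_FEVER",
}


def concepts_in(text):
    """Count concept mentions in a text using simple substring matching."""
    lowered = text.lower()
    counts = Counter()
    for phrase, concept in SYNONYMS.items():
        counts[concept] += lowered.count(phrase)
    return counts


def build_index(documents):
    """Build an inverted index: concept -> {doc_id: mention count}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for concept, n in concepts_in(text).items():
            if n:
                index[concept][doc_id] = n
    return index


def search(index, query):
    """Return doc ids ranked by how often they mention the query's concepts."""
    scores = Counter()
    for concept, n in concepts_in(query).items():
        if n:
            for doc_id, count in index.get(concept, {}).items():
                scores[doc_id] += count
    return [doc_id for doc_id, _ in scores.most_common()]


documents = {
    "d1": "History of high blood pressure. High blood pressure is treated with diet.",
    "d2": "Hypertension diagnosed in 2015.",
    "d3": "Presents with fever.",
}
index = build_index(documents)
ranked = search(index, "hypertension")
```

Here the query hypertension retrieves the first document even though that document never uses the word, because both phrases map to the same concept.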

• Question answering (QA) involves a process
whereby a user submits a natural language
question, which is then automatically
answered by a QA system. The availability 
of information in journal articles and on 
the Web makes this type of application 
increasingly important as health care con- 
sumers, health care professionals, and bio- 
medical researchers frequently search the 
Web to obtain information about a disease, 
a medication, or a medical procedure. A 
QA system can be very useful for obtaining 
the answers to clinical questions, like “In 
children with an acute febrile illness, what 
is the efficacy of single-medication therapy 
with acetaminophen or ibuprofen in reduc- 
ing fever?” (Demner-Fushman and Lin 
2007). QA systems provide additional 
functionalities to an IR system. In an IR 
system, the user has to translate a question 
into a list of keywords and generate a 
query, but this step is carried out automat- 
ically by a QA system. Furthermore, a QA 
system presents the user with an actual 
answer (often one or several passages 
extracted from the source documents), 
rather than a list of relevant source docu- 
ments. QA has focused for the most part 
on the literature (Demner-Fushman and 
Lin 2007; Cao et al. 2011); however,
research on answering clinicians’ questions
asked of the EHR (Roberts and Patra 2018)
and answering consumer health questions 
(Demner-Fushman et al. 2020; Ben 
Abacha et al. 2019) is burgeoning. 

• Text summarization takes one or several
documents as input and produces a single, 
coherent text, which synthesizes the 
main points of the input documents. 
Summarization helps users make sense of 
a large amount of data, by identifying and 
presenting the salient points in texts auto- 
matically. Summarization can be generic 
or query-focused (i.e. taking a particular 
information need into account when 
selecting important content of input docu- 
ments). Query-focused summarization can 
be viewed as a post-processing step for IR and
QA: the relevant passages corresponding 
to an input question are further processed 
into a single, coherent text. Several steps 
are involved in the summarization process: 
content selection (identifying salient pieces 
of information in the input document(s)), 
content organization (identifying redun- 
dancy and contradictions among the 
selected pieces of information, and order- 
ing them so the resulting summary is 
coherent), and content re-generation (pro- 
ducing a text output from the organized 
pieces of information). Text summariza- 
tion in the biomedical domain has focused 
on the literature (Elhadad et al. 2005; 
Zhang et al. 2011), with some forays into 
summarization of clinical text and Web 
resources (Pivovarov and Elhadad 2015; 
Mane et al. 2015). 
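The three summarization steps described above (content selection, content organization, and content re-generation) can be illustrated with a very small extractive summarizer. This is a sketch under strong assumptions: sentences are scored by average word frequency, redundancy is measured with Jaccard overlap, and re-generation simply joins the selected sentences. The stopword list, the threshold, and the function names are illustrative choices, not methods from the works cited.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "in", "and", "is", "was", "to", "with", "for", "on"}


def tokenize(sentence):
    """Lowercase content words of a sentence, with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def jaccard(a, b):
    """Jaccard overlap of two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def summarize(text, max_sentences=2, redundancy_threshold=0.5):
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in tokenize(s))

    # Content selection: score each sentence by the average frequency of its words.
    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)

    # Content organization: drop near-duplicates, then restore source order.
    chosen = []
    for i in ranked:
        words = set(tokenize(sentences[i]))
        if all(jaccard(words, set(tokenize(sentences[j]))) <= redundancy_threshold for j in chosen):
            chosen.append(i)
        if len(chosen) == max_sentences:
            break

    # Content re-generation: here, simply concatenate the selected sentences.
    return " ".join(sentences[i] for i in sorted(chosen))


text = "Aspirin reduces fever. Aspirin reduces fever in adults. The clinic opens on Monday."
summary = summarize(text)
```

In this example the second sentence is skipped as redundant with the first, so the summary keeps the first and third sentences in their original order.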


Other tasks: 
Text classification (labeling) involves 
categorizing text into known types. 


Sentences in scientific articles can be clas- 
sified as key sentences indicative of the 
article’s content (Ruch et al. 2007). 
Sections of a clinical note can be labeled, 
as, e.g., social or family history (Denny 
et al. 2008). At the document level, texts 
can be classified into different genres, etc. 
Related to classification is clustering, 
which involves grouping texts based on 
some intrinsic similarity without knowing 
a priori what these similar properties are. 
Text generation formulates natural lan- 
guage sentences from a given source of 
information, which is not directly readable 
by humans. Generation can be used to cre- 
ate a text from a structured database, such 
as summarizing trends and patterns in 
laboratory data (Hüske-Kraus 2003). 
Machine translation converts text in 
one language (e.g. English) into another 
(e.g. Spanish). These applications are 
important in multilingual environments in 
which human translation is too expensive 
or time consuming (Deléger et al. 2009a). 
Text readability assessment and simplification
is becoming relevant to the health
domain, as patients and health consumers
access more and more medical information
on the Web, but need support because
their health literacy levels do not match
those assumed by the documents they read
(Elhadad 2006; Keselman et al. 2007).


Finally, sentiment analysis and emotion detection
belong to the general task of automated
content analysis (Zunic et al. 2020).


8.2.2 Context for NLP Applications 


Understanding how the context and intent of the
speaker affect meaning (pragmatics) is crucial
to performing many NLP tasks correctly.
Although the context of health-related NLP 
applications is broad and varied, it is bounded 
by specific tasks and environments that can 
be relatively easily enumerated and therefore 
taken into account during processing. It is 
defined by those who produce the text, by the 
purposes for which the text is produced, and 
by the intended readers. For example, clini- 
cians can write a patient status report for their 
colleagues, or a simplified summary for the 
patient. Clinical researchers can describe the 
arms of a clinical trial in a scientific publica- 
tion or describe the inclusion criteria of the 
same trial in a simpler language for recruit- 
ment and patient education purposes. The 
style of communication is often determined by 
cultural conventions and ecosystems in which 
these texts are written and read, and inference 
is needed for correct interpretation and gen- 
eration of language. Powerful context models 
are missing in the open domain (Bunt 2017) 
but can be approximated through the seman- 
tic lexicon and rules about the discourse of 
a text in the biomedical domain. Biomedical 
sublanguages are easier to interpret than 
general languages because they exhibit more 
restrictive semantic patterns that can be repre- 
sented more easily (Harris et al. 1989; Harris 
1991; Sager et al. 1987). Sublanguages tend to 
have a relatively small number of well-defined 
semantic types (e.g. medication, gene, disease, 
body part, or organism) and a small number 
of semantic patterns (e.g. medication-treats- 
disease, gene-interacts with-gene). 

The fact that texts belong to a particular 
domain, be it clinical, biological or related 
to health-consumers, allows us to capture
domain-specific characteristics in the lexi- 
con, the grammar, and the discourse struc- 
ture. Thus, the more specific the domain of a 
text, the more knowledge can be encoded to 
help its processing, but then the NLP system 
would be extremely limited and specialized. 
For instance, in the domain of online patient 
discourse, patients discussing breast cancer 
among their peers online rely on a very dif- 
ferent set of terms than caregivers of children 
on the autism spectrum. One can develop a 
lexicon for each subdomain, online breast 
cancer patients and online autism caregivers. 
But maintaining separate lexicons can be inef- 
ficient and error prone, since there can be a 
significant amount of overlap among terms 
across subdomains. Conversely, if a single 
lexicon is developed for all subdomains, ambiguity
can increase as terms can have different
meanings in different subdomains. For exam- 
ple, in the emergency medicine domain shock 
will more likely refer to a procedure used for 
resuscitating a patient, or to a critical condi- 
tion brought about by a drop in blood flow, 
whereas in psychiatry notes it will more likely 
denote an emotional condition or occasionally 
electric shock therapy. Deciding on whether 
to model a domain as a whole or to focus on 
its subdomains independently of each other is 
a tradeoff. Careful determination of the use 
cases of a system can help determine the best 
choice for the system. 
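
The tradeoff between one shared lexicon and separate subdomain lexicons can be made concrete with a small sketch. The Python below is illustrative only: the `SENSES` table, its sense labels, and the `candidate_senses` helper are assumptions introduced here, not an existing resource. It keeps a single lexicon but indexes senses by subdomain, so that shock resolves differently in emergency medicine and in psychiatry notes.

```python
# Hypothetical sense inventory: one shared lexicon, senses keyed by subdomain.
SENSES = {
    "shock": {
        "emergency medicine": ["resuscitation procedure", "critical drop in blood flow"],
        "psychiatry": ["emotional condition", "electric shock therapy"],
    },
}


def candidate_senses(term, subdomain=None):
    """Return the senses of `term`, restricted to `subdomain` when it is known."""
    by_domain = SENSES.get(term.lower(), {})
    if subdomain in by_domain:
        return list(by_domain[subdomain])
    # Without subdomain context every sense stays a candidate (more ambiguity).
    return [sense for senses in by_domain.values() for sense in senses]


print(candidate_senses("shock", "psychiatry"))
print(candidate_senses("shock"))
```

When the subdomain is unknown, all four senses are returned, which is the increase in ambiguity that a single undifferentiated lexicon brings.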


8.3 Basic Computational NLP Tasks


The different applications we reviewed in the 
previous sections have in common the need to 
process text, but they differ widely in the types 
of input and output they produce. Tasked 
with an application that entails text process- 
ing, one can think in these terms: is the goal to 
cluster, label, extract, generate or a combina- 
tion of these? And what is my input: a corpus, 
a document, a fragment of text, or a sequence 
of words? The step of casting an application 
into one or a set of NLP tasks is important 
in determining the choice and design of the 
approaches. An evaluation of the application
then helps determine which design was the 
more appropriate. There is not always a single 
solution to the application casting step. For 
instance, take the task of identifying within a 
large pool of patient records the notes with 
documented heart attack in the past year. 
We can cast this application as a note label- 
ing task: given a note, label it according to a 
binary label: documentation present or absent. 
In this case, the application will retrieve notes 
of patients that are labeled as a whole as likely 
to contain documentation of heart attack 
within the past year or not. To achieve the 
task, we will then need to compile a training 
set of notes and their gold-standard labels 
and train a document-level classifier. An alter- 
native approach is to cast the same applica- 
tion as an information extraction or template 
filling task. There, the template would contain 
for instance the location in the note that men- 
tions the condition heart attack, along with 
the temporal expression documenting the 
time at which the heart attacks occurred in 
the patient. In this case, all heart attacks and 
temporal aspects will be extracted in our pool 
of patient notes, and only the ones that sat- 
isfy our temporal constraint will be kept. Both 
tasks will achieve the same goal (identify the 
patients who have a documented heart attack 
in the past year), but will do so in different 
ways and with slightly different outputs: the 
labeling approach will not be able to provide 
data provenance, but might be more feasible 
because it is easier to label a document as a 
whole than it is to annotate templates and 
extract information as shown in Fig. 8.3.
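
The template filling variant of this application can be sketched as follows. This is a minimal illustration under stated assumptions: the `ConditionTemplate` class and the `patients_with_recent_event` function are hypothetical names, the temporal expression is assumed to have already been resolved to a calendar date by an upstream component, and the second template is invented for contrast. The first template loosely follows the values shown in Fig. 8.3.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ConditionTemplate:
    patient_id: int
    doc_id: int
    mention: str      # text span that mentions the condition (data provenance)
    event_date: date  # temporal expression, already resolved to a date


def patients_with_recent_event(templates, as_of, window_days=365):
    """Keep only patients whose extracted event falls inside the time window."""
    cutoff = as_of - timedelta(days=window_days)
    return sorted({t.patient_id for t in templates if cutoff <= t.event_date <= as_of})


templates = [
    ConditionTemplate(1, 5, "acute MI", date(2020, 2, 10)),
    ConditionTemplate(2, 9, "myocardial infarction", date(2015, 6, 1)),  # invented
]
print(patients_with_recent_event(templates, as_of=date(2020, 2, 20)))
```

Because each template keeps the document identifier and the mention, the result can be traced back to its source text, which a document-level yes/no label cannot offer.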
In this section, we review the basic tasks 
that are most common when processing 
health-related texts and in biomedical and 
health applications. These tasks do not con- 
stitute a pipeline, but rather a portfolio of 
the tasks one can rely on depending on their 
application and goals. They each take specific 
inputs (sometimes a collection of documents 
as in topic modeling, or a sequence of words 
as in sequence labeling) and produce specific 
outputs (groups of words in topic modeling, 
one label in document labeling, triples in rela- 
tion extraction, or complex frames in event 
extraction, as shown in Fig. 8.3).


Fig. 8.3 Two different approaches to identifying patients who suffered a heart attack in the past year:
the document labeling/classification approach and the template filling/information extraction approach.
[Figure panels: for the question "Which patients had a heart attack in the past year?", document
classification relies on a training set annotated yes/no and a classifier that answers "Did this patient
have a heart attack in the past year?" (yes, Pt ID 1). Information extraction relies on an annotated
training set and on lexical, syntactic, and semantic processing with machine learning. From the note
"He was released from the hospital four days ago after having suffered an acute MI of the left
circumflex one week prior" (Pt ID 1, docID 5, chart date 2/20/2020) it extracts the mention
"acute MI, one week prior", resolved to acute MI on 2/10/2020.]


8.3.1 Topic Modeling 


Topic modeling takes as input a collection of 
texts and identifies the topics discussed in the 
documents that comprise the collection. It 
is unsupervised in nature; that is, it does not 
require any guidance, document labels, or 
dictionary to identify topics. The discovered 
topics are expressed as clusters of words. As 
such, topic modeling is a particularly useful 
task for exploratory data analysis. A variety 
of methods have been proposed for the task 
of topic modeling, including latent semantic 
analysis (Deerwester et al. 1990), probabilis- 
tic latent semantic analysis (Hofmann 1999), 
and Latent Dirichlet Allocation (LDA) (Blei 
et al. 2003). All topic modeling methods take 
a simple representation of the documents, 
namely a “bag of words” approach. That is, 
the text is split into words using a tokenizer, 
and the order between the words is not taken 
into account during processing. 
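
A bag-of-words representation can be sketched in a few lines. The tokenizer below is a deliberately naive assumption (lowercased runs of letters and digits), and `bag_of_words` is a name chosen for this illustration; real systems would use a tokenizer suited to the domain.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Map a text to word counts; word order is discarded."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tokens)


a = bag_of_words("pain in the back and in the shoulder")
b = bag_of_words("in the shoulder and in the back pain")
print(a == b)  # same words, different order: identical representation
```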


LDA belongs to the family of probabilis- 
tic generative models, and several extensions 
to LDA, also in the family of probabilistic 
generative models, were proposed in the lit- 
erature and can facilitate exploratory analysis 
of large corpora further (Blei 2012). The generative
process of LDA and its variants is what
enables topic modeling to go further than a 
simple exploratory device within a corpus. In 
addition to discovering the clusters of words 
that determine the topics of a corpus, it is 
able to infer for any new document the topics 
which best represent it. 

For instance, Fig. 8.4 depicts the result
of applying LDA for 20 topics on a corpus 
of 1700 documents. Each document in the 
corpus contains the title and abstract of a 
scientific publication from PubMed Central. 
Each topic consists of a ranked list of words, 
where the ten most likely words are presented. 
Through examining the topics, one can get a 
quick overview of the content of the corpus. 
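
To give a feel for how topics can be estimated, here is a compact sketch of LDA fitted with collapsed Gibbs sampling. Gibbs sampling is one common inference scheme, chosen here for brevity; it is not necessarily the method used to produce Fig. 8.4. The function names, the hyperparameter values (`alpha`, `beta`), and the toy corpus are assumptions made for illustration.

```python
import random


def fit_lda(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """docs: list of token lists. Returns (doc_topic_counts, topic_word_counts)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [dict() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []

    def update(d, w, k, delta):
        doc_topic[d][k] += delta
        topic_word[k][w] = topic_word[k].get(w, 0) + delta
        topic_total[k] += delta

    for d, doc in enumerate(docs):  # random initial topic for every token
        assignments.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, assignments[d]):
            update(d, w, k, +1)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, w, assignments[d][i], -1)  # remove current assignment
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k].get(w, 0) + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                k_new = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k_new
                update(d, w, k_new, +1)
    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """Most frequent words of each topic (words with zero count are skipped)."""
    return [
        [w for w in sorted(counts, key=counts.get, reverse=True) if counts[w] > 0][:n]
        for counts in topic_word
    ]


docs = [
    ["obesity", "diet", "weight"],
    ["diet", "weight", "obesity"],
    ["influenza", "surveillance", "emergency"],
    ["surveillance", "influenza", "detection"],
]
doc_topic, topic_word = fit_lda(docs, n_topics=2, n_iter=20)
print(top_words(topic_word))
```

The per-document topic counts in `doc_topic` give the mixture of topics for each document, and the ranked word lists in `topic_word` correspond to the topic descriptions a reader would inspect.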


Fig. 8.4 Topic modeling is an unsupervised basic computational task in hNLP. Given a corpus, two types
of output are generated: topics defined as distributions over words in the corpus, and an inference
mechanism to assign topics to any new document. [Figure: ranked word lists for the topics; one of the
displayed lists includes surveillance, detection, influenza, respiratory, department, infections,
emergency, chief, public.]


In our example, we see that our documents
are a mixture of articles in the clinical,
public health, informatics, and biological
domains. Themes that emerge from the analysis
include obesity (topic 1), microbiology
(topic 19), and electronic health records (topic
4). Because the topic modeling task is unsupervised,
the discovered topics do not always
depict actual themes, but can sometimes elicit
groups of words representative of the genre
of texts analyzed. Topic 3 in our example is
such a topic, where the most highly ranked
words are common to the genre of biomedical
studies.

Beyond its utility as an exploratory technique
for large corpora, topic modeling is a clustering
technique that can be useful in many other
applications, for instance, in a machine learning
application that leverages both text and other
types of data.


8.3.2 Text Labeling


In Text Labeling tasks, the input is text, either
in its entirety or a fragment, and the output is
a label or a set of labels. Examples of applications
in healthcare, consumer health, and
biomedicine that use text labeling abound and
some are already discussed in Sect. 8.2, but
we list here a few examples:

• Automated coding of discharge summaries
according to diagnostic codes. This is an
example of a text labeling task where the
input is an entire document (a discharge
summary, the clinical note authored at the
end of a hospital admission by a physician)
and the output is a set of diagnostic
codes, where the codes are chosen from a
taxonomy, e.g., ICD-10. This task is typically
carried out in hospitals for billing and
administrative goals, and it is
done in a semi-automated fashion with
coders, professionals trained to select the
appropriate codes (Resnik et al. 2006).

• Sentiment analysis of hospital reviews
written by health consumers is another
example of a text labeling task (Greaves
et al. 2013). There, a short text written in
lay language is labeled according to a single
label: its polarity or sentiment. In fact,
it might even be a fragment of the text at a
time that gets labeled according to its
sentiment.
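
As one possible way to approach the sentiment example, the sketch below trains a multinomial Naive Bayes classifier on bag-of-words counts with add-one smoothing. Naive Bayes is a swapped-in baseline chosen for brevity, not the method of the cited studies; the four training reviews and all function names are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(examples):
    """examples: list of (text, label) pairs. Returns the counts used by predict()."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return label_counts, word_counts, vocabulary


def predict(model, text):
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)  # log prior
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocabulary))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy training set of hospital reviews.
training_set = [
    ("the staff were friendly and helpful", "positive"),
    ("clean ward and excellent care", "positive"),
    ("long wait and rude staff", "negative"),
    ("dirty room and poor care", "negative"),
]
model = train_naive_bayes(training_set)
print(predict(model, "friendly and excellent care"))
```

The same structure applies to document-level coding, except that the label set is much larger and a document can receive several codes.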


Most approaches to this task use classification
techniques from the field of machine learning
and thus require a training set of text inputs
and their associated labels, although rule-based
approaches are also a possibility, e.g.,
in labeling sections of clinical notes (Denny
et al. 2008). Which features to include and
how to represent them depends on the task
at hand. For instance, in automated diagnosis
coding, word-level representations described
in Sect. 8.4 have been found helpful, as well
as simple bag-of-words approaches.


Fig. 8.5 Two approaches to named entity recognition. The approach on the left is a schematic
representation of MetaMap, a knowledge-based tool for NER (sentence/line segmentation, tokenization,
part-of-speech tagging, token window generation, dictionary lookup, term normalization). The right
side depicts an abstraction of a machine learning approach to NER: training data are annotated and
represented in the required format (e.g., for "exam for metastatic disease": exam, Outside (O);
for, Outside (O); metastatic, Beginning Disease (B); disease, Inside Disease (I)), classifiers are
trained, and the trained model is applied to input text such as "History of Hodgkin's lymphoma".
Provided with the following note as input: “Indication: Ca history, exam for metastatic disease,
Impression: 1.5 cm nodule in the left mid-lung zone may contain calcium.” MetaMap will output
the following:

- C0582103: exam: Medical Examination: [hlca]: 0: 4: 0.0
- C0027627: metastatic disease: Neoplasm Metastasis: [neop]: 9: 18: 0.0
- C4284036: exam: Exam: [ftcn]: 0: 4: 0.0
- C2939420: metastatic disease: Metastatic Neoplasm: [neop]: 9: 18: 0.0


8.3.3 Sequence Labeling


The task of sequence labeling is a specific case
of text labeling. In sequence labeling, we
keep track of the order in which textual units
occur (be they words or text fragments), and the
approaches work better when labeling the entire
sequence jointly. Here we focus on a traditional
example of sequence labeling in health-related
texts, namely, named entity recognition (NER).
NER is a classic task both in general NLP and
in the health domain. The task
consists of identifying within a text the different
mentions of different types of entities
as shown in Fig. 8.5. The task is difficult
because we don’t know in advance how many 
words comprise the entity, e.g., lymphoma is 
a single word, Hodgkin's lymphoma is a two- 
word phrase, but many other terms include 
these entities: 
• non-Hodgkin’s lymphoma of lung,
• Hodgkin’s lymphoma of lung (without the
non), and
• non-Hodgkin’s lymphoma.


Each one of these phrases is a different term, 
and sometimes nested recognition might be 
needed, e.g., if a clinician is looking for all 
patients with lymphoma grouped by the type 
and location of the lesion, we will need to 
identify the full names, as well as the nested 
mentions of lymphoma. Sequence labeling is 
also difficult because terms are often ambiguous
with respect to the type of entity they
instantiate. The context in which the term 
occurs might help identify the correct entity 
type and link the entity to its class or an 
appropriate identifier in a terminology. For
instance, the term “ca” on its own in clinical
text is ambiguous between two frequent entity
types: a measurement (as in calcium) or the
neoplasm UMLS semantic type (as in cancer).
The MetaMap output in Fig. 8.5 provides UMLS
identifiers, positional information of the terms
in the text, and their semantic types. Note that,
in addition to the ambiguity of Ca, which will
not be easy to disambiguate to cancer because
the note also contains the term calcium, Exam
is also ambiguous due to the different semantic
types assigned to physical exam. Also note the
fine-grained differences in normalizing metastatic
disease. The output of the machine learning
model trained to label disease name sequences
will identify Hodgkin’s lymphoma as disease.
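
The knowledge-based route (token windows followed by dictionary lookup) can be sketched as below. This is not MetaMap; it is a minimal illustration in which the `LEXICON` entries, the single "Disease" label, and the `find_mentions` function are assumptions. It returns every window that matches the lexicon, so nested mentions such as lymphoma inside Hodgkin's lymphoma are kept. Straight apostrophes are used in the code for simplicity.

```python
# Hypothetical mini-lexicon; a real system would draw on a terminology.
LEXICON = {
    "lymphoma": "Disease",
    "hodgkin's lymphoma": "Disease",
    "hodgkin's lymphoma of lung": "Disease",
    "non-hodgkin's lymphoma": "Disease",
    "non-hodgkin's lymphoma of lung": "Disease",
}


def find_mentions(text, lexicon, max_window=5):
    """Return (start, end, term, type) for every token window found in the lexicon."""
    tokens = text.lower().split()
    mentions = []
    for start in range(len(tokens)):
        last = min(start + max_window, len(tokens))
        for end in range(start + 1, last + 1):
            candidate = " ".join(tokens[start:end])
            if candidate in lexicon:
                mentions.append((start, end, candidate, lexicon[candidate]))
    return mentions


mentions = find_mentions("History of Hodgkin's lymphoma of lung", LEXICON)
for mention in mentions:
    print(mention)
```

Keeping all overlapping matches supports the grouping scenario described above; a system that needs a single answer per span would add a selection step, for example preferring the longest match.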
As in text labeling, sequence labeling
relies on supervised approaches. Training data
with fully annotated sequences are required
as training examples, see Fig. 8.5. While
probabilistic approaches like Hidden Markov 
Models and Conditional Random Fields 
were historically used for this task, neural 
architectures have shown impressive results 
in sequence learning. Recurrent neural net- 
works (LSTM, bi-LSTM, GRUs) in particu- 
lar encode in their architecture the ability to 
handle almost arbitrarily long sequences and 
keep track of the relevant words within them 
(whether long-term dependency or short-term 
dependency) (Habibi et al. 2017). 
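
Whatever model produces the labels, its output is one tag per token, and the entity mentions have to be read back from the tag sequence. The sketch below assumes the common `B-`/`I-`/`O` tag spelling (the figure uses Beginning, Inside, and Outside) and a hypothetical `bio_to_spans` helper.

```python
def bio_to_spans(tokens, tags):
    """Collect (mention text, entity type) pairs from token-level B/I/O tags."""
    spans, current, label = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append((" ".join(current), label))
            current, label = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            current.append(token)
        else:  # "O", or an "I-" tag that does not continue the open mention
            if current:
                spans.append((" ".join(current), label))
            current, label = [], None
    if current:
        spans.append((" ".join(current), label))
    return spans


tokens = ["exam", "for", "metastatic", "disease"]
tags = ["O", "O", "B-Disease", "I-Disease"]
print(bio_to_spans(tokens, tags))
```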


8.3.4 Relation Extraction 


Similarly to NER, relation extraction can be 
decomposed into relation detection and deter- 
mination of the relation type. Until recently, 
research in biomedical relation extraction was 
limited to protein-protein interactions and 
gene binding, primarily due to availability of 
resources created for BioCreative evaluations
(Krallinger et al. 2008). In the clinical domain, 
relations of interest are those between medical 
problems and treatments, problems and tests, 
problems and genes, and genes and treatments 
(the latter two becoming increasingly impor- 
tant due to interest in translational research 
and precision medicine). The resources for 
extraction of some of these relations are avail- 
able due to the i2b2 relation extraction chal- 
lenge (Uzuner et al. 2011). Another source of 
literature partially annotated for entities and 
relations is the pharmacogenomics knowledge 
base (PharmGKB) (Barbarino et al. 2018). 
Among other activities, PharmGKB provides 
literature annotated for genetic variants and 
gene-drug-disease relationships and anno- 
tates associations between genetic variants 
and drugs, and drug pathways. 

Initially, the existence of relations between 
entities was assumed if the entities co-occurred 
more frequently than by chance in a unit of 
text, such as a MEDLINE abstract or a sen- 
tence. Mutual information, chi-square and 
log-likelihood ratio were often used to extract 
co-occurrence-based relations (Hakenberg 
et al. 2012). This approach has two draw- 
backs: (1) the results are often noisy and (2) 
the nature of the relations is undefined. 
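
As an illustration of co-occurrence statistics, the sketch below computes pointwise mutual information (a per-pair form of mutual information) over text units, each reduced to the set of entities it mentions. The `pmi` function and the toy units are assumptions for this example. The score says only that two entities co-occur more often than chance would predict; it says nothing about the nature of the relation.

```python
import math


def pmi(units, x, y):
    """Pointwise mutual information (log base 2) of entities x and y over text units."""
    n = len(units)
    count_x = sum(1 for unit in units if x in unit)
    count_y = sum(1 for unit in units if y in unit)
    count_xy = sum(1 for unit in units if x in unit and y in unit)
    if count_xy == 0:
        return float("-inf")  # the pair never co-occurs
    return math.log2((count_xy / n) / ((count_x / n) * (count_y / n)))


# Invented toy data: each set holds the entities found in one sentence.
units = [
    {"aspirin", "headache"},
    {"aspirin", "headache"},
    {"aspirin", "bleeding"},
    {"insulin", "diabetes"},
]
print(pmi(units, "insulin", "diabetes"))
print(pmi(units, "aspirin", "diabetes"))
```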

Both knowledge-based and statistical
approaches are used to extract specific rela- 
tions. Knowledge-based approaches often 
involve lexical-semantic or syntactic-semantic 
patterns. For example, to extract a relation- 
ship between a complication of a patient’s 
health condition and its cause, we can define a 
“Complications of” relation. The expressions: 
“status post”, “secondary to”, and others 
indicate the presence of the “Complications 
of” relation. These expressions can be combined
with semantic categories to form patterns
for a rule-based system. One example is the
pattern “[concept Problem][s/p][concept any|word
noun]”, in which “s/p” represents the
set of indicator expressions. A rule-based
system that extracts “treats” relations can 
include the following rule: 


If dependency path contains a treatment indicator, Procedure and Problem → (treats, Procedure, Problem)


where treatment indicators are stored in a
gazetteer and Procedure and Problem are 
identified by one of the NER tools presented 
above. As is common for rule-based methods,
this approach has relatively low recall
and is potentially brittle. The currently better- 
performing systems use supervised machine 
learning. Most supervised machine learning 
methods assume that the entities are identified 
by a NER tool and require positive examples 
in which the relations and entities are anno- 
tated and negative examples in which there 
are no relations between annotated entities. 
Given these examples, classifiers (Support 
Vector Machines in the near past, and cur- 
rently Deep Learning frameworks (Peng et al. 
2018)) are trained to determine if a specific 
relationship between the candidate entities 
exists. If several relations are possible between 
the two entities, for example, drug causing a 
problem or drug treating a problem, a two- 
step approach can be applied: first determin- 
ing if a relation exists; and then determining 
the type of the relation. As with other clas- 
sification tasks, feature selection is one of 
the factors determining the success of the 
method. In addition to the standard “bag of 
words” in the windows preceding, following, 
and between the two concepts, features often 
used in relation extraction are: the semantic 
types of the concepts; the distance between 
the concepts, parts of speech, paths to the 
root of the parse tree and dependency rela- 
tions between the concepts. 
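
The surface features listed above can be assembled as in the following sketch. The `relation_features` function, the span convention (token offsets with an exclusive end), and the example sentence are assumptions for illustration. Parts of speech and parse-tree paths are omitted because they require a tagger and a parser.

```python
def relation_features(tokens, span1, span2, types, window=2):
    """Surface features for a candidate relation; span1 must precede span2."""
    (start1, end1), (start2, end2) = span1, span2
    return {
        "before": tokens[max(0, start1 - window):start1],  # words preceding concept 1
        "between": tokens[end1:start2],                    # words between the concepts
        "after": tokens[end2:end2 + window],               # words following concept 2
        "distance": start2 - end1,                         # tokens separating the concepts
        "types": types,                                    # semantic types of both concepts
    }


tokens = "The patient received metformin for type 2 diabetes yesterday".split()
features = relation_features(tokens, (3, 4), (5, 8), ("Drug", "Problem"))
print(features)
```

A classifier would then consume these features, typically after converting the word lists to bag-of-words vectors.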


8.3.5 Template Filling 


Often, n-ary relations, e.g., the size, the loca- 
tion and the borders of a lesion, are of inter- 
est. Capturing such information requires 
event (frame) extraction and is viewed as a template
filling task.

Biomedical events involve a change in the 
state of biomedical objects. Examples of the 
events are gene expression, protein binding 
and regulation in the biological domain and 
medication or phenotype events in the clini- 
cal domain. Events usually involve multiple 
entities and relations between the entities and 
can be nested. Medication events, as shown in
Fig. 8.6, exemplify relatively simple events:
a medication event involves a drug, its form, 
dosage, administration route, duration, and 
indications. 
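
A medication event template and a very simple way to fill it can be sketched as follows. Everything here is an assumption made for illustration: the `MedicationEvent` fields mirror the slots named above, the tiny `DRUGS`, `FORMS`, and `ROUTES` lexicons and the regular expressions are hypothetical, and the example sentence is invented. Real systems handle far more variation.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical mini-lexicons for the sketch.
DRUGS = {"aspirin", "metformin"}
FORMS = {"tablet", "capsule"}
ROUTES = {"po": "oral", "iv": "intravenous"}


@dataclass
class MedicationEvent:
    drug: Optional[str] = None
    form: Optional[str] = None
    dosage: Optional[str] = None
    route: Optional[str] = None
    duration: Optional[str] = None
    indication: Optional[str] = None


def fill_medication_template(sentence):
    """Fill the template slots with lexicon lookups and simple patterns."""
    lowered = sentence.lower().rstrip(". ")
    event = MedicationEvent()
    for token in re.findall(r"[a-z]+", lowered):
        if token in DRUGS and event.drug is None:
            event.drug = token
        elif token in FORMS and event.form is None:
            event.form = token
        elif token in ROUTES and event.route is None:
            event.route = ROUTES[token]
    dose = re.search(r"(\d+(?:\.\d+)?)\s*(mg|mcg|g|ml)\b", lowered)
    if dose:
        event.dosage = f"{dose.group(1)} {dose.group(2)}"
    duration = re.search(r"\bfor\s+(\d+\s+(?:day|week|month)s?)\b", lowered)
    if duration:
        event.duration = duration.group(1)
    # A sentence-final "for <words>" that does not start with a number is
    # treated as the indication.
    indication = re.search(r"\bfor\s+(?!\d)([a-z ]+)$", lowered)
    if indication:
        event.indication = indication.group(1)
    return event


event = fill_medication_template("Metformin 500 mg tablet po for 30 days for diabetes.")
print(event)
```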

The complex biomolecular events are usu- 
ally more involved. For example, consider the 
phrase “SYK-TLR4 binding increases upon 
TLR4 dimerization and phosphorylation” 
(PMID: 22776094) that introduces a complex 
positive regulation event in which binding of 
spleen tyrosine kinase (SYK) to the toll-like 
receptor-4 (TLR4) is regulated by two simple 
events: TLR4 dimerization and phosphoryla- 
tion. Secondary arguments of the events may 
provide additional information about the 
event, such as the specific domain or region 
of the theme of the event. For example, in 
“binding of SYK to the cytoplasmic domain 
of toll-like receptor-4 (TLR4)”, cytoplasmic
domain is the secondary argument associated 
with the TLR4 theme of the binding event. 
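
Nested events of this kind map naturally onto a recursive data structure. The sketch below is one possible representation, not a standard schema: the `Event` class, its field names, and the event type strings are assumptions. It encodes the SYK-TLR4 example, with the binding event as the theme of a positive regulation event and the two simple events as its causes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Event:
    event_type: str
    trigger: str
    themes: List[Union[str, "Event"]]              # entities or nested events
    causes: List["Event"] = field(default_factory=list)
    site: Optional[str] = None                     # secondary argument, e.g., a domain


def participants(event):
    """Collect the entity names involved in an event, including nested events."""
    found = set()
    for argument in event.themes + event.causes:
        if isinstance(argument, Event):
            found |= participants(argument)
        else:
            found.add(argument)
    return found


binding = Event("Binding", "binding", ["SYK", "TLR4"])
dimerization = Event("Dimerization", "dimerization", ["TLR4"])
phosphorylation = Event("Phosphorylation", "phosphorylation", ["TLR4"])
regulation = Event("Positive regulation", "increases", [binding],
                   causes=[dimerization, phosphorylation])

# Secondary argument from the second example phrase.
binding_with_site = Event("Binding", "binding", ["SYK", "TLR4"], site="cytoplasmic domain")
print(sorted(participants(regulation)))
```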

Many systems for event recognition are 
built as pipelines that start with recognizing 
protein names. For example, one approach to 
extracting biomedical events on PubMed scale 
(Björne et al. 2010) is to use publicly available 
NER tools as the first step in extraction of 
protein events from PubMed abstracts. In the 
subsequent steps, the extraction pipelines rely 
on the graph representations of sentence syn- 
tax and semantics, which we will present in the 
next section. The syntactic graph generated by 
a dependency parser and the identified named 
entities are used to generate a semantic graph 
for each sentence independently. The system 
uses the graphs as the source of features to 
supervised machine learning for event trigger 
and edge detection, which are followed by the 
rule-based event construction step. 


8.4 Linguistic Knowledge and Representations


It is important to know the principles of lin- 
guistics and computational linguistics that, 
when incorporated in computational tasks, 
help achieve better results. Processing of lan- 
guage is not as simple as applying a pipeline 
of independent modules: one to determine 
tokens, one to assign part-of-speech tags to
tokens and to parse the syntax, one to inter- 
pret the meaning of a sentence, and one to 
resolve the discourse-level characteristics of 
the text. In reality, all linguistic levels influ- 
ence each other. Low-level decisions about 
how to tokenize a string impact named-entity 
recognition; determining which sense to attri- 
bute to a named entity depends on its place 
in the syntactic tree, the pragmatics of the 
text, and its place in the discourse structure. 
How to model these interactions is one of 
the primary open research questions of natu- 
ral language processing, which is currently 
addressed by modeling the tasks jointly, e.g., 
using deep learning approaches (James et al. 
2013). Although language processing is not a 
simple pipeline, the practical applications still 
often approximate the process as one, and lin- 
guistic knowledge contributes to performing 
the basic tasks and modeling the interactions. 

Linguistic levels consist of word-level
representations (tokens and morphology);
sentence-level representations (syntax and
semantics); and document-level representations
(pragmatics and discourse). Linguistic
knowledge is captured in lexicons; domain
knowledge, e.g., lists of diseases and drugs,
can be found in terminologies; and domain
semantics, such as relations among diseases
and drugs, is captured in ontologies.


8.4.1 Terminological 
and Ontological Knowledge 


In the biomedical and clinical domains, inter- 
preting text might require extensive back- 
ground knowledge, as many facts are implied, 
e.g., inferring that a patient is likely hyper- 
tensive or has an edema if loop diuretics are 
prescribed, even though there are no explicit 
mentions of either high blood pressure or 
swelling in the text. Ontologies contain some 
of the background knowledge, and some can 
be learned from text. 

The Unified Medical Language System
(UMLS), which includes the Metathesaurus,
the Semantic Network, and the SPECIALIST
Lexicon, can be used as a knowledge base
and resource for NLP tasks. MedDRA and 
RxNorm, two of the over 130 sources contrib- 
uting to the UMLS, are examples of terminolo- 
gies specific to adverse events and medications, 
respectively. They are particularly helpful in 
the clinical domain, in biosurveillance, phar- 
macovigilance and in pharmacogenomics. The 
UMLS Metathesaurus is organized by con- 
cept. It preserves the meaning and structure 
of the contributing sources and links alterna- 
tive surface representations of a concept in 
many languages. It also establishes relation- 
ships between concepts. All concepts in the 
Metathesaurus are assigned to at least one 
Semantic Type from the Semantic Network. 
Most English strings in the Metathesaurus 
also appear in the SPECIALIST Lexicon. 
The SPECIALIST Lexicon provides detailed syn-
tactic knowledge for words and phrases and 
includes a comprehensive medical vocabulary. 
It also provides a set of tools to assist in NLP, 
e.g., a lexical variant generator. 

Although ontologies and lexicons are 
maintained and regularly updated, the lan- 
guage changes, new concepts enter the lan- 
guage, and terms fall out of use and become 
obsolete (although data represented with 
the terms may persist). The biomedical and 
health domains are highly dynamic in the 
influx of new terms (e.g. new drug names, 
but also sometimes new disease names, like 
COVID-19, SARS and H1N1). Modern corpus-based
approaches compensate for that
lag by leveraging existing knowledge. For
example, all diseases in the UMLS can be 
marked in PubMed and the representation of 
disease can be learned as described in the sec- 
tion on word embeddings. A new disease such 
as COVID-19 can then be recognized as such 
using its context and the learned representa- 
tion of Disease. 


8.4.2 Word-Level Representations 


Tokens
Tokens are basic language units defined based
on their utility for solving a specific language
processing task. The units include morphemes,
words (often morpheme sequences),
numbers, symbols (e.g. mathematical opera- 
tors), and punctuation. The notion of what 
constitutes a token is far from trivial. The pri- 
mary indication of a token in general English 
is the occurrence of white space before and 
after it; however there are many exceptions: 
a token may be followed by certain punctua- 
tion marks without an intervening space, such 
as by a period, comma, semicolon, or question
mark, or may have a “-” in the middle, as
shown in Example 8.1:


Example 8.1


... a decoy receptor for IL-2 in the T cell ...
... IL 2-regulated genes ...
... Il2 and Csf2 are increased as T-cells ...


The three snippets in Example 8.1 contain
three different spellings for IL2. Ideally, we 
would like to link all three surface forms to 
interleukin-2. To achieve this normalization, 
we need a tokenizer that will treat the dash 
and the white space equally and segment IL2 
into IL and 2 (or merge IL and 2 in the first 
two example snippets.) 
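
A normalization step of this kind can be sketched with a regular expression. The pattern and the `normalize_interleukin` name are assumptions made for this illustration, and the rule covers only the interleukin spellings discussed here (IL followed by an optional dash or space and a number, in any letter case).

```python
import re

# "IL", then an optional dash or white space, then the number.
_INTERLEUKIN = re.compile(r"\b(IL)[\s-]?(\d+)\b", re.IGNORECASE)


def normalize_interleukin(text):
    """Rewrite IL-2, IL 2, Il2, and similar forms to one surface form (IL2)."""
    return _INTERLEUKIN.sub(lambda match: f"IL{match.group(2)}", text)


for snippet in [
    "a decoy receptor for IL-2 in the T cell",
    "IL 2-regulated genes",
    "Il2 and Csf2 are increased as T-cells",
]:
    print(normalize_interleukin(snippet))
```

Once the surface forms agree, a single lexicon entry can link all of them to interleukin-2.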

In biomedicine, periods and other punctuation
marks can be part of words (e.g., p.o.
means per os (by mouth; orally) in the clinical
domain, and M03F4.2A is a gene name that
includes a period). Moreover, punctuation
marks are used inconsistently, thereby complicating
the tokenization process: in clinical
text, slashes appear in abbreviations (d/c for
“discontinue”, w/o for without, s/p for status
post) as well as in measurements and units.
In addition, chemical and biological names often
include parentheses, commas and hyphens,
for example (w)adh-2, which also complicate
the tokenization process. For example,
replacing non-alphanumeric characters with
spaces will prevent us from correctly identifying
entities in the following sentence: “PBMC
of HLA-DR3(+) but not HLA-DR3(-) cured
TB patients.” For agglutinative languages,
tokenization needs to be augmented by segmentation
as shown in Example 8.2.


Example 8.2


Rückschmerzen nach Brustwirbelbruch

Ich habe oft Schmerzen im Rücken und in
der Schulter. [Back pain after thoracic fracture:
I often have pain in the back and in the
shoulder].


In this example blog post, Rückschmerzen in
the title needs to be split into Rück[en] (back)
and Schmerzen (pains). In addition, to identify
the reason for back pain, information about
the fracture (Bruch) has to be separated from
its location (thoracic spine: Brust + Wirbel).
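
Segmentation of such compounds can be approximated with a dictionary-driven split. The sketch below is a simple recursive, longest-prefix-first strategy, one of several possible approaches; the `COMPONENTS` lexicon and the `split_compound` function are assumptions for illustration.

```python
# Hypothetical lexicon of compound components (lowercased).
COMPONENTS = {"rück", "schmerzen", "brust", "wirbel", "bruch"}


def split_compound(word, lexicon):
    """Split a compound into known components; return [] if no full split exists."""
    word = word.lower()
    for end in range(len(word), 0, -1):  # try the longest prefix first
        head = word[:end]
        if head in lexicon:
            if end == len(word):
                return [head]
            rest = split_compound(word[end:], lexicon)
            if rest:
                return [head] + rest
    return []


print(split_compound("Rückschmerzen", COMPONENTS))
print(split_compound("Brustwirbelbruch", COMPONENTS))
```

A word with no complete split, such as one missing from the lexicon, yields an empty list and can be kept as a single token.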

Some thought should be given to upper 
and lower case in the original text. In some 
situations, it makes sense to keep all tokens 
in lowercase, as it reduces variations in 
vocabulary. But in others, it might hinder 
further NLP: in both the biology and clini- 
cal domains, there are many acronyms, which 
when lowercased, might be confused with reg- 
ular words, such as CAT (computerized axial 
tomography) and cat or FISH (fluorescent in 
situ hybridization) and fish. 
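
The punctuation and case concerns raised in this subsection can be sketched together in a small tokenizer. This is an illustration under stated assumptions, not a recommended production tokenizer: the `PROTECTED` abbreviation list, the rule of keeping fully uppercase tokens as they are, and both function names are choices made for this example.

```python
import re

PROTECTED = {"p.o.", "d/c", "w/o", "s/p"}  # hypothetical list of abbreviations
TRAILING = ".,;:?!"


def naive_tokenize(text):
    """Replace every non-alphanumeric character with a space, then split."""
    return re.sub(r"[^A-Za-z0-9]+", " ", text).split()


def tokenize(text):
    """Split on white space, detach trailing punctuation, keep uppercase tokens."""
    tokens = []
    for chunk in text.split():
        trailing = []
        while (len(chunk) > 1 and chunk[-1] in TRAILING
               and chunk.lower() not in PROTECTED):
            trailing.append(chunk[-1])
            chunk = chunk[:-1]
        # Fully uppercase tokens (likely acronyms) keep their case.
        tokens.append(chunk if chunk.isupper() else chunk.lower())
        tokens.extend(reversed(trailing))
    return tokens


sentence = "PBMC of HLA-DR3(+) but not HLA-DR3(-) cured TB patients."
print(naive_tokenize(sentence))
print(tokenize(sentence))
print(tokenize("Take p.o. daily; CAT scan s/p fall."))
```

The naive version loses the (+) and (-) markers that distinguish the two HLA-DR3 groups, while the second version keeps them and also keeps CAT distinct from cat.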

Morphology concerns the combination of 
morphemes (roots, prefixes, suffixes) to pro- 
duce words or lexemes, where a lexeme gener- 
ally constitutes several forms of the same word 
(e.g. activate, activates, activating, activated, 
activation). There has been little work con- 
cerning morphology in the field of NLP in the 
biomedicine and health domains, especially 
for the English language. In other languages 
that are morphologically rich (e.g., Turkish, 
German, and Hebrew), encoding morpho- 
logical knowledge is necessary. For example, 
morphological proximity can identify impor- 
tant terminological relations (Claveau and 
L’Homme 2005) or generate definitions of 
medical terms (Deleger et al. 2009b). 


Word Embeddings

Word embedding is a feature learning technique
in NLP in which words or phrases from
the vocabulary are mapped to numeric vectors
such that similar words have similar vectors. Word
embeddings have two functions: they capture
the meaning of a word using its context and,
at the same time, condense the representation
of the word into a dense vector, e.g., each word
of a vocabulary of thousands of words can be
represented by a 300-dimensional vector. The
idea of “recognizing a word by the company it
keeps” originated from Firth (1957). One of the earliest
corpus-based implementations, Brown clus- 
tering, aggregated words into classes using 
hierarchical clustering (Turian et al. 2010), 
and now deep learning provides more robust 
approaches to pre-computing representations 
of words using large corpora, e.g., 2.5 bil- 
lion Wiki words. The growing family of text 
embeddings and pre-trained language mod- 
els started with Word2Vec (Mikolov et al. 
2013), and now includes GloVe (Pennington 
et al. 2014), fastText (Bojanowski et al. 2016), 
ELMo (Peters et al. 2018), BERT (Devlin 
et al. 2018), GPT (Radford et al. 2019), and 
BART (Lewis et al. 2019), to name a few. The 
language models, such as BERT, have been 
shown to be open to fine-tuning for specific 
NLP tasks. Domain-specific embeddings and 
models trained on PubMed and clinical text 
also exist, and have been shown superior to 
those pre-trained on the open-domain text. 
BioBert, for example, significantly advanced 
the state-of-the-art for biomedical named 
entity recognition, relation extraction, and 
question answering (Lee et al. 2019). 
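The claim that similar words have similar vectors is usually checked with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings have hundreds of dimensions and are learned from corpora, and the values in `TOY_VECTORS` are assumptions, not output of any model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors; not taken from any trained model.
TOY_VECTORS = {
    "fever": [0.9, 0.1, 0.3],
    "pyrexia": [0.8, 0.2, 0.3],
    "fracture": [0.1, 0.9, 0.2],
}

near = cosine_similarity(TOY_VECTORS["fever"], TOY_VECTORS["pyrexia"])
far = cosine_similarity(TOY_VECTORS["fever"], TOY_VECTORS["fracture"])
print(f"fever~pyrexia: {near:.2f}, fever~fracture: {far:.2f}")
```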


■ Spelling Variants and Errors

In addition to common American and British
English spelling variants, such as -or/-our
(e.g., color/colour), e/ae, e/oe, -er/-re (e.g.,
liter/litre), and -ize/-ise, generic drug names
may also differ (e.g., adrenaline (British) vs.
epinephrine (American)). The complex origins and
spelling of biomedical terms often lead to
misspellings. Misspellings in the published
literature are relatively rare, but the queries 
submitted to the search engines are often 
misspelled. Clinical notes and informal com- 
munications also often contain misspelled 
terms. Clinicians, when typing free text in the 
EHR, do so under time pressure and gener- 
ally do not have the time to proofread their 
notes carefully. In addition, they frequently 
use abbreviations (e.g. HF for Hispanic female 
or heart failure, 2/2 for secondary to or a date), 
many of which are non-standard and ambiguous.
When patients and health consumers post
content online, misspellings, typos,
and non-standard abbreviations are pervasive,
as in the rest of the social Web. Ignoring
these variations may cause an NLP system to 
lose or misinterpret information. At the same 
time, errors can be introduced when correct- 
ing the typos automatically. For instance, it 
is not trivial to correct hypetension automati- 
cally without additional knowledge because 
it may refer to hypertension or hypotension. 
This type of error is troublesome not only 
for automated systems, but also for clinicians 
when reading a note, as this phenomenon is 
aggravated by the large number of short, misspelled
words in notes. In the clinical domain,
misspellings can be found even in the defini- 
tions of clinical variables. 
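The hypetension example can be made concrete with edit distance: both candidate corrections are exactly one edit away, so distance alone cannot choose between them. The sketch below is a standard Levenshtein implementation; the vocabulary passed to `closest_terms` is a hypothetical stand-in for a real lexicon.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + (char_a != char_b)  # substitution or match
            ))
        previous = current
    return previous[-1]


def closest_terms(word, vocabulary):
    """Return all vocabulary terms tied for the smallest edit distance."""
    distances = {term: edit_distance(word, term) for term in vocabulary}
    best = min(distances.values())
    return sorted(term for term, d in distances.items() if d == best)


# Hypothetical mini-vocabulary for illustration.
print(closest_terms("hypetension", ["hypertension", "hypotension", "hypothermia"]))
```

Because the two candidates tie, a corrector needs additional knowledge, such as the surrounding context of the note, to pick one safely.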


8.4.3 Sentence-Level 
Representations 


■ Sentence Boundary Detection
Detecting the beginning and end of a sentence 
may seem like an easy task, but it is highly 
domain dependent. Not all sentences end 
with a punctuation mark (this is especially 
true in texts with minimal editing, such as 
online patient posts and clinical notes entered 
by physicians). Sentences in scientific publica- 
tions are usually well-formed and delimited by 
final punctuation, primarily a period. Some 
care has to be taken to avoid breaking up sen- 
tences on periods used in abbreviations (e.g., 
vs.) and in honorifics, chemical names, and 
decimal numbers (as discussed in tokenization).
For such well-edited text, an off-the-shelf sentence
tokenizer is expected to be highly accurate.

Informal biomedical text is much
harder to split into meaningful utterances.
Clinical notes often contain table-like structures
and lists. Moreover, some electronic
health records might enforce a certain line
length, which could be violated by, for example,
a de-identification tool. Therefore, custom
solutions might be needed to detect sentence
boundaries in these texts.
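The precautions listed above for periods in abbreviations and decimal numbers can be sketched as a small rule-based splitter. This is an illustrative baseline, not a replacement for a trained sentence tokenizer; the abbreviation list `ABBREVIATIONS` is a hypothetical sample and would need to be extended for real clinical text.

```python
import re

# Hypothetical sample of tokens whose final period does not end a sentence.
ABBREVIATIONS = {"vs.", "e.g.", "i.e.", "dr.", "mr.", "mrs.", "al."}

# A candidate boundary is ., ! or ? followed by whitespace and then an
# uppercase letter or a digit. A period inside "2.5" is never a candidate
# because it is not followed by whitespace.
BOUNDARY = re.compile(r"[.!?](?=\s+[A-Z0-9])")


def split_sentences(text, abbreviations=ABBREVIATIONS):
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        end = match.end()
        last_token = text[start:end].split()[-1].lower()
        if last_token in abbreviations:
            continue  # the period belongs to an abbreviation
        sentences.append(text[start:end].strip())
        start = end
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)  # text without final punctuation is kept
    return sentences


note = "Dr. Smith compared drug A vs. Drug B. The dose was 2.5 mg. No fever"
for sentence in split_sentences(note):
    print(sentence)
```

The last fragment has no final punctuation, as is common in clinical notes, and is still returned as a unit.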

Syntax concerns the categorization of the 
words in the language, and the structure of the 
phrases and sentences. Each word belongs to 
one or more parts of speech in the language,
such as noun (e.g. chest), adjective (e.g. mild),
or tensed verb (e.g. improves). Lexemes can
consist of more than one word as in foreign 
phrases (ad hoc), prepositions (along with), 
and idioms (follow up, on and off). Lexemes 
combine in well-defined ways and according 
to their parts of speech, to form sequences of 
words or phrases, such as noun phrases (e.g. 
severe chest pain), adjectival phrases (e.g. 
painful to touch), or verb phrases (e.g. has 
increased). Each phrase generally consists of a 
main part of speech and modifiers, e.g. nouns 
are frequently modified by adjectives, while 
verbs are frequently modified by adverbs. The 
phrases then combine in well-defined ways to 
form sentences (e.g. “he complained of severe 
chest pain” is a well-formed sentence, but 
“pneumococcal vaccine how often?” is not). 
General English imposes many restric- 
tions on the formation of sentences, e.g. every 
sentence requires a verb, and count nouns 
(like cough) require an article (e.g., a or the). 
Clinical language, in contrast, is often tele- 
graphic, relaxing many of these restrictions 
of the general language to achieve a highly 
compact form. For example, clinical language 
allows all of the following as sentences: the 
cough worsened, cough worsening, and cough. 
Because the community widely uses and 
accepts these alternate forms, they are not 
considered ungrammatical, but constitute a 
sublanguage (Friedman et al. 2004). 


■ Representation of Syntactic Knowledge
Phrases and sentences can be represented as
a sequence where each word is coupled with
its corresponding part of speech as shown in
Fig. 8.7. For example, Severe joint pain can
be represented as Severe/adjective joint/noun
pain/noun. Formalisms that can be used
to represent syntactic linguistic knowledge
include probabilistic context-free grammars
(Jurafsky and Martin 2019), which, along
with dependency formalisms, are widely used
in language processing.
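The word/part-of-speech sequence described above maps naturally onto a list of pairs. The sketch below looks tags up in a hypothetical toy lexicon (`TOY_LEXICON`); a real tagger would choose tags from context rather than from a fixed table.

```python
# Hypothetical toy lexicon mapping lowercased words to parts of speech.
TOY_LEXICON = {"severe": "adjective", "joint": "noun", "pain": "noun"}


def tag_sequence(phrase, lexicon=TOY_LEXICON):
    """Pair each whitespace-separated word with its part of speech."""
    return [(word, lexicon.get(word.lower(), "unknown")) for word in phrase.split()]


def format_tagged(tagged):
    """Render pairs in the word/tag notation used in the text."""
    return " ".join(f"{word}/{tag}" for word, tag in tagged)


print(format_tagged(tag_sequence("Severe joint pain")))
```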

Fig. 8.7 A syntactic parse tree for the sentence “The
patient had pain in lower extremities” according to the
context-free grammar is shown above the sentence.
Notice that the terminal nodes in the tree correspond to
the syntactic categories of the words in the sentence. The
parse tree below the sentence is in a dependency grammar
framework

Dependency is a binary asymmetrical relation
between a head and its dependents or
modifiers. The head of a sentence is usually a
tensed verb. Thus dependency structures are
basically directed relations between words.
For example, in the sentence, “The patient
had pain in lower extremities”, the head of
the sentence is the verb “had”, which has two
arguments, a subject noun “patient” and an
object noun “pain”; “The” modifies or is
dependent on “patient”, while “in” is dependent
on “pain”, “extremities” is dependent on
“in”, and “lower” is dependent on “extremities”.
As such, in a dependency grammar, the
relations among words and the concept of
head in particular (e.g. “extremities” is the
head of “lower”) are closer to the semantics of
a sentence.
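A dependency structure for the example sentence can be stored as a list of (dependent, head) pairs. The arcs below are one reading of the example, stated here as an assumption: “pain” is taken as the object of “had” and “extremities” as the dependent of “in”. The helper names are illustrative.

```python
# (dependent, head) pairs for "The patient had pain in lower extremities".
# The attachments are an assumed reading of the example sentence.
ARCS = [
    ("The", "patient"),
    ("patient", "had"),
    ("pain", "had"),
    ("in", "pain"),
    ("extremities", "in"),
    ("lower", "extremities"),
]


def find_root(arcs):
    """The root is the only head that is not itself a dependent."""
    dependents = {dependent for dependent, _ in arcs}
    heads = {head for _, head in arcs}
    roots = heads - dependents
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {sorted(roots)}")
    return next(iter(roots))


def dependents_of(head, arcs):
    """Direct dependents of a head, in sentence order."""
    return [dependent for dependent, h in arcs if h == head]


print(find_root(ARCS))
print(dependents_of("had", ARCS))
```

Reading the arguments of the verb directly off the arcs is what makes this representation convenient for downstream semantic processing.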


■ Semantics
Semantics concerns the meaning or interpre- 
tation of words, phrases and sentences, gen- 
erally associated with real-world applications. 
There are many different theories for repre- 
sentation of meaning, such as logic-based 
(e.g., first order logic and lambda calculus), 
frame-based, conceptual graph formalisms, 
and distributional semantics, i.e., vector rep- 
resentations learned from data (Jurafsky and 
Martin 2019). 

Each word has one or more meanings, or
word senses (e.g. resistance, as in psychological
resistance, social resistance, multidrug
resistance, and capillary resistance), and other
terms may modify the senses (e.g. no, as in 
no fever, or last week as in fever last week). 
Recognizing polysemy or ambiguity of the 
word is important. Complementary to the 
phenomenon of polysemy, there are often 
terms that are different variations of the same 
concepts (synonymy). For instance, the term 
blood sugar is often used by health consum- 
ers to refer to a glucose measurement, but it is 
used rarely if ever in the clinical literature or 
in clinical notes. 

Additionally, the meanings of the words 
combine to form a meaningful sentence, as in 
“there was thickening in the renal capsule”.
Representation of the semantics of general 
language is extremely important, but the 
underlying concepts are not as clear or uni- 
form as those concerning syntax. Interpreting 
the meaning of words and text in general is 
very challenging. In biomedical informatics,
interpreting the meaning of text focuses
largely on entity linking, i.e., representing a
word or group of words with a unique semantic
concept from a relatively small number of
well-defined semantic types (e.g. medication,
gene, disease, body part, or organism). The
semantics of phrases and sentences is also 
restricted to a smaller set of patterns than 
in general language (e.g. medication-treats- 
disease, gene-interacts-with-gene). 
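Entity linking as described above can be sketched as a dictionary lookup that maps surface terms, including synonyms, to a concept identifier and a semantic type. Everything in `TOY_CONCEPTS` is hypothetical: the identifiers are invented for illustration and do not come from any real terminology.

```python
import re

# Hypothetical concept table: surface term -> (concept id, semantic type).
# Synonyms share one concept id.
TOY_CONCEPTS = {
    "blood sugar": ("C_GLUCOSE_MEASUREMENT", "laboratory test"),
    "glucose measurement": ("C_GLUCOSE_MEASUREMENT", "laboratory test"),
    "pneumonia": ("C_PNEUMONIA", "disease"),
    "aspirin": ("C_ASPIRIN", "medication"),
}


def link_entities(text, concepts=TOY_CONCEPTS):
    """Return (offset, term, concept id, semantic type) tuples, by position."""
    lowered = text.lower()
    found = []
    for term, (concept_id, semantic_type) in concepts.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            found.append((match.start(), term, concept_id, semantic_type))
    return sorted(found)


for entity in link_entities("Blood sugar was high; aspirin given."):
    print(entity)
```

Exact string matching ignores ambiguity and spelling variation, so it serves only as a baseline for the harder cases discussed in this section.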

There are often several ways to express a 
particular medical concept as well as numer- 
ous ways to express modifiers for that concept. 
For example, ways to express severity include 
faint, mild, borderline, 1+, 3rd degree, severe, 
extensive, and moderate. Often, to complicate 
matters, modifiers can be composed or nested. 
For instance, in the phrase “no improvement 
in pneumonia,” improvement is a change 
modifier that modifies the concept pneumo- 
nia, and no is a negation marker that modi- 
fies improvement (not pneumonia). Complex 
semantic structures containing nesting can be 
represented using a semantic grammar, which 
is a context free grammar based on seman- 
tic categories. An alternative representation 
would facilitate processing by flattening the 
nesting. In this case, some information may 
be lost but ideally only information that is
not critical. For example, slightly improved
may not be clinically different from improved. 
Since this type of information is fuzzy and 
imprecise, the loss of information may not be 
significant. However, the loss of a negation 
modifier would be significant. Another such 
example concerns hedging, which frequently 
occurs in radiology reports as well as in the 
scientific articles. Implementing a semantic 
grammar would require a large corpus that 
has been annotated with both syntactic and 
semantic information. Since a semantic gram- 
mar is domain and/or application specific, 
annotation involving the phrase structure 
would be costly and not portable, and there- 
fore is not generally done. 
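The nested structure for “no improvement in pneumonia” and its flattened alternative can be illustrated with plain dictionaries. The field names (`concept`, `modifiers`, `type`, `value`) are assumptions chosen for this sketch, not a published schema. The flattening below keeps the path to each modifier in its key, so the negation remains attached to the change modifier rather than to pneumonia.

```python
# Nested frame: "no" modifies "improvement", which modifies "pneumonia".
nested = {
    "concept": "pneumonia",
    "modifiers": [
        {
            "type": "change",
            "value": "improvement",
            "modifiers": [{"type": "negation", "value": "no"}],
        }
    ],
}


def flatten(frame):
    """Flatten nested modifiers into dotted keys such as 'change.negation'."""
    flat = {"concept": frame["concept"]}

    def visit(modifiers, prefix):
        for modifier in modifiers:
            key = prefix + modifier["type"]
            flat[key] = modifier["value"]
            visit(modifier.get("modifiers", []), key + ".")

    visit(frame.get("modifiers", []), "")
    return flat


print(flatten(nested))
```

A cruder flattening that dropped the key prefix would attach the negation to pneumonia itself, which is exactly the significant loss of information the text warns about.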


8.4.4 Document-Level 
Representations 


As is the case for text in general, documents 
within the biomedical domain are expected 
to have a certain structure. Text or discourse 
structure refers to the way in which authors 
organize information within documents. The 
organization of biomedical text is reflective of 
both the type of information being conveyed 
as well as its intended audience. The structure 
of biomedical text aids in its comprehension 
by the reader, and can be utilized in perform- 
ing various natural language processing tasks. 
The structure of biomedical text can be exam- 
ined at a local or global level. The local level 
primarily concerns coherence and cohesion, 
aspects of structure that connect text and give 
it meaning, whereas the global level concerns 
aspects of the overall organization and rhe- 
torical structure of a document, such as its 
sectioning. 

The cohesive devices responsible for the 
local structure of text play an important role 
in comprehension of biomedical documents. 
The recognition of phenomena such as coor- 
dinating constructions, anaphora, as well as 
ellipsis is needed to represent biomedical text 
at the document level. In this section, we exem- 
plify discourse processing with automated 
resolution of referential expressions and then 
discuss pragmatics, which concerns everything 
extraneous to the text that contributes to its
meaning and context. In clinical and health-
consumer language processing, the context is
sometimes readily accessible and sometimes 
easily extractable from datasets, e.g., medica- 
tion orders, who wrote the text, at what time, 
etc. Interaction patterns can be inferred from 
the datasets as well. 


8.4.4.1 Automated Resolution 
of Referential Expressions 
Determining which words or phrases in a text
refer to the same entity, called coreference
resolution, can draw on both syntactic and
semantic information in the text.
Syntactic information for resolving refer- 
ential expressions includes: 
- Agreement of syntactic features between
  the referential phrase and potential
  referents
- Recency of potential referents (nearness to
  referential phrase)
- Syntactic position of potential referents
  (e.g. subject, direct object, object of
  preposition)
- The pattern of transitions of topics across
  the sentences


Syntactic features that aid the resolution 
include such distinctions as singular/plural, 
animate/inanimate, and subjective/objective/ 
possessive. For instance, the inanimate pro- 
noun “it” usually refers to things, but some- 
times does not refer to anything when it 
occurs in cleft constructions, such as “it was
noted”, “it was decided to” and “it seemed 
likely that”. 

Referential expressions are usually very 
close to their referents in the text. The syn- 
tactic position of a potential referent is an 
important factor. For example, a referent in 
the subject position is a more likely candi- 
date than the direct object, which in turn is 
more likely than an object of a preposition. 
Centering theory accounts for reference by 
noting how the center (focus of attention) of 
each sentence changes across the discourse 
(Grosz et al. 1995). In this approach, resolu- 
tion rules attempt to minimize the number 
of changes in centers. Semantic information
for resolving referential expressions involves 
consideration of the semantic type of the 
expression and the way it relates to potential 
referents (Hahn et al. 1999; Kilicoglu 2016). 
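The syntactic cues listed in this section (agreement, recency, and syntactic position) can be combined in a small rule-based resolver. This is a sketch under stated assumptions: the `Mention` fields, the weights in `ROLE_WEIGHT`, and the choice to rank syntactic position above recency are all illustrative, and centering and semantic type checks are left out.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    text: str
    position: int  # token index in the discourse
    number: str    # "singular" or "plural"
    animate: bool
    role: str      # "subject", "object", or "prep_object"


# Hypothetical preference: subject > direct object > object of preposition.
ROLE_WEIGHT = {"subject": 3, "object": 2, "prep_object": 1}


def resolve(pronoun, candidates):
    """Pick the antecedent that agrees with the pronoun and ranks highest."""
    compatible = [
        c for c in candidates
        if c.position < pronoun.position      # antecedent precedes pronoun
        and c.number == pronoun.number        # agreement in number
        and c.animate == pronoun.animate      # agreement in animacy
    ]
    if not compatible:
        return None
    # Rank by syntactic position first, then by recency (larger index).
    return max(compatible, key=lambda c: (ROLE_WEIGHT.get(c.role, 0), c.position))


# "An infiltrate was noted in right upper lobe. It was patchy."
infiltrate = Mention("infiltrate", 1, "singular", False, "subject")
lobe = Mention("lobe", 7, "singular", False, "prep_object")
it = Mention("It", 8, "singular", False, "subject")

print(resolve(it, [infiltrate, lobe]).text)
```

If recency alone were used, the resolver would pick “lobe”; giving the subject position more weight selects “infiltrate”.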


8.4.5 Pragmatics 


Pragmatics concerns how the intent of the 
author of the text, or, more generally, the con- 
text in which the text is written, influences the 
meaning of a sentence or a text. For example, 
in a mammography report “mass” generally
denotes a breast mass, whereas in a radiological
report of the chest it denotes a mass in the lung. In
yet a different genre of texts, like a religious
journal, it is likely to denote a ceremony.
Similarly, in a health care setting “he drinks 
heavily” is assumed to be referring to alcohol 
and not water. In these two examples, prag- 
matics influences the meaning of individual 
words. It can also influence the meaning of 
larger linguistic units. For instance, when 
physicians document the chief-complaint sec- 
tion of a note, they list symptoms and signs, 
as reported by the patient. The presence of a 
particular symptom, however, does not imply 
that the patient actually has the symptom. 
Rather, it is understood implicitly by both the 
author of the note and its reader that this is 
the patient’s impression rather than the truth. 
Thus, the meaning of the chief-complaint 
section of a note is quite different from the 
assessment and plan, for instance. 

Another pragmatic consideration is the 
interpretation of pronouns and other ref- 
erential expressions (there, tomorrow). For 
example, in the two following sentences “An 
infiltrate was noted in right upper lobe. It was 
patchy”, the pronoun “it” refers to “infiltrate” 
and not “lobe”. In a sentence containing the 
term “tomorrow”, it would be necessary to 
know when the note was written in order to 
interpret the actual date denoted by “tomor- 
row”. As mentioned above, in the biomedical 
domain, pragmatics can be encoded through 
the semantic lexicon and rules about the dis- 
course of a text. 
8.5 Practical Considerations

Recent years have seen a steadily growing
demand for biomedical language processing.
Traditionally manual tasks, such
as assigning medical billing codes for reim- 
bursement, indexing biomedical literature, 
populating biological knowledge bases, and 
providing evidence for clinical decision sup- 
port are assisted by information extraction 
and classification tools. With few exceptions, 
such tools are not yet widely used, and the 
need for them exceeds their supply. 

When embarking on adding natural lan- 
guage processing to the workflow, practitio- 
ners and users often ask where to start and 
if there are any tools, corpora, and resources 
available. These excellent questions should 
be asked, but with the exception of a few well-
established resources, we refrain from pointing
to specific collections because the field is
extremely active and fast moving. The questions,
therefore, should be answered at the
time of need using a literature search, including
searching PubMed, a widely used and
growing collection of citations in biomedical
literature, and PubMed Central, a growing
collection of open access full text biomedical
articles.

Independently of the specific task, cor- 
pora and tools, any NLP endeavor starts with 
dealing with raw data, which entails dealing 
with file formats, character sets and machine 
settings. Bird et al. (2019) discuss the practical 
considerations of working with unstructured 
text in Python. 
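Dealing with character sets typically means decoding bytes defensively and normalizing the result before any tokenization. The sketch below assumes UTF-8 with a Windows-1252 fallback, which is a common but by no means universal situation; the function name `decode_text` and the fallback order are illustrative choices.

```python
import unicodedata


def decode_text(raw, encodings=("utf-8", "cp1252")):
    """Decode bytes, normalize Unicode to NFC, and unify line endings."""
    for encoding in encodings:
        try:
            text = raw.decode(encoding)
            break
        except UnicodeDecodeError:
            continue
    else:
        # No candidate encoding worked: keep what can be read and mark the rest.
        text = raw.decode("utf-8", errors="replace")
    text = unicodedata.normalize("NFC", text)
    return text.replace("\r\n", "\n").replace("\r", "\n")


print(decode_text("Rückenschmerzen\r\n".encode("utf-8")))
print(decode_text("Rückenschmerzen".encode("cp1252")))
```

Normalizing to NFC ensures that a character such as ü has one code point sequence, so that later dictionary lookups do not fail on visually identical strings.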

Specific biomedical software toolkits for 
many tasks, such as named entity recogni- 
tion, introduced in » Sect. 8.2.1, or modal- 
ity detection, are freely available and widely 
used. Many of the existing approaches are
built using open-domain tools, such as
NLTK (Bird et al.), OpenNLP (openNLP)
and UIMA (UIMA). Many solutions to specific
problems build pipelines that combine
existing tools, e.g. MetaMap for NER, with
local implementations of task-specific
algorithms.


8.5.1 Patient Privacy and Ethical 
Concerns 


As an NLP system deals with patient infor- 
mation, its designers must remain cognizant 
of the privacy and ethical concerns entailed 
in handling protected health information. In 
the clinical domain for instance, the Health 
Insurance Portability and Accountability Act 
(HIPAA) regulates the protection of patient- 
sensitive information (see » Chap. 12 for a
detailed description of privacy matters in 
the clinical domain). Online, patients provide 
much information about their own health in 
blogs and online communities. While there 
are no regulations in place concerning online 
patient-provided information, researchers 
have established guidelines for the ethical 
study and processing of patient-generated 
speech (Eysenbach and Till 2001). 

The somewhat opposing needs for large
amounts of data for NLP and for
protecting patients’ privacy led to the development
of de-identification and anonymization
tools (Meystre et al. 2010). Many
researchers are exploring transfer learning
(Ruder 2019), where the tools are trained on
openly available data, e.g., general-domain
or veterinary text, or synthetically generated
life-like data, and then fine-tuned on
small amounts of task- and domain-specific
data (Wu et al. 2019).
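To give a sense of what the simplest de-identification step looks like, the sketch below masks two kinds of identifiers with regular expressions. It is a toy under stated assumptions: the two patterns in `PATTERNS` are hypothetical, cover only US-style numeric dates and phone numbers, and fall far short of what is required to de-identify real clinical text.

```python
import re

# Hypothetical patterns; real tools cover many more identifier types
# (names, addresses, record numbers) and use context as well as patterns.
PATTERNS = [
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    ("PHONE", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),
]


def scrub(text):
    """Replace each matched identifier with a bracketed category label."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


print(scrub("Seen on 03/14/2019, call 555-123-4567."))
```

Note that replacing a string with a label of a different length changes line lengths and character offsets, which matters for any downstream step that relies on them.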


8.5.2 Good System Performance 


If the output of an NLP system is to be 
used to help manage and improve the qual- 
ity of healthcare and to facilitate research, 
it must have high enough performance for 
the intended application. Evidently, different 
applications require varying levels of perfor- 
mance, and the desired level of performance 
needs to be discussed with the intended users 
of the system. While discussing a system’s 
performance, it is important to make sure the 
users understand the benefits and the limita- 
tions of the approach and have reasonable 
expectations with respect to the results, probability
and nature of errors, and potential
requirements for curation. An example appli- 
cation is finding and ranking patients with 
respect to eligibility for a cohort. Depending 
on the nature of the cohort, the users might 
want to see only the patients for which infor- 
mation was extracted with high accuracy, 
e.g., for a retrospective study on a large data- 
set, or all patients that might fit the criteria, 
e.g., identifying patients at risk for disease 
exacerbation. These pragmatic questions are 
external to NLP itself, but still need
to be answered to optimize the models and 
approaches as needed. 
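The two cohort scenarios above differ mainly in where the decision threshold is set. The sketch below assumes that the NLP system outputs an eligibility score between 0 and 1 for each patient; the scores, patient identifiers, and thresholds are made up for illustration.

```python
def select_cohort(scored_patients, threshold):
    """Return patient ids with score >= threshold, highest score first."""
    ranked = sorted(scored_patients, key=lambda item: item[1], reverse=True)
    return [patient_id for patient_id, score in ranked if score >= threshold]


# Hypothetical (patient id, eligibility score) pairs.
scores = [("p1", 0.95), ("p2", 0.40), ("p3", 0.75), ("p4", 0.10)]

# Retrospective study: show only patients extracted with high confidence.
high_precision = select_cohort(scores, threshold=0.9)

# Risk screening: show every patient who might fit the criteria.
high_recall = select_cohort(scores, threshold=0.3)

print(high_precision, high_recall)
```

Which threshold is appropriate is exactly the kind of question that has to be settled with the intended users rather than inside the NLP component.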


8.5.3 System Interoperability 


NLP-based systems are often part of larger 
applications. There must be seamless integra- 
tion of the NLP component into its parent 
application. This is equally important in the 
clinical domain, where the system must follow 
standards for interoperability among differ- 
ent health information technology systems, 
such as Health Level 7 (HL7) and the Clinical 
Document Architecture (CDA; see » Chap.
7), and in processing biomedical literature and 
social media, where the system should be able 
to communicate with the downstream appli- 
cations. For example, information extracted 
automatically to support database curation, 
e.g., model organism databases, should be 
provided to curators within their workflow, 
along with easy access to the context that
suggested these terms. 


8.6 Research Considerations 

Biomedical natural language processing has a 
wide range of practical applications. It facili- 
tates clinical and biomedical research, qual- 
ity assurance of clinical care and delivery of 
information to patients. This wide range of
tasks stimulates ongoing and constantly
growing research into the foundational principles
of biomedical language processing. Due to
the growing interest in literature-based discovery,
management of big data, secondary
use of clinical data, clinical decision support, 
and population studies through social media, 
the demand for biomedical language process- 
ing has increased significantly and will con- 
tinue growing. 

Being primarily motivated by the needs 
of the domain, biomedical and clinical NLP
has always been data driven. All NLP research
starts with exploratory data analysis that 
takes into account the context in which the 
text was created and the context in which it 
is used. Some of the text always needs to be 
annotated to create gold standards for evalu- 
ation, and, depending on the approach that 
is researched, e.g. supervised machine learn- 
ing, large amounts of annotated text might 
be needed to train the models. Annotation 
takes time, effort and money, so leveraging the 
existing annotated collections and approaches 
that allow adding minimal amounts of task- 
specific annotations to improve the results is 
growing in popularity. 

Before an NLP-based system can be used 
for a practical task, it must be evaluated care- 
fully, both intrinsically and extrinsically, in 
a setting where the system will be used. A 
variety of techniques exists for the evalua- 
tion and testing of natural language process- 
ing programs. They vary with respect to cost, 
repeatability, and the kind of information that 
is obtainable from them. In this section, we 
first discuss data annotation and annotation 
guidelines, and then present evaluation prin- 
ciples and approaches. 


8.6.1 Data Annotation 


During the initial task definition and data 
exploration for gold standard construction, 
an annotation schema is established to cap- 
ture the minimal amount of annotations suf- 
ficient to perform the task. At the same time, 
annotation guidelines are created to describe 
the task, the schema and the annotation rules. 
There are ongoing community efforts, starting 
with The Canon group (Evans et al. 1994), to 
create established representations for certain 
aspects of information, such as the different
modifiers of concepts and the relations among 
concepts that can occur in texts. Although in 
most cases the specific datasets are still using 
local representations, the de facto standard is
standoff annotation that provides offsets of 
the strings and meta-description to represent 
concepts. More variety exists in representing 
relations and events. 
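Standoff annotation keeps the source text untouched and stores each annotation separately as character offsets plus a description. The sketch below shows the idea with plain dictionaries; the field names, the identifiers `T1` and `T2`, and the example sentence are assumptions made for illustration, not a specific published format.

```python
text = "Patient developed rash after amoxicillin."


def make_annotation(annotation_id, label, surface, source):
    """Create a standoff annotation for the first occurrence of `surface`."""
    start = source.find(surface)
    if start < 0:
        raise ValueError(f"{surface!r} not found in text")
    return {"id": annotation_id, "type": label,
            "start": start, "end": start + len(surface)}


def covered_text(annotation, source):
    """Recover the annotated string from the offsets alone."""
    return source[annotation["start"]:annotation["end"]]


annotations = [
    make_annotation("T1", "AdverseReaction", "rash", text),
    make_annotation("T2", "Drug", "amoxicillin", text),
]

for annotation in annotations:
    print(annotation, covered_text(annotation, text))
```

Because the offsets point into the original text, several annotation layers, including overlapping ones, can coexist without modifying the document.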

Once the initial schema is established, a 
small number of documents is annotated by 
a group of annotators to determine if the 
schema allows annotating all required enti- 
ties and relations and if the guidelines are 
clear. The next step is to finalize and freeze 
the guidelines, and then annotate the required 
number of documents, ensuring some overlap 
to measure inter-annotator agreement. The 
guidelines can be modified during annota- 
tion for one purpose only: to add information 
about a new case that was not covered by the 
existing rules. Figure 8.8 illustrates the process
of creating a specific corpus of drug labels
annotated with Adverse Drug Reactions and 
all steps of this annotation effort (Demner- 
Fushman et al. 2018). 
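The text does not name a particular agreement measure; one widely used choice for two annotators assigning categorical labels is Cohen's kappa, which corrects observed agreement for agreement expected by chance. The sketch below is a direct implementation of that formula, and the label sequences are invented for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label]
                   for label in set(labels_a) | set(labels_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical token-level labels on the overlapping documents.
annotator_1 = ["ADR", "ADR", "O", "O"]
annotator_2 = ["ADR", "O", "O", "O"]
print(cohen_kappa(annotator_1, annotator_2))
```

When annotators mark free-text spans rather than label fixed items, chance agreement is hard to define, and agreement is often reported with pairwise F-measure instead.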


8.6.2 Evaluation 


Evaluating the performance of an NLP sys- 
tem is crucial whether the NLP system targets 
the end-users directly or as a part of a larger 
application. In the biomedical NLP domain, 
evaluation brings together two traditions: 
evaluation in biology and clinical research 
and evaluation of software, both with respect 
to its output and usability in the eyes of the 
intended end-users. 

Biomedical and clinical researchers expect 
health technology assessment to include 
“properties of a medical technology used in 
health care, such as safety, efficacy, feasibil- 
ity, and indications for use, cost, and cost 
effectiveness, as well as social, economic, 
and ethical consequences, whether intended 
or unintended” (IOM 1985). Measuring the 
social, economic, and ethical consequences of 
NLP systems in the biomedical domain has not
been systematically researched. Several studies 
looked into social consequences of delivering 
information to clinicians. For example, over 
500 clinicians interviewed by Lindberg et al. 
used MEDLINE searches to choose the most 
appropriate test, make the diagnosis, develop
and implement a treatment plan, maintain 
an effective physician-patient relationship, 
and modify patients’ health behaviors. In 8 
cases, MEDLINE was credited with saving a 
patient’s life, and in another 17 with increasing 
the length of life (Lindberg et al. 1993b). 

The efficacy, feasibility and cost of NLP 
systems and tools, on the other hand, are rela- 
tively easy to measure and these evaluations 
follow the principles of the software evaluation 
tradition. The metrics described below were 
developed to evaluate the software perfor- 
mance using sets of benchmarks independent 
of the tasks for which the tools might be used. 
These intrinsic evaluations measure changes in 
the system’s output caused by changes in the 
system’s parameters, as well as the differences 
between systems that implement different 
algorithms. For example, we can compare dif- 
ferent parsers against an established reference 
standard, such as the Penn Treebank (Taylor 
et al. 2003). Alternatively, in an extrinsic eval- 
uation that measures a method’s performance 
in a given task, we can ask what parser will 
improve the overall performance in a rela- 
tion extraction task. Most of the large-scale 
evaluations (shared tasks) provide venues and 
generate collections that allow evaluating sys- 
tems’ performance in a specific task, e.g., the 
adverse drug reaction collection in Fig. 8.8
was used in a Text Analysis Conference evalu- 
ation (Roberts et al. 2017). 


8.6.2.1 Evaluation Metrics 


The most commonly used metrics for evalu- 
ations conducted by computational linguists 
are precision, recall, and F-measure. The clin- 
ical informatics community prefers referring 
to recall as sensitivity, and pairs it with speci- 
ficity and the area under the ROC (Receiver 
Operating Characteristic) curve, if the task 
allows applying these metrics. The above met- 
rics are based on the confusion matrix (or 
error matrix) and are often defined in terms 
of the four cells of the matrix: 
- true positive (TP): outputs correctly
  labeled as having the characteristics of
  interest, for example, tagging a string that
  is a gene name as gene name
- false positive (FP): outputs incorrectly
  labeled as having the characteristics of
  interest, for example, tagging a string that
  is not a gene name as gene name
- true negative (TN): outputs correctly
  labeled as not having the characteristics of
  interest, for example, leaving untagged a
  string that is not a gene name
- false negative (FN): outputs incorrectly
  labeled as not having the characteristics of
  interest, for example, failing to tag a string
  that is a gene name as gene name


In many NLP tasks, using metrics based on the 
true negative values is problematic because the 
number of true negatives is not countable. Even 
if we arbitrarily specify what constitutes a true 
negative for this task, the annotation effort for 
the reference set will become even more daunt- 
ing and expensive than the efforts described 
above, and our solution will not solve the problem
in general. Another reason to avoid measures
based on true negatives is the prevalence
of negative results: for example, gene names
will constitute a small percentage of an article,
even in an article describing major pathways.
The three basic quantitative measures used 
to assess performance in an extrinsic or intrin- 
sic evaluation are calculated as follows: Recall 
is the percentage of results that should have 
been obtained according to the test set that 
actually were obtained by the system: 


Recall = Number of correct results obtained 
by system (TP) / Number of results 
specified in gold standard (TP + FN) 


Precision is the percent of results that the 
system obtained that were actually correct 
according to the test set: 


Precision = Number of correct results
obtained by system (TP) / Total number of
results obtained by system (TP + FP)


There is usually a tradeoff between recall 
and precision, with higher precision usually 
being attainable at the expense of recall, and 
vice versa. The F-measure is a combination
of both measures and can be used to weigh 
the importance of one measure over the other 
by giving more weight to one. If both measures
are equally important, the F-measure
is the harmonic mean of the two measures.
When reporting the results, an error analysis 
provides insights into ways to improve a sys- 
tem. This process involves determining rea- 
sons for errors in recall and in precision. In 
an extrinsic evaluation, some errors can be 
due to the NLP system and other errors can 
be due to the subsequent application com- 
ponent. Some NLP errors in recall (i.e. false 
negatives) can be due to failure of the NLP 
system to tokenize the text correctly, to rec- 
ognize a word, to detect a relevant pattern, or 
to interpret the meaning of a word or a struc- 
ture correctly. Some errors in precision can be 
due to errors in interpreting the meaning of 
a word or structure or to loss of important 
information. Errors caused by the application 
component can be due to failure to access the 
extracted information properly or failure of 
the reasoning component. 
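
The measures above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original chapter: the function names and the toy gene lists are hypothetical, and the weighted F-measure uses the commonly cited F-beta form, which the text describes only in words (beta = 1 gives the harmonic mean of precision and recall; beta > 1 weights recall more heavily).

```python
def precision_recall_f(tp, fp, fn, beta=1.0):
    """Return (precision, recall, F-beta) from raw counts.

    tp, fp, fn are the numbers of true positives, false positives,
    and false negatives. True negatives are deliberately not used.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    # Commonly used weighted F-measure (an assumption; the text gives no formula).
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta


def evaluate(system_results, gold_standard, beta=1.0):
    """Compare the set of system results with the gold-standard set."""
    system_results = set(system_results)
    gold_standard = set(gold_standard)
    tp = len(system_results & gold_standard)   # correct results obtained
    fp = len(system_results - gold_standard)   # obtained but not in gold standard
    fn = len(gold_standard - system_results)   # in gold standard but missed
    return precision_recall_f(tp, fp, fn, beta)


if __name__ == "__main__":
    # Hypothetical gene mentions, for illustration only.
    gold = {"BRCA1", "TP53", "EGFR", "KRAS"}
    system = {"BRCA1", "TP53", "MYC"}
    p, r, f1 = evaluate(system, gold)
    print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Listing the false positives (`system - gold`) and false negatives (`gold - system`) explicitly is a natural starting point for the error analysis described above.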

Understanding the errors is the first step in 
bringing the NLP applications closer to being 
incorporated in a wider range of biomedical 
and clinical text processing tasks. 
Directions 


This chapter targets both the students and 
researchers looking for a broad introduction 
to health NLP prior to delving into this active 
field of research, and the informatics practi- 
tioners looking to use NLP for specific tasks 
or types of text. The chapter introduced NLP 
applications and emphasized the critical role 
that the context in which these applications 
are deployed plays when developing NLP 
solutions. It presented the basic computa- 
tional tasks involved in most NLP applica- 
tions and the different linguistic knowledge 
resources and types of linguistic representa- 
tions that can enable and facilitate these basic 
NLP tasks. The chapter listed the practical
considerations for users of NLP technology
and the research considerations for moving
the field forward.

Although NLP continues to advance 
towards practical applications and more 
NLP methods are used in large-scale real-life 
health information applications, more needs 
to be done to make NLP use in biomedical 
and clinical applications a routine widespread 
reality. Some of the applications described
in this chapter are already used in practice;
e.g., named entity recognition and text label- 
ing are used to support MEDLINE index- 
ers. Some research approaches are already 
outperforming humans on research datasets 
that approximate real-life tasks, for example, 
on reading comprehension tests (SQuAD).
This does not mean, however, that NLP in 
general and health NLP are solved. In addi- 
tion to improvements in the existing applica- 
tions, new areas are emerging, and some of 
the well-known impediments still need to 
be addressed. The impediments include
data access challenges, which are partially
addressed by synthetic data and transfer
learning, and the lack of interoperability and
standards, particularly in the evaluation of
tools included in clinical workflows. The
emerging areas and areas of active research
include, but are not limited to, multi-modal
data integration, interpretability of machine
learning results, understanding bias in machine
learning models, and distributed large-scale
computational models.


Suggested Reading

NLP is a very active field of research in the open 
domain. Many of the applications and tech- 
niques described in this chapter are investi- 
gated in other domains. For a review of NLP 
methods in the general domain, we refer the 
reader to the following textbooks: 

Jurafsky, D., & Martin, J. H. (2019). Speech
and language processing. An introduction to
natural language processing, computational
linguistics and speech recognition. Upper
Saddle River: Prentice Hall. See a draft of
the 3rd edition at
https://web.stanford.edu/~jurafsky/slp3/.

Manning, C., & Schütze, H. (1999). 
Foundations of statistical natural language 
processing. Cambridge, MA: MIT Press. 
This chapter provides a comprehensive and concise
overview of health NLP. For additional
details and examples see:

Cohen, K. B., & Demner-Fushman, D. (2014). 
Biomedical natural language processing. 
Amsterdam: John Benjamins Publishing. 


Questions for Discussion


1. Develop a regular expression to
regularize the tokens in lines four to
nine of the following cardiac
catheterization report (Complications
through Heart Rate):


Procedures performed: Right Heart 
Catheterization Pericardiocentesis 
Complications: None 

Medications given during procedure: 
None 

Hemodynamic data 

Height (cm): 180 

Weight (kg): 74.0 

Body surface area (sq. m): 1.93 

Heart rate: 102 

Pressure (mmHg)

      Sys  Dias  Mean  Sat

RA    14   13    8

RV    36   9     12

PA    44   23    33    62%

PCW   25   30    21

Hemoglobin (g/dL):

Conclusions: Post Operative Cardiac 
Transplant Abnormal Hemodynamics 
Pericardial Effusion 

Successful Pericardiocentesis 

General Comments: 

1600 cc of serosanguinous fluid were 
drained from the pericardial sac with 
improvement in hemodynamics. 


2. Create a lexicon for the last seven lines
of the cardiac catheterization report
above (Conclusions through the last
sentence). For each word, determine
all the parts of speech that apply.
Which words have more than one part
of speech? Choose eight clinically
relevant words in that section of the
report, and suggest appropriate
semantic categories for them that
would be consistent with the
SNOMED-CT terminology and with
the UMLS semantic network.

3. Draw a parse tree for the last sentence
of the cardiac catheterization report above.

4. Draw parse trees for the following
sentences: no increase in temperature; low
grade fever; marked improvement in
pain; not breathing. (Hint: some
lexemes have more than one word.)

5. Identify all the referential expressions
in the text below and determine the
correct referent for each. Assume that the
computer attempts to identify referents
by finding the most recent noun phrase.
How well does this resolution rule
work? Suggest a more effective rule.


The patient went to receive the AV 
fistula on December 4. However, he 
refuses transfusion. In the operating 
room it was determined upon initial 
incision that there was too much 
edema to successfully complete the 
operation and the incision was closed 
with staples. It was well tolerated by 
the patient. 


6. In the two following scenarios, an
off-the-shelf NLP system that identifies
terms and normalizes them against 
UMLS concepts is applied to a large 
corpus of texts. In the first scenario, the 
corpus consists of patient notes. 
Looking at the frequency of different 
concepts, you notice that there is a large 
number of patients with the concept 
C0019682 (HIV) present, much larger 
than the regular incidence of HIV in 
the population reported in the litera- 
ture. In the second scenario, the corpus 
consists of full-text biology articles. 
Looking at the frequency of different 
concepts, you notice that the failed 
axon connection (fax) gene is one of 
the most frequently mentioned genes in 
your corpus. Describe how you would 
check the validity of these results. For 
both cases, discuss what can explain the 
high frequency counts. 
7. The following is an excerpt from a
de-identified clinical discharge summary
(as shown in Uzuner et al. (2008)).


HISTORY OF PRESENT ILLNESS: 
The patient is a 77-year-old woman 
with long standing hypertension who 
presented as a Walk-in to me at the 
[REMOVED] Health Center on 
[REMOVED]. Recently had been 
started q.o.d. on Clonidine since 
[REMOVED] to taper off of the drug. 
Was told to start Zestril 20 mg. q.d. 
again. The patient was sent to the 
[REMOVED] Unit for direct admission 
for cardioversion and anticoagulation, 
with the Cardiologist, Dr. [REMOVED] 
to follow. SOCIAL HISTORY: Lives 
alone, has one daughter living in 
[REMOVED]. Is a non-smoker, and 
does not drink alcohol. HOSPITAL 
COURSE AND TREATMENT: 
During admission, the patient was seen 
by Cardiology, Dr. [REMOVED], was 
started on IV Heparin, Sotalol 40 mg 
PO b.i.d. increased to 80 mg b.i.d., and 
had an echocardiogram. By 
[REMOVED] the patient had better 
rate control and blood pressure control 
but remained in atrial fibrillation. On 
[REMOVED], the patient was felt to be 
medically stable. 


(a) Annotate all elliptical 
constructions and anaphoric 
references. 


(b) Develop an algorithm to identify 
section headings. 


8. The following is the abstract of the
article entitled “Tissue-specific distributions
of alternatively spliced human
PECAM-1 isoforms” by Wang et al. (as
cited by Agarwal and Yu (2009)).
Annotate each sentence according to the 
four categories: Introduction, Methods, 
Results, and Discussion. 


PECAM-1 plays an important role in
endothelial cell-cell and cell-matrix
interactions, which are essential during
vasculogenesis and/or angiogenesis.
Here, we examined expression of
PECAM-1 mRNA in vascular beds of
various human tissues and compared it
with expression of PECAM-1 in
human endothelial and hematopoietic
cells. A short exposure of the blot
probed with GAPDH is shown,
because poly(A)+ RNA from the cell
lines gives a strong signal within several
hours compared with the total RNA
from human tissue. Therefore, total
RNA from various tissues required a
much longer exposure to reveal
GAPDH mRNA. Human tissue and
cell lines expressed multiple RNA
bands for PECAM-1, which may
represent alternatively spliced
PECAM-1 isoforms, the identity of
which required further analysis.


9. Develop a regular expression that is
capable of differentiating in-text
parenthetical citations of the form “(Author,
Year)” from other parentheticals.

10. Manually or programmatically, repeat
Swansonian literature-based
discovery (Swanson 1986; see some
implementation details in Ganiz
et al. 2005):


(a) Pick a topic of interest (Raynaud’s 
Disease) 

(b) Search to find 
C = {Raynaud’s} 

(c) Guess that B (e.g., blood factors) 
should be studied in relation to 
Raynaud’s 

(d) Search literature C1 = C ∩ blood

(e) Notice two common descriptors:
blood viscosity, red blood cell
rigidity

(f) Search literature A = {blood viscosity}
∪ {red blood cell rigidity}

(g) Notice the term “Fish Oil”

(h) Search literature A = {Fish Oil}

(i) Show {Fish Oil} ∩ {Raynaud’s} = Ø

(j) Show plausible connection
between Raynaud’s and Fish Oil
Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• Why are sequence, structure, and biological
  pathway information relevant to
  medicine?

• Where on the Internet should you look
  for a DNA sequence, a protein sequence,
  or a protein structure?

• What are two problems encountered in
  analyzing biological sequence, structure,
  and function?

• How has the age of genomics changed
  the landscape of bioinformatics?

• What are two computational challenges
  in bioinformatics for the future?


9.1 The Problem of Handling 
Biological Information 


Bioinformatics is the study of how informa- 
tion is represented and analyzed in biological 
systems, especially information derived at the 
molecular level. Whereas clinical informatics 
deals with the management of information 
related to the delivery of health care, bio- 
informatics focuses on the management of 
information related to the underlying basic 
biological sciences. As such, the two disciplines 
are closely related—more so than generally 
appreciated (see > Chap. 1). Bioinformatics 
and clinical informatics share a concentration 
on systems that are inherently uncertain, diffi- 
cult to measure, and the result of complicated 
interactions among multiple complex com- 
ponents. Both deal with living systems that 
generally lack straight edges and right angles. 
Although reductionist approaches to studying 
these systems can provide valuable lessons, it 
is often necessary to analyze those systems 
using integrative models that are not based 
solely on first principles. Nonetheless, the two 
disciplines approach the patient from oppo- 
site directions. Whereas applications within 
clinical informatics usually are concerned with 
the social systems of medicine, the cognitive 
processes of medicine, and the technologies 
required to understand human physiology, 
bioinformatics is concerned with understanding
how basic biological systems conspire
to create molecules, organelles, living cells, 
organs, and entire organisms. Remarkably, 
however, the two disciplines share significant 
methodological elements, so an understand- 
ing of the issues in bioinformatics can be valu- 
able for the student of clinical informatics and 
vice versa. 

The discipline of bioinformatics continues 
to be in a period of rapid growth, because the
needs for information storage, retrieval, and 
analysis in biology—particularly in molecular 
biology and genomics—have increased dra- 
matically over the past two decades. History 
has shown that scientific developments within 
the basic sciences tend to have a delayed effect 
on clinical care and there is typically a lag of a 
decade before the influence of basic research 
on clinical medicine is realized. The impact
that genomics and bioinformatic approaches
are having in the clinic and at the point of
care cannot be overstated. Indeed, chapters
focusing on “Translational Bioinformatics” 
and “Precision Medicine and Informatics” 
(> Chaps. 28 and 30) describe how these foun- 
dational advances are leading toward impacts 
on human health and improved approaches to 
clinical care. 


9.1.1 Many Sources 
of Biological Data 


There are many sources of information that 
are revolutionizing our understanding of 
human biology and that are creating signifi- 
cant challenges for computational processing. 
New technologies are enabling the miniaturization
of laboratory experiments, increased
automation of experiments, and, through
advanced computer processing, the rapid
interpretation of data. These technologies
are producing data at a staggering rate.
The data produced offer different
views into the Central Dogma of Biology, the
metabolome, the metagenome, and ancillary
molecular processes.

The most dominant new type of informa- 
tion is the sequence information produced by 
genetic studies. This was enabled by the Human 
Genome Project, an international undertaking
intended to determine the complete sequence
of human DNA as it is encoded in each of
the 23 human chromosomes. The first draft of
the sequence was published in 2001 (Lander
et al. 2001) and a final version was announced
in 2003, coincident with the 50th anniversary
of Watson and Crick’s solution of the structure
of the DNA double helix. The sequence
continues to be revised and refined, and
the genomes of many different individuals
have now been sequenced. Initially,
the 1000 Genomes Consortium provided
>1000 genomes of healthy individuals (1000
Genomes Consortium, 2010), and now datasets
exist with >100,000 genomes of individuals
with a variety of conditions.¹ Essentially,
the entire set of genetically driven events 
from conception through embryonic develop- 
ment, childhood, adulthood, and aging are 
encoded by the DNA blueprints within most 
human cells. Given a complete knowledge of 
these DNA sequences, we are in a position to 
understand these processes at a fundamen- 
tal level and to consider the possible use of 
DNA sequences for diagnosing and treating 
disease. This has led to the application of bio- 
informatics (and other foundational domains) 
as Translational Bioinformatics and Precision 
Medicine Informatics (> Chaps. 28 and 30). 

Additionally, large-scale experimental 
methodologies are used to collect data on 
thousands or millions or more molecules 
simultaneously. Scientists apply these metho- 
dologies longitudinally over time and across a 
wide variety of organisms or within an organ- 
ism to observe the development of various 
physiological phenomena. Technologies give 
us the ability to follow the production and 
degradation of molecules, such as the expres- 
sion (transcription) of large numbers of genes 
simultaneously, the presence of proteins or 
metabolites in a biosample, or the populations 
of microorganisms in a sample. 

The first high throughput experiments 
measured the expression of genes on gene 
expression microarrays (Lashkari et al. 1997). 


1 > https://www.nhlbiwgs.org/ (accessed December 1, 
2018). 


This enabled the study of the expression of
large numbers of genes in relation to one
another (Bai and Elledge 1997) and of multiple
variations on a genome, in order to explore the
implications of changes in genome function for
human disease. This work has led to the field of
genomics, the study of the molecular state of a cell,
tissue, or organism through the state and activity
of its genome. With technology advancements,
gene expression can now be measured
by directly sequencing messenger RNA molecules
in a cell and counting the number of
copies of each RNA molecule that are observed.
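
The counting step described above can be illustrated with a toy Python sketch. It assumes that each sequenced read has already been assigned to a gene by some upstream alignment step, which is not shown; the function names, the gene labels, and the counts-per-million normalization are illustrative assumptions rather than anything specified in the text.

```python
from collections import Counter


def count_reads_per_gene(read_assignments):
    """Count how many sequenced reads were assigned to each gene.

    read_assignments is an iterable of gene names, one per read;
    None marks a read that could not be assigned to any gene.
    """
    return Counter(gene for gene in read_assignments if gene is not None)


def counts_per_million(counts):
    """Scale raw counts so that samples of different depth are comparable."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical read-to-gene assignments, for illustration only.
    reads = ["GAPDH", "GAPDH", "PECAM1", None, "GAPDH"]
    raw = count_reads_per_gene(reads)
    print(raw)
    print(counts_per_million(raw))
```

Real expression pipelines add steps such as alignment, handling of ambiguous reads, and correction for gene length; the sketch shows only the idea of measuring expression by counting.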

While some scientists are studying the 
human genome, other researchers are study- 
ing the functions of the genomes of numerous 
other biological organisms, including impor- 
tant model organisms (such as mouse, rat, 
fruit fly and yeast) as well as important human 
pathogens (such as Mycobacterium tuberculo- 
sis or Haemophilus influenzae). The genomes 
of these organisms have been determined, and 
efforts are underway to characterize them. 
These allow two important types of analysis: 
the analysis of mechanisms of pathogenicity 
and the analysis of animal models for human 
disease. In both cases, the functions encoded 
by genomes can be studied, classified, and 
categorized, allowing us to decipher how 
genomes affect human health and disease. 

These ambitious scientific projects are not 
only proceeding at a furious pace, but also 
are often accompanied by another approach 
to biology, which produces another source 
of biomedical information: proteomics, the 
study of the protein gene products of the 
genome—the proteome. Proteomics enables 
researchers to discover the state (quantity 
and configuration) of proteins within an 
organism. These protein states can be corre- 
lated with different physiological conditions, 
including disease states. Some of these protein 
states can be used as identifying markers of 
human disease. Similar approaches are being
applied to understanding the diversity,
concentration levels, and functions of molecules
other than DNA, RNA, or protein, such as
metabolites, through the study of the small
molecules in the metabolome.

Using these technologies together, we can 
now study the epigenome, the non-genetic 
effects that influence genome function. These
include chemical modifications that directly alter
the structure of DNA but not its sequence (such as
DNA methylation) and proteins that bind to
DNA and affect how that DNA expresses
genes. Epigenomics gives us a more complete
picture of how biology functions and what its 
implications are for human health. 

All these technologies, along with the 
genome-sequencing projects, are conspiring 
to produce a volume of biological informa- 
tion that at once contains secrets to age-old 
questions about health and disease and threat- 
ens to overwhelm our current capabilities of 
data analysis. Thus, bioinformatics is becom- 
ing critical for medicine in the twenty-first 
century. 


9.1.2 Implications for Clinical 
Informatics 


The effects of this new biological information 
on clinical medicine and clinical informat- 
ics are still evolving. It is already clear, how- 
ever, that some major changes to medicine 
will have to be accommodated. These efforts
have emerged as important areas of biomedical
informatics that have become their
own domains, Translational Bioinformatics
(> Chap. 26) and Precision Medicine and
Informatics (> Chap. 28), and use of biotechnology
data is now common in Clinical
Research Informatics (> Chap. 27).

1. Genetic information in the medical record. 
With the first set of human genomes now 
available and prices for gene sequencing 
rapidly decreasing, it is now cost-effective 
to consider sequencing every patient 
genome or at least genotyping key sections 
of the genomes and integrating that with 
the medical record. 

2. New diagnostic and prognostic information 
sources. One of the main contributions of 
the genome-sequencing projects (and of the 
associated biological innovations) is that 
we are likely to have unprecedented access 
to new diagnostic and prognostic tools. 
Diagnostically, the genetic markers from a 
patient with an autoimmune disease, or of 
an infectious pathogen within a patient, will 
be highly specific and sensitive indicators
of the subtype of disease and of that subtype’s
probable responsiveness to different
therapeutic agents. Several genotype-based
databases have been developed to identify
markers that are associated with specific
phenotypes and identify how genotype affects a
patient’s response to therapeutics. ClinVar²
and The Human Gene Mutation Database
(HGMD)³ both annotate mutations with
disease phenotype. These resources have become
invaluable for genetic counselors, basic
researchers, and clinicians. Additionally,
the Pharmacogenomics Knowledge Base
(PharmGKB) collects genetic information
that is known to affect a patient’s response to
a drug (more on PharmGKB is described in
Translational Bioinformatics, > Chap. 26).⁴


3. Ethical considerations. One of the critical
questions facing the genome-sequencing
and other related projects is “Can genetic 
or other molecular information be mis- 
used?” The answer is certainly yes. With 
knowledge of a complete genome for an 
individual, it may be possible in the future 
to predict the types of disease for which 
that individual is at risk years before the 
disease actually develops. If this informa- 
tion fell into the hands of unscrupulous 
employers or insurance companies, the 
individual might be denied employment or 
coverage due to the likelihood of future dis- 
ease, however distant. There is even debate 
about whether such information should 
be released to a patient even if it could 
be kept confidential. Should a patient be 
informed that he or she is likely to get a 
disease for which there is no treatment? 
What about that patient’s relatives, who 
share genetic information with the patient? 
This is a matter of intense debate, and 
such questions have significant implica- 
tions for what information is collected and 
for how and to whom that information 


2 > https://www.ncbi.nlm.nih.gov/clinvar/ (accessed
November 1, 2018).

3 > http://www.hgmd.org/ (accessed November 1,
2018).

4 > http://www.pharmgkb.org/ (accessed November
1, 2018).
is disclosed (Durfy 1993). Passage of the
Genetic Information Nondiscrimination
Act in 2008 set initial federal guidelines on
use of genetic information.⁵ Additionally,
the Personal Genome Project (PGP) has
been working to define open consent models
for releasing genetic information.⁶
The Clinical Sequencing and Exploratory
Research Consortium (CSER) has been
tackling the difficult issues in translation
of genomic data to the clinic broadly.⁷


9.2 The Rise of Bioinformatics 

A brief review of the biological basis of medi- 
cine will bring into focus the magnitude of 
the revolution in molecular biology and the 
tasks that are created for the discipline of 
bioinformatics. The genetic material that we 
inherit from our parents, that we use for the 
structures and processes of life, and that we 
pass to our children is contained in a sequence 
of chemicals known as deoxyribonucleic acid 
(DNA).⁸ The total collection of DNA for a
single person or organism is referred to as 
the genome. DNA is a long polymer chemi- 
cal made of four basic subunits. The sequence 
in which these subunits occur in the poly- 
mer distinguishes one DNA molecule from 
another and directs a cell’s production of 
proteins and all other basic cellular processes. 
Genes are discrete units encoded in DNA
and they are transcribed into ribonucleic 
acid (RNA), which has a composition very 
similar to DNA. Genes are transcribed into 
messenger RNA (mRNA) and a majority of 
mRNA sequences are translated by complex 
macromolecular machines, called ribosomes, 
into protein. Not all RNAs are messengers 


5 > http://www.genome.gov/10002328 (accessed
November 1, 2018).

6 > http://www.personalgenomes.org/ (accessed
November 1, 2018).

7 > https://cser-consortium.org/ (accessed November
1, 2018).

8 If you are not familiar with the basic terminology of
molecular biology and genetics, reference to an
introductory textbook in the area would be helpful
before you read the rest of this chapter.


for the translation of proteins. Ribosomal 
RNA, for example, is used in the construction 
of the ribosome, the huge molecular engine 
that translates mRNA sequences into protein 
sequences. Additionally, mRNAs can be modified
through alternative splicing, degradation,
and formation of secondary structures that 
influence transcriptions. Once expressed, pro- 
teins are frequently modified (e.g. phosphory- 
lated), and these modifications can change the 
function of the protein. This process of DNA 
being transcribed to RNA and RNA being 
translated to protein is commonly referred to 
as the Central Dogma of Biology. 
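
The flow of information from DNA to RNA to protein can be sketched computationally. The Python below is a simplified illustration, not a model of the biology: it assumes the input is the coding (sense) strand, uses the standard genetic code, starts at the first AUG, and ignores splicing, modification, and regulation. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each RNA codon (U instead of T) to a one-letter amino acid; '*' is stop.
CODON_TABLE = {
    "".join(codon).replace("T", "U"): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (simplification: T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":   # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ccATGAAATGGtga"   # toy sequence, for illustration only
    mrna = transcribe(dna)
    print(mrna, translate(mrna))
```

The codon table is built from the conventional ordering of the standard code so that the sketch stays short; a library such as Biopython would normally be used in practice.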

Understanding the basic building blocks 
of life requires understanding the function of 
genomic sequences, genes, and proteins. When 
are genes expressed? Once genes are transcribed 
and translated into proteins, into what cellular 
compartment are the proteins directed? How 
do the proteins function once there? Do the 
proteins need to be modified in order for them 
to become active? How are the proteins turned 
off? Experimentation and bioinformatics have 
divided the research into several areas, and 
the largest are: (1) DNA and protein sequence
analysis, (2) macromolecular structure-function
analysis, (3) gene expression analysis, (4)
proteomics, (5) metabolomics, (6) metagenomics,
and (7) systems biology.


9.2.1 Roots of Modern
Bioinformatics


Practitioners of bioinformatics have come 
from many backgrounds, including medicine, 
molecular biology, chemistry, physics, statis- 
tics, mathematics, engineering, and computer 
science. It is difficult to define precisely the 
ways in which this discipline emerged. There 
are, however, two main developments that have 
created opportunities for the use of informa- 
tion technologies in biology. The first is the 
progress in our understanding of how biologi- 
cal molecules are constructed and how they 
perform their functions. This dates back as far 
as the 1930s with the invention of electropho- 
resis, and then in the 1950s with the elucidation 
of the structure of DNA and the subsequent 
sequence of discoveries in the relationships 
among DNA, RNA, and protein structure. 
The second development has been the paral- 
lel increase in the availability of computing 
power. Starting with mainframe computer 
applications in the 1950s and moving to mod- 
ern workstations, and ‘the Cloud’, there have 
been hosts of biological problems addressed 
with computational methods. 


9.2.2 The Genomics Explosion 


The benefit of the human genome sequence 
to medicine is both in the short and in the 
long term. The short-term benefits lie prin- 
cipally in diagnosis; the availability of 
sequences of normal and variant human 
genes will allow for the rapid identification 
of these genes in any patient (e.g., Babior 
and Matzner 1997). The long-term benefits 
will include a greater understanding of the 
proteins produced from the genome: how the 
proteins interact with drugs; how they mal- 
function in disease states; and how they par- 
ticipate in the control of development, aging, 
and responses to disease. 

The effects of genomics on biology and 
medicine cannot be overstated. We now have 
the ability to measure the activity and func- 
tion of genes within living cells. Genomics 
data and experiments have changed the way 
biologists think about questions fundamen- 
tal to life. Whereas in the past, reductionist 
experiments probed the detailed workings of 
specific genes, we can now assemble those data 
together to build an accurate understanding 
of how cells work. 


9.3 Biology Is Now Data-Driven 

Nearly 30 years ago, the use of computers 
was proving to be useful to the laboratory 
researcher. Today, computers are an essential 
component of modern research. This has led 
to a change in thinking about the role of com- 
puters in biology. Before, they were optional 
tools that could help provide insight to expe- 
rienced and dedicated enthusiasts. Today, 
they are required by most investigators, and 
experimental approaches rely on them as 
critical elements. This is because advances in 
research methods such as genetic sequenc- 
ing, experimental robotics and microfluid- 
ics, X-ray crystallography, nuclear magnetic 
resonance spectroscopy, cryoelectron micros- 
copy, proteomic mass spectrometry and other 
high throughput experiments have resulted in 
experiments that generate massive amounts 
of data. These data pose new problems for 
basic researchers on how the data are properly 
stored, analyzed, and disseminated. 

The volume of data being produced by 
genomics projects is staggering. There are now 
more than 211 million sequences in GenBank 
comprising more than 285 billion bases. Since
2008, the decline in sequencing costs has
outpaced Moore’s law (see
> Chap. 1).⁹ But these data do not stop with
sequence data: PubMed contains over 28 
million literature citations, the Protein Data 
Bank (PDB) contains three-dimensional 
structural data for over 45,538 distinct protein 
structures, and the Gene Expression Omnibus 
(GEO) contains over 2.8 million arrayed sam- 
ples. These data are of incredible importance 
to biology, and in the following sections we 
introduce and summarize the importance of 
sequences, structures, gene expression experi- 
ments, systems biology, and their computa- 
tional components to medicine. 


9.3.1 Sequences in Biology 


Sequence information (including DNA 
sequences, RNA sequences, and protein 
sequences) is critical in biology: DNA, RNA, 
and protein can be represented as a set of 
sequences of basic building blocks (bases for 
DNA and RNA, amino acids for proteins). 
Computer systems within bioinformatics thus 
must be able to handle biological sequence 
information effectively and efficiently. To 
that end, the bioinformatics community has 
developed central databases to store sequence 
information, data models to represent that 
information and software analysis tools to pro- 
cess sequence data. 
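
As a small illustration of what a data model and an analysis tool for sequences might look like, the Python sketch below parses records in the widely used FASTA text format and computes GC content. The chapter does not name FASTA at this point, so the choice of format is an assumption, as are the class and function names.

```python
from dataclasses import dataclass


@dataclass
class SequenceRecord:
    """A minimal data model for one biological sequence."""
    identifier: str
    description: str
    sequence: str


def _make_record(header, chunks):
    # The first word of the header is the identifier; the rest is free text.
    identifier, _, description = header.partition(" ")
    return SequenceRecord(identifier, description, "".join(chunks))


def parse_fasta(text):
    """Parse FASTA-formatted text into a list of SequenceRecord objects."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append(_make_record(header, chunks))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append(_make_record(header, chunks))
    return records


def gc_content(sequence):
    """Fraction of bases in a nucleotide sequence that are G or C."""
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


if __name__ == "__main__":
    example = ">seq1 toy example\nATGC\nGGCC\n>seq2\nAAAA\n"
    for record in parse_fasta(example):
        print(record.identifier, len(record.sequence), gc_content(record.sequence))
```

Treating each sequence as a plain string with a small amount of metadata is what lets generic string algorithms be applied to DNA, RNA, and protein data alike.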


9 » http://www.genome.gov/sequencingcosts/ 
(accessed November 1, 2018). 
9.3.2 Structures in Biology 


The sequence information mentioned in 
» Sect. 9.3.1 is rapidly becoming inexpensive 
to obtain and easy to store. On the other hand, 
the three-dimensional structure information 
about the proteins, DNA, and RNA is much 
more difficult and expensive to obtain, and 
presents a separate set of analysis challenges. 
Currently, only about 45,000 distinct
three-dimensional structures of biological
macromolecules are known.¹⁰ These models are
incredibly valuable resources, however, because 
an understanding of structure often yields 
detailed insights about biological function. As 
an example, the structure of the ribosome has 
been determined for several species and con- 
tains more atoms than any other structure to 
date. This structure, because of its size, took 
two decades to solve, and presents a formida- 
ble challenge for functional annotation (Cech 
2000). Yet, the functional information for a 
single structure is dwarfed by the potential for 
comparative genomics analysis between the 
structures from several organisms and from 
varied forms of the functional complex. Because
the ribosome is required by all forms of life,
these types of comparisons are
possible. Thus, a wealth of information comes
from relatively few structures. To address the 
problem of limited structure information, the 
publicly funded structural genomics initiative 
aims to identify all of the common structural 
scaffolds found in nature and to increase the 
number of known structures considerably. In 
the end, it is the physical interactions between 
molecules that determine what happens within 
a cell; thus the more complete the picture, the 
better the functional understanding. In partic- 
ular, understanding the physical properties of 
therapeutic agents is the key to understanding 
how agents interact with their targets within 
the cell (or within an invading organism). 
These are the key questions for structural biol- 
ogy within bioinformatics: 
1. How can we analyze the structures of
molecules to learn their associated function?
Approaches range from detailed molecular
simulations (Levitt 1983) to statistical
analyses of the structural features that
may be important for function (Wei and
Altman 1998).

2. How can we extend the limited structural 
data by using information in the sequence 
databases about closely related proteins 
from different organisms (or within the 
same organism, but performing a slightly 
different function)? There are signifi- 
cant unanswered questions about how to 
extract maximal value from a relatively 
small set of examples. 

3. How should structures be grouped for the 
purposes of classification? The choices 
range from purely functional criteria 
(“these proteins all digest proteins”) to 
purely structural criteria (“these pro- 
teins all have a toroidal shape”), with 
mixed criteria in between. One interesting
resource available today is the Structural
Classification of Proteins (SCOP),¹¹ which
classifies proteins based on shape and
function.


9.3.3 Genome Sequencing Data 
in Biology 


Advances in sequencing technology are piv- 
otal in enabling the practice of genomic 
medicine. Whereas sequencing the first human
genome took approximately 13 years and cost
$2.7 billion (Davies 2010), whole human
genomes can now be sequenced in a matter of
days at a cost that is falling ever closer to
the magic, if somewhat arbitrary, $1000 price
tag. This amount is commonly seen as the price
at which it becomes
feasible to sequence a patient in the course 
of clinical care, justifiable both clinically and 
financially. In 2004, and again in 2011, the 
National Human Genome Research Institute 
(part of the National Institutes of Health) 
funded a number of efforts specifically aimed
at increasing speed and decreasing the cost of
genome scale sequencing.


10 For more information see http://www.rcsb.org/
(accessed November 1, 2018).


11 http://scop2.mrc-lmb.cam.ac.uk/ (accessed
December 1, 2018).

Traditional sequencing uses a method known
as Sanger sequencing, which is typically
applied to sequences of 300 to 1000
nucleotides in a low-throughput manner.¹²
In the early to mid 2000s, several
technologies were introduced to sequence
large amounts of DNA in parallel. These high 
throughput sequencing methods (of which there 
are many including sequencing by synthesis, 
single molecule sequencing, combinatorial 
probe anchor synthesis, and others) typically 
involve shorter sequences than Sanger based 
approaches, but can generate gigabases of 
sequence in short fragments at low cost 
(<$0.05 per megabase sequenced). These 
methods are being used for many applications, 
including identification of genetic variants in 
clinical studies, characterizing genome func- 
tion with specific experiments and sequenc- 
ing novel species genomes. These studies have 
already discovered the genetic basis of rare 
genetic disorders by sequencing entire families 
(Ng et al. 2010), and we have seen a glimpse of 
the future of genome sequencing for routine 
health care in the analysis of a single genome 
of a healthy man (Ashley et al. 2010). As will
be described in detail in the Translational
Bioinformatics chapter (Chap. 26), these
sequencing approaches have been put into
clinical practice. One emergent area of research
is metagenomics, the study of microorganism
ecosystems using DNA sequencing, including
the association of human gut flora populations
with disease phenotypes in humans (Qin
et al. 2010).


12 http://en.wikipedia.org/wiki/DNA_sequencing
(accessed November 1, 2018).


9.3.4 Expression Data in Biology


The development of DNA microarrays led to
a wealth of data and unprecedented insight
into the fundamental biological machine. The
traditional premise is relatively simple: tens
of thousands of gene sequences derived from
genomic data are fixed onto a glass slide or
filter. The sequences for each spot are derived
from a single gene sequence and are attached
at only one end, creating a forest of identical
sequences in each spot. An experiment is
performed on two samples (e.g., groups of cells
grown under different conditions, or normal and
cancer tissue): one is the control group and
the other is the experimental group. The
control group is grown normally,
while the experimental group is grown under 
experimental conditions. For example, a 
researcher may be trying to understand how 
a cell compensates for a lack of sugar. The 
experimental cells will be grown with limited 
amounts of sugar. As the sugar depletes, some 
of the cells are removed at specific intervals 
of time. When the cells are removed, all of 
the mRNA from the cells is separated from 
the cells and converted back to DNA, using 
reverse transcriptase (a special enzyme that 
can create a DNA copy from an RNA tem- 
plate). This leaves a pool of cDNA molecules 
(DNA derived from mRNA is called comple- 
mentary DNA or cDNA) that represent the 
genes that were expressed (turned on) in that 
group of cells. In the early development of
genomics experimentation, these cDNA molecules
were tagged with fluorescent dyes and hybridized
to slides containing single-stranded DNA
“probes” arrayed in a grid. These
microarray “chips” can then be analyzed for
color differences between grid points that
correspond to specific gene regions. Today, with
the advent of high-throughput sequencing,
the RNA/cDNA can be sequenced directly
to measure expression levels. Using DNA
barcoding technology and microfluidics,
individual cells can be sequenced separately
rather than in pooled samples, in which the
mRNA contributions of all cells are combined
in a single analysis. High-throughput
single-cell sequencing is an exciting
advancement that adds orders of magnitude of
complexity to the required computational analysis
(Shapiro et al. 2013).

Computers become critical for analyz- 
ing these data because it is impossible for a 
researcher to measure and analyze all of the 
datasets by hand. Currently scientists are 
using gene expression experiments to study 
how cells from different organisms compensate
for environmental changes, how pathogens
fight antibiotics, and how cells grow
uncontrollably (as is found in cancer). A chal- 
lenge for biological computing is to develop 
methods to analyze these data, tools to store 
these data, and computer systems to collect 
the data automatically. 
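
As a small illustration of the kind of analysis involved, the sketch below converts paired intensities from a control and an experimental sample into log2 expression ratios. The gene names and intensity values are invented for illustration, and the function name is hypothetical; real pipelines also perform background correction and normalization, which are omitted here.

```python
import math


def log2_ratios(experimental, control):
    """Return log2(experimental / control) for genes measured in both samples.

    Positive values mean higher expression in the experimental sample;
    negative values mean higher expression in the control.
    """
    ratios = {}
    for gene, exp_intensity in experimental.items():
        ctl_intensity = control.get(gene)
        # Skip genes without a usable (positive) intensity in both samples.
        if ctl_intensity is None or ctl_intensity <= 0 or exp_intensity <= 0:
            continue
        ratios[gene] = math.log2(exp_intensity / ctl_intensity)
    return ratios


# Hypothetical spot intensities (arbitrary units) for three made-up genes.
control = {"geneA": 1000.0, "geneB": 800.0, "geneC": 500.0}
low_sugar = {"geneA": 4000.0, "geneB": 200.0, "geneC": 500.0}

ratios = log2_ratios(low_sugar, control)
for gene, value in sorted(ratios.items()):
    print(gene, round(value, 2))
```

A ratio on the log2 scale treats a fourfold increase and a fourfold decrease symmetrically (+2 and -2), which is one reason this transformation is commonly applied before clustering.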


9.3.5 Metabolomics Data in Biology 


Genomics and proteomics study the func- 
tion of the genome and the proteome, while 
metabolomics studies the diversity and func- 
tion of small molecules in a biosample. These 
include metabolites such as lipids,
carbohydrates, metal ions, hormones, and
signaling molecules. Interest in the metabolome
has increased significantly with the development
of separation and mass spectrometry technologies
that can determine the molecular masses and
identities of small molecules in a
high-throughput fashion. Bioinformatics is a key
component both of identifying specific
molecules, by matching mass spectrometry
“fingerprints” against a database of known
molecules, and of analyzing the resulting data.
For example, researchers have characterized
the metabolome of human colorectal cancers
and stool and identified disease-enriched
metabolites as possible detectable markers
of disease or treatment outcomes (Brown
et al. 2016).
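
The database-matching step can be sketched in a few lines. The example below is a minimal illustration, assuming that an observed neutral mass is compared against a small table of reference monoisotopic masses within a parts-per-million tolerance. The three-entry table, the function name, and the 10 ppm default are assumptions chosen for illustration; real instruments report mass-to-charge ratios of ions, so adduct and charge handling would also be needed.

```python
# Illustrative reference table: neutral monoisotopic masses in daltons.
KNOWN_METABOLITES = {
    "glucose": 180.0634,
    "lactic acid": 90.0317,
    "cholesterol": 386.3549,
}


def match_mass(observed_mass, tolerance_ppm=10.0, database=KNOWN_METABOLITES):
    """Return (name, error_ppm) pairs for entries within the ppm tolerance."""
    hits = []
    for name, reference_mass in database.items():
        error_ppm = abs(observed_mass - reference_mass) / reference_mass * 1e6
        if error_ppm <= tolerance_ppm:
            hits.append((name, round(error_ppm, 2)))
    # Report the closest match first.
    return sorted(hits, key=lambda hit: hit[1])


print(match_mass(180.0640))  # a mass close to the glucose entry
print(match_mass(200.0))     # no entry in the toy table is this close
```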


9.3.6 Epigenetics Data in Biology 


Epigenetics consists of heritable changes 
that are not encoded in the primary DNA 
sequence. Several types of epigenetic effects 
can now be studied in the laboratory, and 
they have been associated with disease and risk
of disease (Goldberg et al. 2007). First, the
regional structure of chromosomes affects 
which regions of the genome can be tran- 
scribed, i.e. which regions can be expressed. 
Proteins called histones coordinate
the structure of chromosomes, and their
structure and positions are regulated by
posttranslational modifications of the
histones bound to the DNA. These
changes have been associated with spontaneous
mutations in cancer, complex genetic
diseases, and Mendelian inherited genetic 
diseases. Second, cytosine bases in the DNA 
can be methylated and this can affect gene 
expression. DNA methylation patterns can 
be passed on when DNA is replicated. Like 
chromosome structure, these modifications 
have been associated with human disease 
(Bird 2002). 


9.3.7 Systems Biology 


Recent advances in high throughput technol- 
ogies have enabled a new, dynamic approach 
to studying biology, that of systems biology. 
In contrast to the historically reductionist 
approach to biology, studying one molecule at 
a time, systems biology looks at the entirety 
of a system including dynamic relationships 
between the different components. With that 
said, systems biology is still maturing. As an 
analogy, consider an airplane. Having a “parts 
list” for a Boeing 747 does not enable us to 
understand how those parts work together 
to make the airplane operate. If the airplane 
breaks, the parts list alone does not tell us 
how to remedy the situation. Rather, we need 
to understand how the parts interact, how 
one affects another, and how perturbations to 
one part of the system affect the rest of the 
system. Similarly, systems biology involves 
understanding not only the “parts list”, i.e. 
the list of all genes, proteins, metabolites, etc., 
but also the dynamic networks of interactions 
among these parts. An integrated simulation 
of an entire bacterial cell has shown the feasi- 
bility of accurate computational simulations 
of cell physiology (Karr et al. 2012). 

Current research in -omics technologies
has both enabled and catalyzed the advancement
of systems biology. However, a systems
biology approach goes beyond simply per- 
forming these high bandwidth methods for the 
purpose of biological discovery. Rather, sys- 
tems biology implies a systematic, hypothesis- 
driven approach based on omic-scale (very 
large) hypotheses. Once the interactions in 
a biological network are understood, one 
can model that network to make predictions
regarding the system’s behavior, particularly in
light of specific perturbations. Understanding 
how the system has evolved to work can also 
help us understand what goes wrong when the 
system breaks down, and how to intervene in 
order to restore the system to normal. 


9.4 Key Bioinformatics Algorithms


There are a number of common computations
that are performed in many contexts
within bioinformatics. In general, these com- 
putations can be classified as sequence align- 
ment, structure alignment, pattern analysis of 
sequence/structure, gene expression analysis, 
and pattern analysis of biochemical function. 


9.4.1 Early Work in Sequence 
and Structure Analysis 


As it became clear that the information from 
DNA and protein sequences would be volumi- 
nous and difficult to analyze manually, algo- 
rithms began to appear for automating the 
analysis of sequence information. The first 
requirement was to have a reliable way to align 
sequences so that their detailed similarities and 
distances could be examined directly. Needleman 
and Wunsch (1970) published an elegant method 
for using dynamic programming techniques to 
align sequences in time related to the cube of the 
number of elements in the sequences. Smith and 
Waterman (1981) published refinements of these 
algorithms that allowed for searching both the 
best global alignment of two sequences (aligning 
all the elements of the two sequences) and the 
best local alignment (searching for areas in which 
there are segments of high similarity surrounded 
by regions of low similarity). A key input for 
these algorithms is a matrix that encodes the 
similarity or substitutability of sequence ele- 
ments: When there is an inexact match between 
two elements in an alignment of sequences, it 
specifies how much “partial credit” we should 
give to the overall alignment based on the simi- 
larity of the elements, even though they may 
not be identical. Looking at a set of
evolutionarily related proteins, Dayhoff (1974) published
one of the first matrices derived from a detailed
analysis of which amino acids (elements) tend to 
substitute for others. 
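
To make the role of the similarity matrix concrete, the following is a minimal sketch of global alignment scoring by dynamic programming. It uses the now-common formulation with a fixed penalty per gap position, which runs in time proportional to the product of the two sequence lengths, so it is not the exact procedure of the 1970 paper. The toy scoring function, its three "similar" amino acid pairs, and all numeric values are assumptions for illustration and are not taken from any published matrix.

```python
def global_alignment_score(a, b, score, gap=-2):
    """Best global alignment score of sequences a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    # best[i][j] holds the best score for aligning a[:i] with b[:j].
    best = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        best[i][0] = i * gap
    for j in range(1, cols):
        best[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            best[i][j] = max(
                best[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # align two elements
                best[i - 1][j] + gap,  # element of a aligned to a gap
                best[i][j - 1] + gap,  # element of b aligned to a gap
            )
    return best[-1][-1]


# Hypothetical "partial credit" pairs; a real matrix covers all 20 amino acids.
SIMILAR = {frozenset("IL"), frozenset("DE"), frozenset("KR")}


def toy_score(x, y):
    if x == y:
        return 2   # exact match
    if frozenset((x, y)) in SIMILAR:
        return 1   # partial credit for a similar element
    return -1      # dissimilar elements


print(global_alignment_score("ILK", "LLR", toy_score))
```

In this toy case the I/L and K/R columns each earn partial credit, so the alignment scores higher than it would under a plain match/mismatch scheme.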

Within structural biology, the vast com- 
putational requirements of the experimental 
methods (such as X-ray crystallography and 
nuclear magnetic resonance) for determining 
the structure of biological molecules drove 
the development of powerful structural anal- 
ysis tools. In addition to software for ana- 
lyzing experimental data, graphical display 
algorithms allowed biologists to visualize 
these molecules in great detail and facilitated 
the manual analysis of structural principles 
(Langridge 1974; Richardson 1981). At the 
same time, methods were developed for simu- 
lating the forces within these molecules as they 
rotate and vibrate (Gibson and Scheraga 1967; 
Karplus and Weaver 1976; Levitt 1983). 

The most important development to support 
the emergence of bioinformatics, however, has 
been the creation of databases with biological 
information. In the 1970s, structural biologists, 
using the techniques of X-ray crystallography, 
set up the Protein Data Bank (PDB) specifying 
the Cartesian coordinates of the structures that 
they elucidated (as well as associated experimen- 
tal details) and made PDB publicly available. 
The first release, in 1977, contained 77 structures. 
The growth of the database is chronicled on the 
Web: the PDB now has over 75,000 detailed 
atomic structures and is the primary source of 
information about the relationship between
protein sequence and protein structure.¹³ Similarly,
as the ability to obtain the sequence of DNA
molecules became widespread, the need for a
database of these sequences arose. In the early
1980s, the GENBANK database was formed as
a repository of sequence information. Starting
with 606 sequences and 680,000 bases in 1982,
GENBANK has grown to more than
135 million sequences and 125 billion bases.¹⁴
The GENBANK database of DNA sequence
information supports the experimental
reconstruction of genomes and acts as a focal point
for experimental groups. Numerous other
databases store the sequences of protein molecules¹⁵
and information about human genetic diseases.¹⁶


13 See http://www.rcsb.org/ (accessed December 1,
2018).

14 http://www.ncbi.nlm.nih.gov/genbank/ (accessed
December 1, 2018).

Included among the databases that have 
accelerated the development of bioinformatics 
is the Medline database of the biomedical lit- 
erature and its paper-based companion Index 
Medicus (see Chap. 23).¹⁷ Including articles
as far back as 1809 and brought online free on 
the Web in 1997, Medline provides the glue that 
relates many high-level biomedical concepts 
to the low-level molecule, disease, and experi- 
mental methods. In fact, this “glue” role was 
the basis for creating the NCBI suite of data- 
bases and software and PubMed systems (see 
Sect. 9.5) for integrating access to literature
references and the associated databases. 


9.4.2 Sequence Alignment 
and Genome Analysis 


Perhaps the most basic activity in computa- 
tional biology is comparing two biological 
sequences to determine (1) whether they are 
similar and (2) how to align them. The problem
of alignment is not trivial but is based on a
simple idea. Sequences that perform a similar 
function should, in general, be descendants of 
a common ancestral sequence, with mutations 
over time. These mutations can be replace- 
ments of one amino acid with another, dele- 
tions of amino acids, or insertions of amino 
acids. The goal of sequence alignment is to 
align two sequences so that the evolutionary 
relationship between the sequences becomes 
clear. If two sequences are descended from 
the same ancestor and have not mutated too 
much, then it is often possible to find corre- 
sponding locations in each sequence that play 
the same role in the evolved proteins. The 
problem of solving correct biological align- 
ments is difficult because it requires knowl- 


15 » http://www.uniprot.org/ (accessed December 1, 
2018). 


16 » http://www.ncbi.nlm.nih.gov/omim (accessed 
December 1, 2018). 
17 » http://www.ncbi.nlm.nih.gov/pubmed (accessed 


December 1, 2018). 


edge about the evolution of the molecules that 
we typically do not have. There are now, how- 
ever, well-established algorithms for finding 
the mathematically optimal alignment of two 
sequences. These algorithms require the two 
sequences and a scoring system based on (1) 
exact matches between amino acids that have 
not mutated in the two sequences and can be 
aligned perfectly; (2) partial matches between 
amino acids that have mutated in ways that 
have preserved their overall biophysical prop- 
erties; and (3) gaps in the alignment signifying 
places where one sequence or the other has 
undergone a deletion or insertion of amino 
acids. The algorithms for determining opti- 
mal sequence alignments are based on a tech- 
nique in computer science known as dynamic 
programming and are at the heart of many 
computational biology applications (Gusfield 
1997). Figure 9.1 shows an example of a
matrix computed by the Smith-Waterman
algorithm, the first described local alignment
algorithm that uses a dynamic programming
approach. The algorithm works by calculating
a similarity matrix
between two sequences, then finding optimal 
paths through the matrix that maximize a 
similarity score between the two sequences. 
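
The matrix-filling and path-finding steps can be sketched as follows. This is a minimal illustration of Smith-Waterman local alignment, assuming a simple match/mismatch score and a fixed gap penalty in place of a substitution matrix such as BLOSUM62; the parameter values and the example sequences (which embed a short motif in unrelated flanking residues) are assumptions.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Scores never drop below zero, which lets an alignment start anywhere.
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = smith_waterman("TTTGGSLIAAA", "CCCGGSLICCC")
print(score)
print(top)
print(bottom)
```

Both the time and the memory used by this sketch grow with the product of the two sequence lengths, which is the cost that motivates the heuristic methods discussed next.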

Unfortunately, the dynamic programming 
algorithms are too computationally expensive 
to apply to large numbers of sequences, so a 
number of faster, more heuristic methods have 
been developed. The most popular algorithm 
is the Basic Local Alignment Search Tool 
(BLAST) (Altschul et al. 1990). BLAST is 
based on the observation that sections of pro- 
teins are often conserved without gaps (so the 
gaps can be ignored—a critical simplification 
for speed) and that there are statistical analy- 
ses of the occurrence of small subsequences 
within larger sequences that can be used to 
prune the search for matching sequences 
in a large database. These tools work well 
for both protein and nucleic acid sequences. 
Other tools have been developed that are bet- 
ter suited for nucleic acid sequence assembly 
and mapping of short read high throughput 
sequencing data including BLAT (Kent 2003), 
SOAP (Li et al. 2008), and others.
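
The pruning idea can be illustrated with a word index. The sketch below is not BLAST itself: it only shows the seeding step, in which short ungapped words from the query are looked up in a precomputed index so that most database sequences are never examined in detail. The function names, the word size of 3, and the toy database are assumptions; the actual tool additionally scores near-matching words, extends seeds, and evaluates statistical significance.

```python
from collections import defaultdict


def build_word_index(database, word_size=3):
    """Map each word to the (sequence id, offset) positions where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in database.items():
        for offset in range(len(seq) - word_size + 1):
            index[seq[offset:offset + word_size]].append((seq_id, offset))
    return index


def find_seeds(query, index, word_size=3):
    """Count exact word hits between the query and each database sequence."""
    hits = defaultdict(int)
    for offset in range(len(query) - word_size + 1):
        word = query[offset:offset + word_size]
        for seq_id, _ in index.get(word, []):
            hits[seq_id] += 1
    return dict(hits)


# Toy database of three short, made-up protein fragments.
database = {
    "seq1": "MKTAYIAKQR",
    "seq2": "GGSLISEDWV",
    "seq3": "PPPPPPPP",
}
index = build_word_index(database)
seeds = find_seeds("CGGSLINEQW", index)
print(seeds)
```

Only sequences that receive seed hits would be passed on to a more expensive alignment step, such as the dynamic programming methods described earlier.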

Protein 3D structures can be aligned, visualized,
and compared in a similar way to linear
protein sequences (Fig. 9.2). Tools such
as PyMol¹⁸ and UCSF Chimera¹⁹ provide
sophisticated and extensible applications for
relatively easy visualization of 3D structures.
Tools for 3D alignment of the structures are
provided with these applications.


a) Pairwise alignment between human chymotrypsin and human trypsin.


CTRB_HUMAN MAFLWLLSCWALLGTTFGCGVPAIHPVLSGLSRIVNGEDAVPGSWPWQVSLQDKTGFHFC 
TRY1_HUMAN MNPLLILTFVA- - --------- - AALAAPFDDDDKIVGGYNCEENSVPYQVSLN- - SGFHFC 


CTRB_HUMAN GGSLISEDWVVTAAHCGVRTSDDVVVAGEF DQGSDEENIQVLKIAKVFKNPKFSILTVNND 
TRY1_HUMAN GGSLINEQWVVSAGHC- YKSRIQVRLGEHNIEVLEGNEQFINAAKIIRHPQYDRKTLNND 


CTRB_HUMAN ITLLKLATPARFSQTVSAVCLPSADDDFPAGTLCAT TGWGKTKYNANKTPDKLQQAALPL 
TRY1_HUMAN IMLIKLSSRAVINARVSTISLPTAPP - - ATGTKCLISGWGNTASSGADYPDYPDELQCLDAPV 


CTRB_HUMAN LSNAECKKSWGRRITDVMICAGESASGVSSCMGDSGGPEVCOKDGAWITEVGIVSWGSDTC 
TRY1_HUMAN LSQAKCEASYPGKITSNMF CVGELEGGKDSCOGDSGGPVVCNG = === QLOGVVSWGDGCA 


CTRB_HUMAN STSSPGVYARVTKLIPWVQKILLAN - 
TRY1_HUMAN QKNKPGVYTKVYNYVKWIKNTIAANS 


b) Smith-Waterman matrix illustrating the aligned region in (a), using the BLOSUM62
mutation matrix (Henikoff and Henikoff, 1994).

Fig. 9.1 Example of sequence alignment using the Smith-Waterman algorithm

Fig. 9.2 Example of structural visualization and
comparison. Comparison of the serine protease protein
structures and catalytic amino acids using Chimera
(http://www.cgl.ucsf.edu/chimera; accessed December
15, 2018)


9.4.3 Prediction of Structure 
and Function from Sequence 


One of the primary challenges in bio- 
informatics is taking a newly determined DNA 
sequence (as well as its translation into a pro- 
tein sequence) and predicting the structure of 
the associated molecules, as well as their func- 
tion. Both problems are difficult, being fraught 
with all the dangers associated with making 
predictions without hard experimental data. 
Nonetheless, the available sequence data are 
starting to be sufficient to allow good predic- 
tions in a few cases. For example, there is a 
Web site devoted to the assessment of biological
macromolecular structure prediction methods.²⁰
Results suggest that when two protein
molecules have a high degree (more than 40%)
of sequence identity and one of the structures
is known, a reliable model of the other can be
built by analogy. When sequence identity
is less than 25%, however, the performance
of these methods is much less reliable.


18 https://pymol.org/ (accessed December 1, 2018).

19 http://www.cgl.ucsf.edu/chimera/ (accessed
December 1, 2018).

20 http://predictioncenter.org/ (accessed December
1, 2018).
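
The identity thresholds quoted above can be applied to an existing pairwise alignment with a short calculation. The sketch below assumes that percent identity is computed over aligned columns that contain no gap; other conventions for the denominator exist, and the "intermediate" label for values between 25% and 40% is an assumption added for illustration, as are the function names.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free aligned columns in which the residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def modelling_outlook(identity):
    """Rough label based on the thresholds discussed in the text."""
    if identity > 40.0:
        return "model by analogy likely reliable"
    if identity < 25.0:
        return "much less reliable"
    return "intermediate"  # assumption: the text does not label this range


identity = percent_identity("ACDE", "ACDF")
print(identity, modelling_outlook(identity))
```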

With the advent of deep learning, there 
has been an acceleration of progress in many 
machine learning tasks, including structure 
prediction. Recently, AlphaFold, a system from
DeepMind that uses convolutional neural
networks (Senior et al. 2020), has led to a
quantum leap in the quality of predicted
structures, so much so that some experts in protein
structure prediction have said that parts of this
challenge can now be considered “solved.”²¹
They make this claim because on multiple
prediction tasks, the accuracy of the predicted
structures is similar to that of structures
determined experimentally. Of course, it is likely
that there are classes of proteins for which
these methods may not perform as well,
but for a large fraction of protein sequences, the
structure seems to be predictable by these
methods. An important caveat is that these methods
must be carefully reviewed by the community,
reproduced, and made generally available before
they will have their full impact.

When scientists investigate biological 
structure, they commonly perform a task 
analogous to sequence alignment, called 
structural alignment. Given two sets of three- 
dimensional coordinates for a set of atoms, 
what is the best way to superimpose them so 
that the similarities and differences between 
the two structures are clear? Such computa- 
tions are useful for determining whether two 
structures share a common ancestry and for 
understanding how the structures’ functions 
have subsequently been refined during evo- 
lution. There are numerous published algo- 
rithms for finding good structural alignments. 
We can apply these algorithms in an auto- 
mated fashion whenever a new structure is 
determined, thereby classifying the new
structure into one of the protein families.


21 https://www.nature.com/articles/d41586-020-03348-4.


There are also algorithms for using the
structure of a large biomolecule and the
structure of a small organic molecule (such as a
drug or cofactor) to try to predict the ways in
which the molecules will interact. An under- 
standing of the structural interaction between 
a drug and its target molecule often provides 
critical insight into the drug’s mechanism of 
action. The most reliable way to assess this 
interaction is to use experimental methods to 
solve the structure of a drug-target complex. 
Once again, these experimental approaches 
are expensive, so computational methods play 
an important role. Typically, we can assess the 
physical and chemical features of the drug 
molecule and can use them to find comple- 
mentary regions of the target. For example, 
a highly electronegative drug molecule will be 
most likely to bind in a pocket of the target 
that has electropositive features. 

Prediction of function often relies on the use
of sequence or structural similarity metrics
and subsequent assignment of function
based on similarities to molecules of known 
function. These methods can guess at general 
function for roughly 60-80% of all genes, but 
leave considerable uncertainty about the pre- 
cise functional details even for those genes for 
which there are predictions, and have little to 
say about the remaining genes. 


9.4.4 Clustering of Gene 
Expression Data 


Analysis of gene expression data often begins 
by clustering the expression data. A typical 
experiment is represented as a large table, 
where the rows are the genes on each chip and 
the columns represent the different experi- 
ments, whether they be time points or differ- 
ent experimental conditions. Each row is then 
a vector of values that represent the results of 
the experiment with respect to a specific gene. 
Clustering can then be performed to deter- 
mine which genes are being expressed simi- 
larly. Genes that are associated with similar 
expression profiles are often functionally asso- 
ciated. For example, when a cell is subjected to 
starvation (fasting), ribosomal genes are often 
downregulated in anticipation of lower protein 
production by the cell. It has similarly been 
shown that genes associated with neoplastic
progression could be identified relatively
easily with this method, making gene expression
experiments a powerful assay in cancer
research (see Yan and Gu 2009, for a review). 
In order to cluster expression data, a distance 
metric must be determined to compare a gene’s 
profile with another gene’s profile. If the vector 
data are a list of values, Euclidean distance or
correlation distance can be used. If the data
are more complicated, more sophisticated
distance metrics may be employed. Clustering
methods fall into two categories: supervised and
unsupervised. Supervised learning methods 
require some preconceived knowledge of the 
data at hand (discussed below). Usually, the 
method begins by selecting profiles that rep- 
resent the different groups of data, e.g., genes 
that represent certain pathways, and then the 
clustering method associates each of the genes 
with the representative profile to which they 
are most similar. Unsupervised methods are 
more commonly applied because these meth- 
ods require no knowledge of the data, and can 
be performed automatically. 
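
The two distance metrics named above can be written down directly. The sketch below is a minimal illustration using made-up expression profiles; the function names are hypothetical, and the correlation distance is taken here to be one minus the Pearson correlation, which is one common convention.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def correlation_distance(p, q):
    """1 - Pearson correlation: near 0 for same-shaped profiles, near 2 for opposite ones."""
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    cov = sum((x - mean_p) * (y - mean_q) for x, y in zip(p, q))
    norm_p = math.sqrt(sum((x - mean_p) ** 2 for x in p))
    norm_q = math.sqrt(sum((y - mean_q) ** 2 for y in q))
    if norm_p == 0 or norm_q == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / (norm_p * norm_q)


# Made-up profiles over four time points.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [10.0, 20.0, 30.0, 40.0]  # same trend as gene_a, much larger magnitude
gene_c = [4.0, 3.0, 2.0, 1.0]      # opposite trend

print(euclidean_distance(gene_a, gene_b), correlation_distance(gene_a, gene_b))
print(euclidean_distance(gene_a, gene_c), correlation_distance(gene_a, gene_c))
```

The example shows why the choice matters: by Euclidean distance gene_a is far from gene_b, whereas by correlation distance the two are essentially identical because they rise and fall together.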

Two such unsupervised learning methods 
are the hierarchical and K-means cluster- 
ing methods. Hierarchical methods build a 
dendrogram, or a tree, of the genes based on 
their expression profiles. These methods are 
agglomerative and work by iteratively joining 
close neighbors into a cluster. The first step 
often involves connecting the closest profiles, 
building an average profile of the joined pro- 
files, and repeating until the entire tree is built. 
K-means clustering builds k clusters or groups 
automatically. The algorithm begins by pick- 
ing k representative profiles randomly. Then 
each gene is associated with the representative 
to which it is closest, as defined by the dis- 
tance metric being employed. Then the center 
of mass of each cluster is determined using all 
of the member genes’ profiles. Depending on
the implementation, either the center of mass 
or the nearest member to it becomes the new 
representative for that cluster. The algorithm 
then iterates until the new center of mass and 
the previous center of mass are within some 
threshold. The result is k groups of genes 
that are regulated similarly. One drawback of
K-means is that one must choose the value for
k. If k is too large, logical “true” clusters may
be split into pieces, and if k is too small,
distinct clusters will be merged. One way to
assess the chosen k is to estimate the average
distance from each member profile to the
center of mass of its cluster. Because this
average tends to shrink as k grows, it is best
to choose the lowest k beyond which it no
longer decreases appreciably. Another
drawback of K-means is that different initial
conditions can give different results; it is
therefore prudent to test the robustness of the
results by running multiple runs with different
starting configurations (Fig. 9.3).
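
The K-means procedure described above can be sketched compactly. The following is a minimal illustration that uses Euclidean distance and the center of mass as the new representative; the function names, the convergence threshold, the `seed` parameter, and the toy two-dimensional profiles are assumptions made for the example.

```python
import math
import random


def kmeans(profiles, k, max_iter=100, tol=1e-6, seed=0):
    """Cluster profiles into k groups; return (centers, labels)."""
    rng = random.Random(seed)
    # Pick k representative profiles at random as the starting centers.
    centers = [list(p) for p in rng.sample(profiles, k)]
    labels = [0] * len(profiles)
    for _ in range(max_iter):
        # Assignment step: attach each profile to its closest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in profiles]
        # Update step: move each center to the center of mass of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center unchanged
        shift = max(math.dist(old, new) for old, new in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, labels


def mean_distance_to_center(profiles, centers, labels):
    """Average distance from each profile to the center of its own cluster."""
    total = sum(math.dist(p, centers[lab]) for p, lab in zip(profiles, labels))
    return total / len(profiles)


# Two well-separated groups of made-up, two-point expression profiles.
profiles = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
for k in (1, 2, 3):
    centers, labels = kmeans(profiles, k)
    print(k, round(mean_distance_to_center(profiles, centers, labels), 3))
```

Running `kmeans` with several different `seed` values and comparing the resulting labels is one simple way to check robustness to the starting configuration.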

The future clinical usefulness of these
algorithms cannot be overstated. van’t Veer
et al. (2002) found that a gene expression
profile could predict the clinical outcome of breast
cancer. The global analysis of gene expression
showed that some cancers were associated with
different prognoses that were not detectable using
traditional means. Another exciting advancement
in this field is the potential use of microarray 
expression data to profile the molecular effects 
of known and potential therapeutic agents. This 
molecular understanding of a disease and its 
treatment will soon help clinicians make more 
informed and accurate treatment choices (for 
more, see Chap. 26).


Fig. 9.3 The exponential growth of GENBANK.
This plot shows that since 1982 the number of
bases in GENBANK has grown by five full orders of
magnitude and continues to grow by a factor of 10 every
4 years


9.4.4.1 Classification and Prediction 


Some common approaches to classification,
or supervised learning, are described at a
high level below, but note that
entire courses could be, and are, taught on 
each of these methods. For further details we 
refer readers to the suggested texts at the end 
of this chapter. 

One of the simplest methods for clas- 
sification is that of k-nearest-neighbor, or 
KNN. Essentially, KNN uses the classifica- 
tion of the k closest instances to a given input 
as a set of votes regarding how that instance 
should be classified. Unfortunately, KNN
tends not to be useful for omics-based
classification because it breaks down in
high-dimensional space: with many features,
KNN has difficulty finding enough nearby
neighbors to make a prediction, which leads
to large variation in the classification.
This breakdown is one aspect of the “curse
of dimensionality,” described in more detail
below (Hastie et al. 2009).
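
The voting idea behind KNN can be sketched as follows. This is a minimal illustration on made-up, low-dimensional data (two expression values per sample); the function name `knn_predict`, the labels, and the training values are all assumptions for the example, and Euclidean distance is only one possible choice of metric.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_tuple, label) pairs.
    """
    # Sort training samples by Euclidean distance to the query and keep k.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Each neighbor casts one vote for its own label.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical training set: (expression of gene A, expression of gene B).
training_samples = [
    ((1.0, 1.2), "good prognosis"),
    ((0.8, 1.0), "good prognosis"),
    ((1.1, 0.9), "good prognosis"),
    ((3.0, 3.2), "poor prognosis"),
    ((3.1, 2.9), "poor prognosis"),
    ((2.9, 3.0), "poor prognosis"),
]

print(knn_predict(training_samples, (1.0, 1.0)))
print(knn_predict(training_samples, (3.0, 3.0)))
```

With only two features the neighbors are genuinely close to the query; with tens of thousands of features the same code runs, but the "nearest" samples may be almost as far away as any others, which is the breakdown described above.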

A more general statistical approach to 
supervised learning, and one which encom- 
passes a number of popular methods, is that 
of function approximation. In this approach, 
one attempts to find a useful approximation of 
the function f(x) that underlies the actual rela- 
tion between the inputs and outputs. In this 
case, one chooses a metric by which to judge 
the accuracy of the approximation, for exam- 
ple the residual sum of squares, and uses this 
metric to optimize the model to fit the training 
data. Bayesian modeling, logistic regression, 
and Support Vector Machines all use varia- 
tions on this approach. 
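
As a simple instance of function approximation, the sketch below fits a straight line to training data by minimizing the residual sum of squares. It is an illustration only: the function names and data values are assumptions, and the closed-form least-squares solution shown here applies to a single input variable.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (one input variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residual_sum_of_squares(xs, ys, slope, intercept):
    """The metric being minimized: squared error summed over the training data."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical training data: inputs and noisy outputs.
inputs = [0.0, 1.0, 2.0, 3.0]
outputs = [1.1, 2.9, 5.2, 6.8]

slope, intercept = fit_line(inputs, outputs)
fitted_rss = residual_sum_of_squares(inputs, outputs, slope, intercept)
guess_rss = residual_sum_of_squares(inputs, outputs, 2.0, 1.0)
print("fitted RSS:", round(fitted_rss, 4), "hand-picked line RSS:", round(guess_rss, 4))
```

No other straight line can achieve a lower residual sum of squares on these training data than the fitted one, which is what "optimizing the model to fit the training data" means for this metric.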

Finally, there is the class of rule-based clas- 
sifiers. This type of classifier may be thought 
of as a series of rules, each of which splits the 
set of instances based on a given characteris- 
tic. Details such as what criteria are used to 
choose the feature on which to base a rule, 
and whether the algorithm uses enhancements
such as ensemble learning (i.e., multiple models
together) determine the specifics of the clas- 
sifier type, for example decision trees, random 
forests, or covering rules. 

Which approach to use depends both on
the nature of the data and the question being
asked. The question might prioritize sensitivity
over specificity or vice versa. For example, for a 
test to detect a life-threatening infection that is 
easily treatable by readily available antibiotics, 
one might want to err on the side of sensitivity. 
In addition, data may be numeric or categori- 
cal or have differing degrees of noise, missing 
values, correlated features or non-linear inter- 
actions among features. These different quali- 
ties are better handled by different methods. In 
many cases the best approach is actually to try 
a number of different methods and to compare 
the results. Such comparative analysis is facilitated
through freely available software packages
such as R/Bioconductor²² and Weka.²³
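
The trade-off between sensitivity and specificity can be made concrete with a small sketch. The labels, test scores, and thresholds below are invented for illustration, and the function name `sensitivity_specificity` is an assumption; the point is only that lowering the decision threshold raises sensitivity at the cost of specificity.

```python
def sensitivity_specificity(labels, scores, threshold):
    """labels: 1 = condition present, 0 = absent.

    A sample is called positive when its score is >= threshold.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        called_positive = score >= threshold
        if label == 1 and called_positive:
            tp += 1  # true positive
        elif label == 1:
            fn += 1  # missed case
        elif called_positive:
            fp += 1  # false alarm
        else:
            tn += 1  # true negative
    sensitivity = tp / (tp + fn)  # fraction of true cases detected
    specificity = tn / (tn + fp)  # fraction of non-cases correctly cleared
    return sensitivity, specificity

# Hypothetical test results for 4 infected and 6 uninfected patients.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1]

for threshold in (0.5, 0.35):
    sens, spec = sensitivity_specificity(labels, scores, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

For the easily treated infection in the example above, the lower threshold would be the natural choice: it detects every case in this toy data set while accepting more false alarms.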


9.4.5 The Curse of Dimensionality 


In the post-genomic era, there is no shortage 
of data to analyze. Rather, many researchers 
have more data than they know what to do 
with. However, this overabundance tends to
be a function of the dimensionality of the data,
rather than the number of subjects. This mismatch
can lead to challenges for experimental
design and statistical analysis. Type 1 error, 
or the tendency to incorrectly reject the null 
hypothesis and say that indeed there is statisti- 
cal significance to a pattern (see > Chap. 13), 
is amplified by looking at high-dimensional 
data. This is one aspect of what is known as 
the “curse of dimensionality” (Hastie et al. 
2009). Consider analysis of gene expression 
data for 20,000 genes, trying to detect a pattern 
that can predict outcome. In a sample of, say, 
30 subjects—a reasonable number when test- 
ing a single hypothesis—by random chance, 
some number of genes will correlate with 
outcome. Essentially one is testing not one 
but 20,000 hypotheses simultaneously. One 
must therefore correct for multiple hypoth- 
esis testing. The Bonferroni method is a com- 
mon and straightforward approach to correct 


22 » http://bioconductor.org/ (accessed December 1, 
2018). 

23 » http://www.cs.waikato.ac.nz/ml/weka/ (accessed
December 1, 2018).
for multiple hypothesis testing.²⁴ It entails
dividing the threshold p-value one would use,
traditionally 0.05, by the number of hypotheses.
So, for a test of 20,000 genes, one would
require a p-value of 2.5 × 10⁻⁶ to call a gene
significant. Typically, analyses using high 
dimensional data such as gene expression are 
not sufficiently powered to pass this stringent 
test. One would need thousands of samples to 
be sufficiently powered. Another approach is 
to use q-value, or false discovery rate (Storey 
and Tibshirani 2003), rather than p-value. 
This approach relies on empirical permuta- 
tion to determine the expected number of 
false positives if indeed the null hypothesis 
is correct, which enables approximation of 
the proportion of false positives among all 
reported positives. Consider again the micro- 
array experiment above in which each array 
includes 20,000 genes. We want to know 
whether gene X was differentially expressed 
in cases versus controls. Choosing a threshold 
p-value, or false positive rate, of 0.05 means 
that 1 time in 20 we will erroneously reject 
the null hypothesis and predict a false positive.
If a statistical test returns 2000 positives,
i.e., 2000 genes appear to be significantly
differentially expressed, we expect 1 in 20 of the
genes being analyzed (20,000 × (1/20) = 1000),
or approximately half of the reported positives,
to be false positives. A false discovery rate of
0.05, on the other hand, would mean that 5% of
those called positive, in this case 100 out of
2000, are false positives. A q-value threshold is
thus less stringent than a Bonferroni-corrected
p-value, and may be of greater utility in a
high-dimensional omics context than a
traditional p-value or correction for multiple
hypotheses.
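
The arithmetic in this example can be checked directly. The sketch below only reproduces the numbers used in the text (20,000 genes, a threshold of 0.05, and 2000 genes called positive); it does not estimate q-values from data, and the function names are assumptions for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold after Bonferroni correction."""
    return alpha / n_tests

def expected_false_positives(alpha, n_tests):
    """Expected number of false positives if every null hypothesis were true."""
    return alpha * n_tests

n_genes = 20_000       # hypotheses tested simultaneously
alpha = 0.05           # traditional threshold p-value
called_positive = 2000 # genes reported as differentially expressed
fdr = 0.05             # false discovery rate (q-value threshold)

corrected = bonferroni_threshold(alpha, n_genes)
false_pos_uncorrected = expected_false_positives(alpha, n_genes)
share_uncorrected = false_pos_uncorrected / called_positive
false_pos_at_fdr = fdr * called_positive

print("Bonferroni threshold:", corrected)
print("expected false positives at p < 0.05:", false_pos_uncorrected)
print("share of the 2000 calls expected to be false:", share_uncorrected)
print("false positives allowed at FDR 0.05:", false_pos_at_fdr)
```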

Another approach to analysis of high 
dimensional data sets is to use dimensionality 
reduction methods such as feature selection 
or feature extraction. Feature selection entails 
extracting only a subset of the features at 
hand, in this case genes. This may be done in 
a number of ways, based on which genes vary 
the most, or on which genes seem to best pre- 
dict the categorization at hand. In contrast, 


24 » http://en.wikipedia.org/wiki/Bonferroni_correction
(accessed December 1, 2018).
feature extraction creates a new, smaller set of
features that captures the essence of the original
variation. As an example, imagine a plane
flight from Seattle, WA to Key West, FL. One
could use a 2-dimensional vector consisting
of latitude and longitude to describe the
plane’s position at any given point along the
way. In this case, one value would describe
how far the plane had gone in the north/south
direction, and one would indicate how far
the plane had gone in the east/west direction.
However, if we change the axis along which
we are measuring to instead be the direct route
along which the plane is flying, then we only
need 1 dimension to describe where the plane
is located. The distance flown tells us where
the plane is located at any given time. This
approach of changing the axes is the basis for
principal component analysis (PCA), a common
method for feature extraction. Instead of
going from two dimensions to one, PCA on
gene expression data typically goes from tens
of thousands of features to just a few. Both
for feature selection and feature extraction,
it is important to replicate the findings in an
independently generated data set in order to
be sure the model is not overfitting the data
on which it was trained.
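
The flight example can be worked through numerically. The sketch below is a two-dimensional PCA written with the standard library only; it treats longitude and latitude as flat planar coordinates (an assumption that ignores the curvature of the Earth), uses approximate city coordinates, and the function name `pca_2d` is hypothetical.

```python
import math

def pca_2d(points):
    """Find the first principal axis of 2-D points and project onto it.

    Returns the unit axis, the 1-D coordinate of each point along that
    axis, and the fraction of total variance the axis explains.
    """
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the 2 x 2 sample covariance matrix.
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Angle of the eigenvector with the largest eigenvalue (2 x 2 closed form).
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    axis = (math.cos(theta), math.sin(theta))
    # New 1-D feature: position of each centered point along the axis.
    scores = [(p[0] - mean_x) * axis[0] + (p[1] - mean_y) * axis[1] for p in points]
    explained = (sum(s * s for s in scores) / (n - 1)) / (sxx + syy)
    return axis, scores, explained

# Approximate (longitude, latitude) of the two cities.
seattle = (-122.3, 47.6)
key_west = (-81.8, 24.6)
# Eleven positions evenly spaced along the direct route.
route = [
    (seattle[0] + t / 10 * (key_west[0] - seattle[0]),
     seattle[1] + t / 10 * (key_west[1] - seattle[1]))
    for t in range(11)
]

axis, scores, explained = pca_2d(route)
print("fraction of variance on the first axis:", round(explained, 6))
print("position along the route at each step:", [round(s, 1) for s in scores])
```

Because every position lies on one straight line, a single new axis captures essentially all of the variation, and each two-value position is replaced by one number: how far along the route the plane is.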


9.5 Current Application Successes
from Bioinformatics


Biologists have embraced the Internet in a 
remarkable way and have made access to data 
a normal and expected mode for doing busi- 
ness. Hundreds of databases curated by indi- 
vidual biologists create a valuable resource 
for the developers of computational methods 
who can use these data to test and refine their 
analysis algorithms. With standard Internet 
search engines, most biological databases can 
be found and accessed within moments. The 
large number of databases has led to the devel- 
opment of meta-databases that combine infor- 
mation from individual databases to shield the 
user from the complex array that exists. There 
are various approaches to this task. 

The National Center for Biotechnology
Information (NCBI) suite of databases and
software (previously known as ‘Entrez’)
gives integrated access to the biomedical literature,
protein and nucleic acid sequences, macromolecular
and small molecular structures,
and genome project links (including both
the Human Genome Project and sequencing
projects that are attempting to determine
the genome sequences for organisms that are
either human pathogens or important experimental
model organisms) in a manner that
takes advantage of either explicit or computed
links between these data resources.²⁵
Newer technologies are being developed that 
will allow multiple heterogeneous databases 
to be accessed by search engines that can com- 
bine information automatically, thereby pro- 
cessing even more intricate queries requiring 
knowledge from numerous data sources. One 
example is the Bioconductor project, a toolbox
for bioinformatics in the R programming
language.²⁶


9.5.1 Data Sharing 


In 1996, the First International Strategy 
Meeting on Human Genome Sequencing was 
held in Bermuda. In this meeting, a set of prin- 
ciples was agreed upon regarding sharing of 
human genome sequencing data. These prin- 
ciples came to be known as the Bermuda prin- 
ciples. They stipulated that (1) all sequence 
assemblies larger than 1 kb should be released 
as soon as possible, ideally within 24 h; (2) 
finished annotated sequences should be pub- 
lished immediately to public databases; and 
(3) all human sequence data generated in
large-scale sequencing centers should be made
available in the public domain.²⁷

Increasingly, journals and funders require 
that researchers deposit all types of research 
data in publicly available repositories (Fischer 
and Zigmond 2010). In 2009, President 
Obama announced an Open Government 


25 » https://www.ncbi.nlm.nih.gov/search/ (accessed
December 7, 2020).

26 » http://bioconductor.org/ (accessed December 1, 
2018). 

27 » http://www.ornl.gov/sci/techresources/Human_Genome/research/bermuda.shtml
(accessed December 1, 2018).
Directive that included plans to make federally
funded research data available to the
public.²⁸ This announcement describes the
NIH’s policy regarding published manuscripts
in particular, but also notes that the results of
government-funded research can take many
forms, including data sets. Currently the NIH
requires that proposals for funding of over
$500,000 include a data sharing plan.²⁹

To that end, a significant advancement in 
bioinformatics is in making research datasets 
more available and reusable. From the com- 
munity of researchers who are enabling this 
effort the concept of FAIR data has emerged. 
FAIR datasets are Findable, Accessible, 
Interoperable and Reusable. FAIR data 
principles lay out a framework to encourage
increased sharing and use of scientific datasets.
Findable data use global persistent identifiers
and metadata standards. Accessible data are
available on the Internet and searchable through
their metadata. Interoperable data use a “formal,
accessible, shared and broadly applicable language
for knowledge representation”. Finally, reusable
data have clear attribution and a license that
enables reuse. The web portal FAIRsharing
provides curated resources on datasets, standards
and collections that are more FAIR.³⁰
Resources such as bioCADDIE DataMed
enable discovery of datasets through a Data
Discovery Index.³¹


9.5.2 Data Standards, Metadata 
and Biomedical Ontologies 


> Chapter 7 on standards in biomedical 
informatics addresses standardized terminol- 
ogies as well as standards for data exchange, 
and terminologies for translational research 
are discussed in > Chap. 27. The develop- 


28 » http://edocket.access.gpo.gov/2009/E9-29322. 
htm (accessed December 1, 2018). 

29 » http://grants.nih.gov/grants/guide/notice-files/ 
NOT-OD-03-032.html (accessed December 1, 
2018). 

30 » https://fairsharing.org/ (accessed December 1, 
2018). 

31 » https://datamed.org/ (accessed April 20, 2019).
ment of such schemes necessitates the creation
of terminology standards, just as in
clinical informatics. There are now many controlled
vocabularies (or ontologies) and metadata
standards for annotation of genomic
or proteomic data. Metadata standards help
define the information that should be collected
and annotated for various types of datasets.
Furthermore, a great many tools have
been developed to help researchers access and
analyze these data. For example, the previously
mentioned Bioconductor project provides
bioinformatic tools in the R language for
solving common problems. Other commonly
used tools include BioPerl, BioPython and
MATLAB.³²

Biomedical ontologies have become a key 
component in the development of metadata 
standards for the management and exchange 
of bioinformatic datasets and in making data 
more FAIR (see > Sect. 9.5.1). The open bio- 
medical ontologies consortium (OBO) has 
developed a number of reference ontologies 
that are in wide use in bioinformatics including 
Gene Ontology, Human Phenotype Ontology 
and the UBERON anatomy ontology (Smith 
et al. 2007). For example, Gene Ontology 
(GO) is an ontology used for annotation of 
gene function, and arguably the most widely 
used ontology in basic research. Ontologies 
enable indexing, exchange and computing 
with biomedical datasets and metadata. 

Developing metadata standards for bioinformatics
datasets is an intellectual challenge for
researchers who seek to enable the sharing and
interoperability of data and to make data more
FAIR. A number of tools and web portals, such
as the Center for Expanded Data Annotation and
Retrieval (CEDAR), provide tools for creation
and sharing of metadata about datasets.³³
Metadata can include information
about an experiment such as the protocol,
the time the experiment was performed,
who performed the experiment and technology 
used to generate or analyze the experiment, but 


32 » http://www.open-bio.org/ (accessed December 1, 
2018). 

33 » https://metadatacenter.org/ (accessed December
1, 2018).
Fig. 9.4 The NCBI Gene entry for the digestive
enzyme chymotrypsin. Basic information about the
original report is provided, as well as some annotations


can also include information such as organism, 
disease model, tissue, conditions, etc. 


9.5.2.1 Sequence and Genome 
Databases 

The main types of sequence information 
that must be stored are DNA and protein. 
One of the largest DNA sequence databases 
is GENBANK, which is managed by the 
NCBI.²⁵ GENBANK is growing rapidly as
genome-sequencing projects feed their data 
(often in an automated procedure) directly 
into the database. Figure 9.3 shows the
exponential growth of data in GENBANK
since 1982. NCBI Gene curates some of the
many genes within GENBANK and presents
the data in a way that is easy for the researcher
to use (Fig. 9.4).

In addition to GENBANK, there are 
numerous special-purpose DNA databases 
for which the curators have taken special care 
to clean, validate, and annotate the data. The 
work required of such curators indicates the
degree to which raw sequence data must be
interpreted cautiously. GENBANK can be
searched efficiently with a number of algo- 
rithms and is usually the first stop for a scien- 
tist with a new sequence who wonders “Has a 
sequence like this ever been observed before? 
If one has, what is known about it?” There are 
increasing numbers of stories about scientists 
using GENBANK to discover unanticipated 
relationships between DNA sequences, allow- 
ing their research programs to leap ahead 
while taking advantage of information col- 
lected on similar sequences. 

A database that has become very useful
recently is the University of California Santa
Cruz Genome Browser³⁴ (Fig. 9.5). This
resource allows users to search for specific
sequences in the UCSC version of the human
genome. Powered by the similarity search tool
BLAT, it lets users quickly find annotations on
the human genome that contain their sequence
of interest. These annotations include known
variations (mutations and SNPs), genes,
comparative maps with other organisms, and
many other important data.

Fig. 9.5 Screen from the UC Santa Cruz genome
browser showing the chymotrypsin C gene. The rows in
the browser show annotations on the gene sequence.
human chromosome 15, as if the sequence of a, g, c and
t are represented from left to right (5′-3′). The annotations
include gene predictions and annotations as well
as an alignment of the similarity of this region of the
genome when compared with the mouse genome


9.5.3 Structure Databases


Although sequence information is obtained
relatively easily, structural information
remains expensive on a per-entry basis. The
experimental protocols used to determine
precise molecular structural coordinates
are expensive in time, materials, and human
power. Therefore, we have only a small number
of structures for all the molecules characterized
in the sequence databases. The two
main sources of structural information are the
Cambridge Structural Database³⁵ for small
molecules (usually fewer than 100 atoms) and
the PDB³⁶ for macromolecules (see > Sect.
9.3.2), including proteins and nucleic acids, 
and combinations of these macromolecules 
with small molecules (such as drugs, cofac- 
tors, and vitamins). The PDB has approxi- 
mately 75,000 high-resolution structures, but 
this number is misleading because many of 
them are small variants on the same struc- 
tural architecture. There are approximately 
100,000 proteins in humans; therefore, many 
structures remain unsolved (e.g., Burley and 
Bonanno 2002). In the PDB, each structure is 
reported with its biological source, reference 
information, manual annotations of interest- 
ing features, and the Cartesian coordinates of 
each atom within the molecule. Given knowl- 
edge of the three-dimensional structure of 


36 » https://www.rcsb.org/ (accessed December 18,
molecules, the function sometimes becomes
clear. For example, the ways in which the med- 
ication methotrexate interacts with its biologi- 
cal target have been studied in detail for two 
decades. Methotrexate is used to treat cancer 
and rheumatologic diseases, and it is an inhib- 
itor of the protein dihydrofolate reductase, an 
important molecule for cellular reproduction. 
The three-dimensional structure of dihydro- 
folate reductase has been known for many 
years and has thus allowed detailed studies 
of the ways in which small molecules, such as 
methotrexate, interact at an atomic level. As 
the PDB increases in size, it becomes impor- 
tant to have organizing principles for thinking 
about biological structure. SCOP2³⁷ provides
a classification based on the overall structural
features of proteins. It is a useful method for 
accessing the entries of the PDB. 


9.5.4 Analysis of Biological 
Pathways and Understanding 
of Disease Processes 


The ECOCYC project is an example of a com- 
putational resource that has comprehensive 
information about biochemical pathways. 
ECOCYC is a knowledge base of the metabolic
capabilities of E. coli; it has a representation
of all the enzymes in the E. coli
genome and of the chemical compounds
those enzymes transform.³⁸ It also links these
enzymes to their genes, and genes are mapped 
to the genome sequence. 

EcoCyc also encodes the genetic regula- 
tory network of E. coli, describing all protein 
and RNA regulators of E. coli genes. The 
network of pathways within ECOCYC pro- 
vides an excellent substrate on which useful 
applications can be built. For example, they 
provide: (1) the ability to guess the function 
of a new protein by assessing its similarity to 
E. coli genes with a similar sequence, (2) the 
ability to ask what the effect on an organism 
would be if a critical component of a path- 


37 » http://scop2.mrc-lmb.cam.ac.uk/ (accessed Dece- 
mber 15, 2018). 
38 » http://ecocyc.org/ (accessed December 15, 2018). 


way were removed (would other pathways be 
used to create the desired function, or would 
the organism lose a vital function and die?), 
and (3) the ability to provide a rich user inter- 
face to the literature on E. coli metabolism. 
Similarly, the Kyoto Encyclopedia of Genes
and Genomes (KEGG) provides pathway
datasets for organism genomes.³⁹


9.5.5 Integrative Databases 


An integrative database is a postgenomic database
that bridges the gap between molecular
biological databases and those of clinical
importance. One excellent example of a
postgenomic database is the Online Mendelian
Inheritance in Man (OMIM) database, which
is a compilation of known human genes and
genetic diseases, along with manual annotations
describing the state of our understanding
of individual genetic disorders.⁴⁰ Each entry
contains links to special-purpose databases and
thus provides links between clinical syndromes
and basic molecular mechanisms (Fig. 9.6).


9.6 Future Challenges
as Bioinformatics and Clinical
Informatics Converge


Bioinformatics didn’t solve all of its problems 
with the sequencing of the human genome. 
There is a series of challenges for which 
the completion of the first human genome 
sequence is only the beginning. 


9.6.1 Linkage of Molecular 
Information with Symptoms, 
Signs, and Patients 


There is currently a gap in our understanding
of disease processes. Although we have
a good understanding of the principles by
which small groups of molecules interact, we
are not able to explain fully how thousands of
molecules interact within a cell to create both
normal and abnormal physiological states. As
the databases continue to accumulate information
ranging from patient-specific data to
fundamental genetic information, a major
challenge is creating the conceptual links
among these databases to create an audit trail
from molecular-level information to macroscopic
phenomena, as manifested in disease.
The availability of these links will facilitate
the identification of important targets for
future research and will provide a scaffold for
biomedical knowledge, ensuring that important
literature is not lost within the increasing
volume of published data.

Fig. 9.6 Screen from the Online Mendelian Inheritance
in Man (OMIM) database showing an entry for
pancreatic insufficiency, an autosomal recessive disease
in which chymotrypsin (NCBI Gene entry shown in
Fig. 9.4) is totally absent (as are some other key
digestive enzymes). (Courtesy of NCBI)


9.6.2 Computational
Representations
of the Biomedical Literature


An important opportunity within bioinformatics
is the linkage of biological experimental
data with the published papers that report
them. Electronic publication of the biological
literature provides exciting opportunities
for making data easily available to scientists.
Already, certain types of simple data that 
are produced in large volumes are expected 
to be included in manuscripts submitted for 
publication, including new sequences that are 
required to be deposited in GENBANK and 
new structure coordinates that are deposited 
in the PDB. However, there are many other 
experimental data sources that are currently 
difficult to provide in a standardized way, 
either because the data are more intricate than 
those stored in GENBANK or PDB or they 
are not produced in a volume sufficient to fill a 
database devoted entirely to the relevant area. 
Knowledge base technology can be used, 
however, to represent multiple types of highly 
interrelated data. 

Knowledge bases can be defined in many
ways (see > Chap. 24); for our purposes, we
can think of them as databases in which (1)
the ratio of the number of tables to the number
of entries per table is high compared with
usual databases, (2) the individual entries (or
records) have unique names, and (3) the values
of many fields for one record in the database
are the names of other records, thus creating
a highly interlinked network of concepts. The
structure of knowledge bases often leads to 
unique strategies for storage and retrieval of 
their content. To build a knowledge base for 
storing information from biological experi- 
ments, there are some requirements. First, 
the set of experiments to be modeled must 
be defined. Second, the key attributes of each 
experiment that should be recorded in the 
knowledge base must be specified. Third, the 
set of legal values for each attribute must be 
specified, usually by creating a controlled ter- 
minology for basic data or by specifying the 
types of knowledge-based entries that can 
serve as values within the knowledge base. 
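
A toy version of such a knowledge base is sketched below. Every record name, field name, and allowed value is hypothetical and chosen only for illustration; the sketch shows records with unique names, fields whose values are the names of other records, and a check that each attribute holds a legal value.

```python
# Controlled terminology for one attribute (hypothetical list).
ALLOWED_ORGANISMS = {"Homo sapiens", "Mus musculus", "Escherichia coli"}

# Fields whose values must be the names of other records.
LINK_FIELDS = ("encodes", "measured_in", "subject")

# Each key is a unique record name; link fields point at other records.
knowledge_base = {
    "gene:CTRC": {
        "organism": "Homo sapiens",
        "encodes": "protein:chymotrypsin_C",
    },
    "protein:chymotrypsin_C": {
        "organism": "Homo sapiens",
        "measured_in": "experiment:assay_001",
    },
    "experiment:assay_001": {
        "organism": "Homo sapiens",
        "subject": "gene:CTRC",
    },
}

def validate(kb):
    """Return a list of problems: illegal attribute values or dangling links."""
    problems = []
    for name, record in kb.items():
        if record.get("organism") not in ALLOWED_ORGANISMS:
            problems.append(f"{name}: organism is not in the controlled terminology")
        for field in LINK_FIELDS:
            if field in record and record[field] not in kb:
                problems.append(f"{name}: {field} points to unknown record {record[field]}")
    return problems

print(validate(knowledge_base))
```

Following the link fields from one record to the next traces the interlinked network of concepts described above; the validation step corresponds to the third requirement, that the set of legal values for each attribute be specified.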


9.6.3 Computational Challenges 
with an Increasing Deluge 
of Biomedical Data 


An increasing challenge in biomedicine is storing,
interpreting and integrating the massive
number of datasets the biomedical community
is generating, largely from modern technologies
in high-throughput experimentation.
The growth of DNA sequence data over time has
outpaced Moore’s Law, for example. This issue
is important for all areas of biomedical
informatics, and is discussed in more detail in
the chapter on Translational Bioinformatics
(> Chap. 26).


9.7 Conclusion 

Bioinformatics is closely allied to transla- 
tional and clinical informatics. It differs in its 
emphasis on a reductionist view of biologi- 
cal systems, starting with sequence informa- 
tion and moving to structural and functional 
information. The emergence of the genome 
sequencing projects and the new technolo- 
gies for measuring metabolic processes within 
cells is beginning to allow bioinformaticians 
to construct a more synthetic view of bio- 
logical processes, which will complement the 
whole-organism, top-down approach of clini- 
cal informatics. More importantly, there are 
technologies that can be shared between bio- 


informatics and clinical informatics because 
they both focus on representing, storing, 
and analyzing biological or biomedical data. 
These technologies include the creation and 
management of standard terminologies and 
data representations, the integration of het- 
erogeneous databases, the organization and 
searching of the biomedical literature, the use 
of machine learning techniques to extract new 
knowledge, the simulation of biological pro- 
cesses, and the creation of knowledge-based 
systems to support advanced practitioners in 
the two fields. 


Suggested Readings

Altman, R. B., Dunker, A. K., Hunter, L., & Klein, 
T. E. (2003). Pacific symposium on 
Biocomputing’03. Singapore: World Scientific 
Publishing. The proceedings of one of the prin- 
cipal meetings in bioinformatics, this is an 
excellent source for up-to-date research reports. 
Other important meetings include those sponsored
by the International Society for
Computational Biology (ISCB, http://www.iscb.org/),
Intelligent Systems for Molecular
Biology (ISMB, http://iscb.org/conferences.shtml),
and the RECOMB meetings on
computational biology (http://www.ctw-congress.de/recomb/).
ISMB and PSB have
their proceedings indexed in PubMed.

Baldi, P., & Brunak, S. (2001). Bioinformatics: 
The machine learning approach. Cambridge, 
MA: MIT Press. This introduction to the field 
of bioinformatics focuses on the use of statis- 
tical and artificial intelligence techniques in 
machine learning. 

Baldi, P., & Hatfield, G. W. (2002). DNA microar- 
rays and gene expression. Cambridge: 
Cambridge University Press. Introduces the 
different microarray technologies and how 
they are analyzed. 

Berg, J. M., Tymoczko, J. L., & Stryer, L. (2010). 
Biochemistry. New York: W.H. Freeman. The 
textbook by Stryer and colleagues is well writ- 
ten, and is illustrated and updated on a regu- 
lar basis. It provides an excellent introduction 
to basic molecular biology and biochemistry. 

Durbin, R., Eddy, S. R., Krogh, A., & Mitchison, 
G. (1998). Biological sequence analysis: 
Probabilistic models of proteins and nucleic 
acids. Cambridge: Cambridge University Press.
This edited volume provides an excellent intro-
duction to the use of probabilistic representa- 
tions of sequences for the purposes of 
alignment, multiple alignment, and analysis. 

Gusfield, D. (1997). Algorithms on strings, trees 
and sequences: Computer science and compu- 
tational biology. Cambridge: Cambridge 
University Press. Gusfield’s text provides an 
excellent introduction to the algorithmics of 
sequence and string analysis, with special atten- 
tion paid to biological sequence analysis prob- 
lems. 

Malcolm, S., & Goodship, J. (Eds.). (2007). 
Genotype to phenotype (2nd ed.). Oxford: 
BIOS Scientific Publishers. This volume illus- 
trates the different efforts to understand how 
diseases are linked to genes. 

Pevsner, J. (2009). Bioinformatics and functional
genomics. Hoboken: Wiley. A widely used 
excellent introduction to bioinformatics algo- 
rithms. 


Questions for Discussion

1. How are DNA and protein sequence 
information changing the way that med- 
ical records are managed? Which types 
of systems are or will be most affected 
(laboratory, radiology, admission and 
discharge, financial, order entry)? 

2. It has been postulated that clinical infor- 
matics and bioinformatics are working 
on the same problems, but in some areas 
one field has made more progress than 
the other. Identify three common themes. 
Describe how the issues are approached 
by each subdiscipline. 

3. Why should an awareness of bioinfor- 
matics be expected of clinical informatics 
professionals? Should a chapter on bioin- 
formatics appear in a clinical informatics 
textbook? Explain your answers. 

4. Why should an awareness of clinical 
informatics be expected of bioinformat- 
ics professionals? Should a chapter on 
clinical informatics appear in a bioinfor- 
matics textbook? Explain your answers. 

5. One major problem with introducing 
computers into clinical medicine is the 
extreme time and resource pressure 
placed on physicians and other health 
care workers. Do you think that the same
problems are arising in basic biomedical
research? 

6. Why have biologists and bioinformati- 
cians embraced the Web as a vehicle for 
disseminating data so quickly, whereas 
clinicians and clinical informaticians 
have been more hesitant to put their pri- 
mary data online? 

7. If a patient’s entire genome were present 
in their medical record how would one go 
about interpreting it clinically? Similarly, 
if we had an entire electronic health 
record database that included human 
genomes, how would a researcher go 
about finding new or novel genetic asso- 
ciations? 

8. Given the many high-throughput
experiments used in biomedical research,
what are some ways to integrate
those datasets using systems biology?
For example, if you had a microarray 
dataset that annotated gene expression 
levels and a proteomics dataset that 
identified protein interactions, how 
could you jointly use both datasets to 
identify markers for a disease? 

Learning Objectives
After reading this chapter, you should know
the answers to these questions:

1. What makes images a challenging type
of data to be processed by computers
when compared to non-image clinical
data?

2. Why are there many different imaging
modalities, and by what two major
characteristics do they differ?

3. How are visual and knowledge content
in images represented computationally?
How are these techniques similar to
representation of non-image biomedical
data?

4. What sort of applications can be
developed to make use of the semantic
image content made accessible using
the Annotation and Image Markup
model?

5. What are four different types of image
processing methods? Why are such
methods assembled into a pipeline
when creating imaging applications?

6. What is an imaging modality with high
spatial resolution? What is a modality
that provides functional information?
Why are most imaging modalities not
capable of providing both?

7. What is the goal in performing
segmentation in image analysis? Why is there
more than one segmentation method?

8. What are the main segmentation methods
and what are their limitations?
Should deep learning always be the
first choice, given its relatively high
performance?

9. What are two types of quantitative
information in images? What are two
types of semantic information in images?
How might this information be
used in medical applications?

10. What is the difference between image
registration and image fusion? What
are examples of each?

11. Can medical image analysis methods
replace physicians who interpret images,
or should their role be to serve as
adjunct tools that assist with image
interpretation?


10.1 Introduction 


Imaging plays a central role in the healthcare
process. The field is crucial not only to health
care, but also to medical communication and
education, as well as to research. In fact, much
of our recent progress, particularly in diagno- 
sis, can be traced to the availability of increas- 
ingly sophisticated imaging techniques that 
not only show the structure of the body in 
incredible detail, but also show the function 
of the tissues within the body. 

Although there are many types (or modali- 
ties) of imaging equipment, the images the 
modalities produce are nearly always acquired 
in or converted to digital form. The evolution 
of imaging from analog, film-based acqui- 
sition to digital format has been driven by 
the necessities of cost reduction, efficient 
throughput, and workflow in managing and 
viewing an increasing proliferation in the 
number of images produced per imaging pro- 
cedure (currently hundreds or even thousands 
of images). At the same time, having images in 
digital format makes them amenable to image 
processing methodologies for enhancement, 
analysis, display, storage, and even enhanced 
interpretation. 

Because of the ubiquity of images in bio- 
medicine, the increasing availability of images 
in digital form, the rise of high-powered 
computer hardware and networks, and the 
commonality of image processing solutions, 
digital images have become a core data type 
that must be considered in many biomedi- 
cal informatics applications. Therefore, this 
chapter is devoted to a basic understanding 
of the unique aspects of images as a core data 
type and the unique aspects of imaging from 
an informatics perspective. Chapter 22, on
the other hand, describes the use of images 
and image processing in various applications, 
particularly those in radiology since that 
field places the greatest demands on imaging 
methods. 

The topics covered by this chapter and
Chap. 22 comprise the growing discipline of
biomedical imaging informatics (Kulikowski
1997), a subfield of biomedical informatics
(see Chap. 1) that has arisen in recognition
of the common issues that pertain to all image
modalities and applications once the images
are converted to digital form.

Fig. 10.1 The major topics in biomedical imaging
informatics follow a workflow of activities and tasks
commencing with image acquisition, followed
by image content representation, management/storage
of images, image processing, and image interpretation/
computer reasoning

Biomedical imaging informatics is a
dynamic field, recently evolving from a primary
focus on image processing to broader
informatics topics such as representing and
processing the semantic contents (Rubin and
Napel 2010) and integrating image data with
other types of data (Scheckenbach et al. 2017;
Pujara et al. 2018; Valdora et al. 2018; Weaver
and Leung 2018). At the same time, imaging
informatics shares common methodologies
and challenges with other domains in
biomedical informatics. By trying to understand
these common issues, we can develop general
solutions that can be applied to all images,
regardless of the source.

The major topics in biomedical imaging
informatics include image acquisition, image
content representation, management/storage
of images, image processing, and image
interpretation/computer reasoning (Fig. 10.1).
Image acquisition is the process of generating
images from the modality and converting
them to digital form if they are not
intrinsically digital. Image content representation
makes the information in images accessible
to machines for processing. Image
management/storage includes methods for storing,
transmitting, displaying, retrieving, and
organizing images. Image processing comprises
methods to enhance, segment, visualize, fuse,
or analyze the images. Image interpretation/
computer reasoning is the process by which
the individual viewing the image renders an
impression of the medical significance of the
results of the imaging study, potentially aided
by computer methods. Chapter 22 is primarily
concerned with information systems for
image management and storage, whereas this
chapter concentrates on the other core topics
in biomedical imaging informatics.

An important concept when thinking
about imaging from an informatics perspective
is that images are an unstructured data
type; though they are readily understood and
interpreted by knowledgeable human experts,
their contents are not readily machine-understandable
except at the granular pixel level. As
such, while machines can readily manage the
raw image data in terms of storage/retrieval,
they cannot easily access image contents
(recognize the type of image, annotations made
on the image, or anatomy or abnormalities
within the image), except through newer deep
learning methods (Sect. 10.4.5). In this regard,
biomedical imaging informatics shares much
in common with natural language processing
(NLP; Chap. 8). In fact, as the methods of
computationally representing and processing
images are presented in this chapter, parallels
to NLP should be considered, since there is
synergy from an informatics perspective.

As in NLP, a major purpose of the methods
of imaging informatics is to extract particular
information; in biomedical informatics
the goal is often to extract information about
the structure of the body and to collect features
that will be useful for characterizing
abnormalities based on morphological
alterations. In fact, imaging provides detailed and
diverse information that is very useful for
characterizing disease, providing an “imaging
phenotype,” since “a picture is worth a thousand
words.”¹ The informatics methods for capturing
imaging phenotypes complement the informatics
methods that are now being applied to
electronic medical records data to capture
“electronic phenotypes” of diseased patients.
However, to overcome the challenges posed
by the unstructured image data type, recent
work is applying semantic methods from
biomedical informatics to images to make
their content explicit for machine processing
(Rubin and Napel 2010), as well as processing
entire images to learn certain semantic
image content (Hosny et al. 2018; Yamashita
et al. 2018). Many of the topics in this chapter
therefore involve how to represent, extract, and
characterize the information that is present in
images, such as anatomy and abnormalities.
Once that task is completed, useful applications
that process the image contents can be
developed, such as image search and decision
support to assist with image interpretation.

1 Frederick Barnard, “One look is worth a thousand
words,” Printers’ Ink, December, 1921.

While we seek generality in discussing bio- 
medical imaging informatics, many examples 
in this chapter are taken from a few selected 
domains such as brain imaging, which is 
part of the growing field of neuroinformat- 
ics (Koslow and Huerta 1997). Though our 
examples are specific, we attempt to describe 
the topics in generic terms so that the reader 
can recognize parallels to other imaging 
domains and applications. 


10.2 Image Acquisition 


In general, there are two different strategies 
in imaging the body: (1) delineate anatomic 
structure (anatomic/structural imaging), and 
(2) determine tissue composition or function 
(functional imaging) (Fig. 10.2). In reality,
one does not choose between anatomic
and functional imaging; many modalities pro- 
vide information about both morphology and 
function. However, in general, each imaging 
modality is characterized primarily as being 
able to render high-resolution images with 
good contrast resolution (anatomic imaging) 
or to render images that depict tissue function 
(functional imaging). 


10.2.1 Anatomic (Structural) Imaging


Imaging the structure of the body has been
and continues to be the major application of
medical imaging, although, as described in
Sect. 10.2.2, functional imaging is a very
active area of research. The goal of anatomic
imaging is to accurately depict the structure
of the body (the size and shape of organs)
and to visualize abnormalities clearly. Since
the goal in anatomic imaging is to depict and
understand the structure of anatomic entities
accurately, high spatial resolution is an important
requirement of the imaging method
(Fig. 10.2). Conversely, in anatomic imaging,
recognizing tissue function (e.g., tissue
ischemia, neoplasm, inflammation, etc.) is not
the goal, though this is crucial to functional
imaging and to patient diagnosis. In most
cases, imaging will be done using a combination
of methods or modalities to derive both
structural/anatomic information and
functional information.

Fig. 10.2 The various radiology imaging methods
differ according to two major axes of information in
images: spatial resolution (anatomic detail) and
functional information depicted (which represents the tissue
composition, e.g., normal or abnormal). A sample of
the more common imaging modalities is shown


10.2.2 Functional Imaging


Many imaging techniques not only show the
structure of the body, but also the function,
where for imaging purposes function can be
inferred by observing changes of structure
over time. In recent years this ability to image
function has greatly accelerated. For example,
ultrasound and angiography are widely used
to show the functioning of the heart by depicting
wall motion, and Doppler ultrasound
can image both normal and disturbed blood
flow (Mehta et al. 2000). Molecular imaging
(Sect. 10.2.3) is increasingly able to depict
the expression of particular genes superimposed
on structural images, and thus can also
be seen as a form of functional imaging.

A particularly important application of 
functional imaging is for understanding the 
cognitive activity in the brain. It is now rou- 
tinely possible to put a normal subject in a 
scanner, to give the person a cognitive task, 
such as counting or object recognition, and 
to observe which parts of the brain light 
up. This unprecedented ability to observe 
the functioning of the living brain opens up 
entirely new avenues for exploring how the 
brain works. 

Functional brain imaging modalities can 
be classified as image-based or non-image 
based. In both cases it is taken as axiomatic 
that the functional data must be mapped to the 
individual subject’s anatomy, where the anat- 
omy is extracted from structural images using 
techniques described in the previous sections. 
Once mapped to anatomy, the functional data
can be integrated with other functional data 
from the same subject, and with functional 
data from other subjects whose anatomy has 
been related to a template or probabilistic 
atlas. Techniques for generating, mapping and 
integrating functional data are part of the 
field of Functional Brain Mapping, which has 
become very active in the past few years, with 
several conferences (Organization for Human 
Brain Mapping 2001) and journals (Fox 2001; 
Toga et al. 2001) devoted to the subject. 


Image-Based Functional Brain Imaging

Image-based functional data generally come 
from scanners that generate relatively low- 
resolution volume arrays depicting spatially- 
localized activation. For example, positron 
emission tomography (PET) (Heiss and Phelps 
1983; Aine 1995; Alberini et al. 2011) and 
magnetic resonance spectroscopy (MRS) 
(Ross and Bluml 2001) reveal the uptake of 
various metabolic products by the function- 
ing brain; and functional magnetic resonance 
imaging (fMRI) reveals changes in blood oxy- 
genation that occur following neural activity 
(Aine 1995). The raw intensity values gener- 
ated by these techniques must be processed 
by sophisticated statistical algorithms to sort 
out how much of the observed intensity is due 
to cognitive activity and how much is due to 
background noise. 

As an example, one approach to fMRI 
imaging is language mapping (Corina et al. 
2000). The subject is placed in the magnetic 
resonance imaging (MRI) scanner and told to 
silently name objects shown at 3-second inter- 
vals on a head-mounted display. The actual 
objects (“on” state) are alternated with non- 
sense objects (“off” state), and the fMRI sig- 
nal is measured during both the on and the off 
states. Essentially the voxel values at the off 
(or control) state are subtracted from those at 
the on state. The difference values are tested 
for significant difference from non-activated 
areas, then expressed as t-values. The voxel 
array of t-values can be displayed as an image. 
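
The on/off comparison just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis used in the cited study: it treats each voxel independently, uses synthetic data, and computes a Welch two-sample t-statistic between the "on" and "off" scans. The function name `voxelwise_t_map` and the array shapes are hypothetical choices made for this example.

```python
import numpy as np

def voxelwise_t_map(on_scans, off_scans):
    """Welch t-statistic per voxel for 'on' versus 'off' scans.

    on_scans, off_scans: arrays shaped (n_scans, x, y, z).
    Returns an (x, y, z) array of t-values.
    """
    on = np.asarray(on_scans, dtype=float)
    off = np.asarray(off_scans, dtype=float)
    # Subtract the mean control ("off") volume from the mean task ("on") volume.
    diff = on.mean(axis=0) - off.mean(axis=0)
    # Standard error of that difference, estimated separately at each voxel.
    se = np.sqrt(on.var(axis=0, ddof=1) / on.shape[0]
                 + off.var(axis=0, ddof=1) / off.shape[0])
    return diff / se

# Synthetic data (assumption): 20 scans per state on a tiny 4 x 4 x 2 grid.
rng = np.random.default_rng(0)
shape = (4, 4, 2)
off = rng.normal(100.0, 1.0, size=(20, *shape))
on = rng.normal(100.0, 1.0, size=(20, *shape))
on[:, 1, 2, 0] += 5.0  # simulate a single activated voxel

t_map = voxelwise_t_map(on, off)
peak = np.unravel_index(np.argmax(t_map), t_map.shape)
print("largest t-value at voxel", peak)
```

The resulting `t_map` has the same spatial shape as one scan, which is why it can be displayed as an image. Packages such as those named later in this section apply considerably more elaborate statistical models.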

A large number of alternative methods
have been and are being developed for acquiring
and analyzing functional data (Frackowiak
et al. 1997). The output of most of these
techniques is a low-resolution 3-D image volume
in which each voxel value is a measure of the
amount of activation for a given task. The
low-resolution volume is then mapped to
anatomy guided by a high-resolution structural
MR dataset, using one of the registration
techniques described in Sect. 10.4.7.

Many of these and other techniques are 
implemented in the SPM program (Friston 
et al. 1995), the AFNI program (Cox 1996), 
the Lyngby toolkit (Hansen et al. 1999), and 
several commercial programs such as Medex 
(Sensor Systems Inc. 2001) and Brain Voyager 
(Brain Innovation B.V. 2001). The FisWidgets 
project at the University of Pittsburgh is 
developing an approach that allows custom- 
ized creation of graphical user interfaces in 
an integrated desktop environment (Cohen 
2001). A similar effort (VoxBo) is underway
at the University of Pennsylvania (Kimborg 
and Aguirre 2002). 

The ultimate goal of functional neuroim- 
aging is to observe the actual electrical activ- 
ity of the neurons as they perform various 
cognitive tasks. fMRI, MRS and PET do not 
directly record electrical activity. Rather, they 
record the results of electrical activity, such 
as (in the case of fMRI) the oxygenation of 
blood supplying the active neurons. Thus, 
there is a delay from the time of activity to the 
measured response. In other words, these
techniques have relatively poor temporal resolution
(Sect. 10.2.4). Electroencephalography
(EEG) and magnetoencephalography (MEG),
on the other hand, are more direct measures
of electrical activity since they measure the
electromagnetic fields generated by the electrical
activity of the neurons. Current EEG and
MEG methods involve the use of large arrays
of scalp sensors, the output of which is
processed in a similar way to CT in order to
localize the source of the electrical activity inside
the brain. In general this “source localization 
problem” is under-constrained, so informa- 
tion about brain anatomy obtained from MRI 
is used to provide further constraints (George 
et al. 1995). 
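
Why the source localization problem is under-constrained, and how anatomical information helps, can be illustrated with a toy linear model. This is a sketch under strong assumptions rather than a real EEG/MEG method: the sensor readings are modeled as a random matrix (here called `lead_field`, a hypothetical name) times the source strengths, and the anatomical constraint is represented simply as a short list of candidate source locations.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sensors, n_sources = 8, 50  # far more unknown sources than measurements

# Hypothetical sensor-by-source gain matrix (random numbers, for illustration only).
lead_field = rng.normal(size=(n_sensors, n_sources))

true_sources = np.zeros(n_sources)
true_sources[17] = 1.0  # one active source
readings = lead_field @ true_sources

# Unconstrained: the minimum-norm solution explains the readings perfectly,
# yet it is only one of infinitely many solutions and is not the true one.
estimate = np.linalg.pinv(lead_field) @ readings
residual = np.linalg.norm(lead_field @ estimate - readings)

# Constrained: suppose anatomy restricts activity to five candidate locations.
candidates = [3, 17, 24, 31, 42]
coef, *_ = np.linalg.lstsq(lead_field[:, candidates], readings, rcond=None)
constrained = np.zeros(n_sources)
constrained[candidates] = coef

print("unconstrained fit residual:", residual)
print("unconstrained recovers truth:", np.allclose(estimate, true_sources))
print("constrained recovers truth:", np.allclose(constrained, true_sources))
```

With fewer unknowns than sensors, the constrained system has a unique least-squares solution, which is the role the MRI-derived anatomy plays in the text above.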

10.2.3 Imaging Modalities


There are many different approaches that 
have been developed to acquire images of the 
body. The proliferation of imaging modalities
reflects the fact that there is no single imaging 
technique that satisfies all the desiderata for 
depicting the broad variety of types of pathol- 
ogy. Some abnormalities are better seen on 
some modalities than on others. The primary 
difference among the imaging modalities is 
the type of energy source used to generate 
the images. In radiology, nearly every type of 
energy in the electromagnetic spectrum has 
been used, in addition to other physical phe- 
nomena such as sound and heat. We describe 
the more common methods according to the 
type of energy used to create the image. 


Light

The earliest medical images used visible 
light to create photographs, either of gross 
anatomic structures and skin lesions or, if a 
microscope was used, of histological speci- 
mens. Light is still an important source for 
creation of images, and in fact optical imaging 
has seen a resurgence of interest and appli- 
cation for areas such as molecular imaging 
(Weissleder and Mahmood 2001; Ray 2011) 
and imaging of brain activity on the exposed 
surface of the cerebral cortex (Pouratian et al. 
2003). Visible light is the basis for dermato- 
logical imaging (Katragadda, Finnane et al. 
2016), retinal imaging (Panwar et al. 2016), 
and a newer modality called “optical imag- 
ing” that has promising applications such as 
cancer imaging (Solomon et al. 2011). Visible 
light, however, does not allow us to see more 
than a short distance beneath the surface of 
the body; thus other modalities are used for 
imaging structures deep inside the body. 


Radiography

X-rays were discovered in 1895 by Wilhelm
Conrad Roentgen, who was awarded the 1901
Nobel Prize in Physics for this achievement.
The discovery caused worldwide excitement,
especially in the field of medicine; by 1900,
there already were several medical radiological
societies. Thus, the foundation was laid for
a new branch of medicine devoted to imaging
the structure and function of the body (Kevles
1997).

Fig. 10.3 A radiograph of the chest (chest X-ray)
taken in the frontal projection. The image is shown as if
the patient is facing the viewer. This patient has abnormal
density in the left lower lobe

Radiography (colloquially called “X-ray”)
is still the primary modality used in radiology
departments today, both to record a static
image (Fig. 10.3) and to produce a
real-time view of the patient (fluoroscopy)
or a movie (cine). Both film and fluoroscopic
analog screens were used initially for recording
radiology images, but the fluoroscopic
images were very faint and required dark
adaptation (radiologists wore red goggles during the
daytime to maximally sensitize their vision).
By the 1940s, however, television and image- 
intensifier technologies were developed to 
produce clear real-time fluorescent images. 
Fluoroscopic examinations commonly com- 
bine real-time video monitoring of fluoro- 
scopic images with the creation of selected 
higher resolution images. 

Radiography is a projection technique;
an X-ray beam (one form of ionizing radiation)
is projected from an X-ray source
through a patient’s body (or other object)
onto an X-ray array detector (a specially
coated cassette that is scanned by a computer
to capture the image in digital form), or film
(to produce a non-digital image). Because
an X-ray beam is differentially absorbed by
the various body tissues based on the thickness
and atomic number of the tissues, the
X-rays produce varying degrees of brightness
and darkness on the radiographic image. The
differential amounts of brightness and darkness
on the image are referred to as “image
contrast”; differential contrast among structures
on the image is the basis for recognizing
anatomic structures. Since the image in
radiography is a projection, radiographs show a
superposition of all the structures traversed
by the X-ray beam. Much of the art (and
difficulty) in interpretation of radiographs
is understanding the imaging patterns that
result from these superimposed structures
and how to differentiate pathologies from
normal structures or artifacts.
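
Differential absorption and superposition can be made concrete with the standard exponential attenuation (Beer-Lambert) model, which is not spelled out in the text above and is used here as a simplifying assumption for a monoenergetic beam. The attenuation coefficients below are illustrative round numbers, not measured tissue values, and the function name is hypothetical.

```python
import numpy as np

def transmitted_fraction(mu_per_cm, thickness_cm):
    """Fraction of a monoenergetic beam that passes through stacked layers.

    Each layer attenuates the beam by exp(-mu * thickness); the layers
    multiply, so only the summed product along the ray matters.
    """
    mu = np.asarray(mu_per_cm, dtype=float)
    t = np.asarray(thickness_cm, dtype=float)
    return float(np.exp(-np.sum(mu * t)))

# Illustrative (assumed) linear attenuation coefficients, in 1/cm.
MU_SOFT, MU_BONE = 0.2, 0.5

soft_only = transmitted_fraction([MU_SOFT], [10.0])
with_bone = transmitted_fraction([MU_SOFT, MU_BONE], [8.0, 2.0])
reordered = transmitted_fraction([MU_BONE, MU_SOFT], [2.0, 8.0])

print("10 cm soft tissue:        ", round(soft_only, 4))
print("8 cm soft + 2 cm bone:    ", round(with_bone, 4))
print("same layers, other order: ", round(reordered, 4))
```

The ray that crosses bone reaches the detector with less intensity, which is the source of image contrast. The last two values are identical because a projection records only the total attenuation along the ray, not the depth at which each structure lies; this is the superposition that makes radiographs hard to interpret.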

Radiographic images have very high spatial 
resolution because a high photon flux is used 
to produce the images, and a high resolution 
detector (film or digital image array) that cap- 
tures many line pairs per unit area is used. On 
the other hand, since the contrast in images is
due to differences in tissue density and atomic
number, the amount of functional information
that can be derived from radiographic
images is limited (Fig. 10.2). Radiography
is also limited by its relatively poor contrast
resolution (compared with other modalities such
as computed tomography (CT) or magnetic
resonance imaging (MRI), described below),
its use of ionizing radiation, the challenge
of spatial localization due to projection
ambiguity, and its limited ability to depict
physiological function. As described below, newer
imaging modalities have been developed to 
increase contrast resolution, to eliminate the 
need for X-rays, and to improve spatial local- 
ization. A benefit of radiographic images is 
that they can be generated in real time (fluo- 
roscopy) and can be produced using portable 
devices. 

Digital radiography (DR) is an imaging
technique that creates digital radiographs
directly from the imaging procedure. In
computed radiography (CR), a storage
phosphor replaces film: a reusable phosphor
plate is substituted in a standard film
cassette. The exposed plate is processed by a
reader system that scans the image into digital
form, erases the plate, and packages the
cassette for reuse. An important advantage of
CR systems is that the cassettes are of standard
size, so they can be used in any equipment
that holds film-based cassettes (Hori
1996). More recently, digital radiography
uses charge-coupled device (CCD) arrays to
capture the image directly. Currently, nearly
all radiology departments no longer acquire 
radiographic images on film (analog images) 
and instead use digital radiography (Korner 
et al. 2007) to acquire digital images. This 
evolution was driven by the cost of film and 
technological advances in digital image acqui- 
sition detectors and monitors whose resolu- 
tion approached that of film. At the same 
time, digitization of radiology drove the evo- 
lution of methods of imaging informatics we 
describe in this chapter. 

Computed Tomography (CT) is an important
imaging method that uses X-rays to produce
cross-sectional and volumetric images of
the body (Lee 2006). Similar to radiography,
X-rays are projected through the body onto
an array of detectors; however, the beam and
detectors rotate around the patient, making
numerous views at different angles of rotation.
Using computer reconstruction algorithms,
an estimate of absolute density at each
point (volume element or voxel) in the body is
computed. Thus, the CT image is a computed
image (Fig. 10.4); CT did not become practical
for generating high quality images until
the advent of powerful computers and the
development of computer-based reconstruction
techniques, which represent one of the most
spectacular applications of computers in all
of medicine (Buxton 2009). The spatial resolution
of images is not as high in CT as it is in
radiography, due to the computed nature of
the images; however, the contrast resolution
and the ability to derive functional information
about tissues in the body are superior in CT
to those of radiography (Fig. 10.2).

Fig. 10.4 A CT image of the upper chest. CT images
are slices of a body plane; in this case, a cross-sectional
(axial) image of the chest. Axial images are viewed from
below the patient, so that the patient’s left is on the viewer’s
right. This image shows a cancer mass in the left upper
lobe of the lung
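
The idea of computing a cross-section from views taken at different angles can be shown with a deliberately tiny example. This is a toy sketch, not a clinical reconstruction algorithm: it uses only two parallel-beam views (0 and 90 degrees) of a synthetic 8 x 8 "phantom" and plain, unfiltered backprojection. Practical CT uses many more angles together with filtered backprojection or iterative methods. The function names are hypothetical.

```python
import numpy as np

def project(image):
    """Two parallel-beam views: one sum down each column, one across each row."""
    return image.sum(axis=0), image.sum(axis=1)

def back_project(view_0, view_90):
    """Smear each view back across the grid and add the results."""
    return view_0[np.newaxis, :] + view_90[:, np.newaxis]

# Synthetic phantom (assumption): a single dense point in an empty slice.
phantom = np.zeros((8, 8))
phantom[2, 5] = 1.0

view_0, view_90 = project(phantom)
estimate = back_project(view_0, view_90)
peak = np.unravel_index(np.argmax(estimate), estimate.shape)

print("brightest reconstructed pixel:", peak)
print(estimate)
```

Neither view alone locates the point, but the two smeared views intersect at its true position. The printed array also shows bright streaks along row 2 and column 5, an artifact of using so few views and no filtering.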


Ultrasound

A common energy source used to produce
images is ultrasound, which developed
from research performed by the Navy during
World War II in which sonar was used
to locate objects of interest in the ocean.
Ultrasonography uses pulses of high-frequency
sound waves rather than ionizing
radiation to image body structures (Kremkau
2006). Image generation is based on
a property of all objects called acoustical
impedance. As sound waves encounter different
types of tissues in a patient’s body
(particularly interfaces where there is a change in
acoustical impedance), a portion of the wave
is reflected and a portion of the sound beam
(which is now attenuated) continues to traverse
into deeper tissues. The time required
for the echo to return is proportional to the
distance into the body at which it is reflected;
the amplitude (intensity) of a returning echo
depends on the acoustical properties of the
tissues encountered and is represented in the
image as brightness (more echoes returning to
the source are shown as greater image brightness).
The system constructs two-dimensional images
(B-scans) by displaying the echoes from pulses
of multiple adjacent one-dimensional paths
(A-scans). Current ultrasound machines
are essentially specialized computers with
attached peripherals, with active development
of three-dimensional imaging. The ultrasound
transducer now often sweeps out a 3-D
volume rather than a 2-D plane, and the data
are written directly into a three-dimensional
array memory, which is displayed using volume
or surface-based rendering techniques
(Ritchie et al. 1996).

Fig. 10.5 An ultrasound image of the abdomen.
Like CT and MRI, ultrasound images are
slices of a body, but because a user creates
the images by holding a probe, any arbitrary
plane can be imaged (so long as the probe can
be oriented to produce that plane). This image
shows an axial slice through the pancreas, and
flow in nearby blood vessels (in color) is seen
due to Doppler effects incorporated into the
imaging method
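
The two relationships in the ultrasound description above, echo time versus depth and impedance change versus echo strength, can be sketched numerically. The speed of sound used here (1540 m/s, a conventional average for soft tissue) and the normal-incidence reflection formula are standard physics that the text does not state explicitly, so treat them as assumptions of this example. The impedance values passed in are arbitrary illustrative numbers.

```python
SPEED_OF_SOUND_M_PER_S = 1540.0  # conventional soft-tissue average (assumed)

def echo_depth_cm(round_trip_time_s, c=SPEED_OF_SOUND_M_PER_S):
    """Depth of a reflector from the echo's round-trip time.

    The pulse travels to the reflector and back, hence the division by 2.
    """
    return 100.0 * c * round_trip_time_s / 2.0

def intensity_reflection(z1, z2):
    """Fraction of intensity reflected at an interface (normal incidence)."""
    return ((z2 - z1) / (z2 + z1)) ** 2

depth_cm = echo_depth_cm(130e-6)            # echo returns after 130 microseconds
no_echo = intensity_reflection(1.5, 1.5)    # matched impedances: nothing reflected
strong_echo = intensity_reflection(1.0, 3.0)

print("reflector depth (cm):", round(depth_cm, 2))
print("reflected fraction, matched tissues:", no_echo)
print("reflected fraction, mismatched tissues:", strong_echo)
```

Each A-scan is effectively a list of (depth, echo strength) pairs computed this way along one line; placing adjacent A-scans side by side yields the B-scan image.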

Ultrasound images are acquired as digi- 
tal images from the outset. They may also be 
recorded as frames in rapid succession (cine 
loops) for real-time imaging. Ultrasound 
imaging captures not only structural informa- 
tion but also functional information. Doppler 
methods in ultrasound are used to measure 
and characterize the blood flow in blood vessels
in the body (Fig. 10.5). More recently,
ultrasound techniques called elastography
have been developed to measure tissue stiffness,
which is improving the ability of ultrasound
to diagnose a variety of pathologic
conditions such as liver fibrosis (Pawlus et al.
2015; Zaleska-Dorobisz et al. 2015). The low
cost of ultrasound and the fact that it does not use
ionizing radiation make it very attractive as a
primary modality for imaging worldwide,
particularly for obstetrical and pediatric imaging.
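
How a Doppler measurement turns into a blood-flow velocity can be sketched with the standard Doppler equation for reflected ultrasound. The equation itself, the 1540 m/s speed of sound, and the example numbers (a 5 MHz probe, 0.5 m/s flow, a 60 degree beam-to-vessel angle) are assumptions introduced for illustration and are not taken from the text.

```python
import math

SPEED_OF_SOUND_M_PER_S = 1540.0  # conventional soft-tissue average (assumed)

def doppler_shift_hz(f0_hz, velocity_m_s, angle_deg, c=SPEED_OF_SOUND_M_PER_S):
    """Frequency shift of an echo from blood moving at the given velocity.

    angle_deg is the angle between the ultrasound beam and the flow direction.
    """
    return 2.0 * f0_hz * velocity_m_s * math.cos(math.radians(angle_deg)) / c

def velocity_from_shift(shift_hz, f0_hz, angle_deg, c=SPEED_OF_SOUND_M_PER_S):
    """Invert the Doppler equation to estimate flow velocity."""
    return shift_hz * c / (2.0 * f0_hz * math.cos(math.radians(angle_deg)))

shift = doppler_shift_hz(5e6, 0.5, 60.0)
recovered = velocity_from_shift(shift, 5e6, 60.0)
perpendicular = doppler_shift_hz(5e6, 0.5, 90.0)

print("Doppler shift (Hz):", round(shift, 1))
print("recovered velocity (m/s):", round(recovered, 3))
print("shift with beam perpendicular to flow (Hz):", round(perpendicular, 6))
```

The shift falls to essentially zero when the beam is perpendicular to the vessel, which is one reason the probe angle matters when flow is measured.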

Since the image contrast in ultrasound is
based on differences in the acoustic impedance
of tissue, ultrasound provides functional
information (e.g., tissue composition and
blood flow). On the other hand, the flux of
sound waves is not as dense as the photon flux
used to produce images in radiography; thus
ultrasound images generally have lower
resolution than images from other modalities
(Fig. 10.2).


Magnetic Resonance Imaging (MRI)

Creation of images from the resonance phe- 
nomena of unpaired spinning charges in a 
magnetic field grew out of nuclear magnetic 
resonance (NMR) spectroscopy, a technique 
that has long been used in chemistry to char- 
acterize chemical compounds. Many atomic 
nuclei within the body have a net magnetic 
moment, so they act like tiny magnets. When a 
small chemical sample is placed in an intense, 
uniform magnetic field, these nuclei line up 
in the direction of the field, spinning around 
the axis of the field with a frequency depen- 
dent on the type of nucleus, on the surround- 
ing environment, and on the strength of the 
magnetic field. 
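
The dependence of the rotation frequency on nucleus type and field strength is captured by the Larmor relation, in which frequency equals a nucleus-specific constant times the field strength. The relation and the approximate constants below are standard reference values that the text does not give, so they are labeled here as assumptions. The small additional dependence on the surrounding chemical environment is ignored in this sketch.

```python
# Approximate gyromagnetic ratios divided by 2*pi, in MHz per tesla (assumed values).
GAMMA_BAR_MHZ_PER_T = {
    "1H": 42.577,   # hydrogen nucleus (proton)
    "31P": 17.235,  # phosphorus-31
}

def larmor_frequency_mhz(nucleus, field_tesla):
    """Resonance (precession) frequency of a nucleus in a static field."""
    return GAMMA_BAR_MHZ_PER_T[nucleus] * field_tesla

for field in (1.5, 3.0):
    for nucleus in ("1H", "31P"):
        freq = larmor_frequency_mhz(nucleus, field)
        print(f"{nucleus} at {field} T: {freq:.2f} MHz")
```

These frequencies fall in the radiofrequency range, which is why the pulse described in the next paragraph is a radio pulse tuned to the nucleus of interest.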

If a radio pulse of a particular frequency 
is then applied at right angles to the station- 
ary magnetic field, those nuclei with rotation 
frequency equal to that of the radiofrequency 
pulse resonate with the pulse and absorb 
energy. The higher energy state causes the 
nuclei to change their orientation with respect 
to the fixed magnetic field. When the radio- 
frequency pulse is removed, the nuclei return 
to their original aligned state (a process called 
“relaxation”), emitting a detectable radiofre- 
quency signal as they do so. Characteristic 
parameters of this signal—such as intensity, 
duration, and frequency shift away from the 
original pulse—are dependent on the density 
and environment of the nuclei. In the case 
of traditional NMR spectroscopy, different 
molecular environments cause different fre- 
quency shifts (called chemical shifts), which 
we can use to identify the particular com- 
pounds in a sample. In the original NMR 
method, however, the signal is not localized to 
a specific region of the sample, so it is not pos- 
sible to create an image. 

Creation of medical images from NMR 
signals, known as Magnetic Resonance 
Imaging (MRI), came about shortly after fast 
computer-based reconstruction techniques, 


O Fig. 10.6 An MRI image of the knee. Like CT, MRI 
images are slices of a body. This image is in the saggital 
plane through the mid knee, showing in a tear in the pos- 
terior cruciate ligament (arrow) 


similar to CT, were developed. Image formation
in MRI is based on proton relaxation (referred
to as T1 and T2 relaxation); T1 and T2 are
inherent properties of tissue, and they vary
among tissues. Thus, MRI provides detailed
functional information about tissue and can be
valuable in clinical diagnosis (Fig. 10.6). At the
same time, the flux of radiofrequency waves
used to produce the images is high, and MRI
thus has high spatial resolution (Fig. 10.2).
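
How differing T1 and T2 values translate into differing signal can be sketched with the standard exponential relaxation model, assuming the magnetization has just been tipped fully into the transverse plane. The tissue names and relaxation times below are hypothetical values chosen only for illustration, not measurements.

```python
import math


def longitudinal_recovery(t_ms, t1_ms, m0=1.0):
    """Longitudinal magnetization recovered t_ms after excitation (T1 recovery)."""
    return m0 * (1.0 - math.exp(-t_ms / t1_ms))


def transverse_decay(t_ms, t2_ms, m0=1.0):
    """Transverse magnetization remaining t_ms after excitation (T2 decay)."""
    return m0 * math.exp(-t_ms / t2_ms)


# Hypothetical relaxation times in milliseconds; illustrative only.
TISSUES = {
    "tissue_a": {"t1": 500.0, "t2": 50.0},
    "tissue_b": {"t1": 1000.0, "t2": 100.0},
}

if __name__ == "__main__":
    for name, p in TISSUES.items():
        mz = longitudinal_recovery(500.0, p["t1"])
        mxy = transverse_decay(50.0, p["t2"])
        print(f"{name}: Mz(500 ms) = {mz:.2f}, Mxy(50 ms) = {mxy:.2f}")
```

Sampling the signal at a time when the two curves differ most is what gives the two tissues different intensities in the image.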

Many new modalities are being developed 
based on magnetic resonance. For example, 
magnetic resonance arteriography (MRA) 
and venography (MRV) are used to image 
blood flow (Lee 2003) and diffusion tensor 
imaging (DTI) is increasingly being used to 
image white matter fiber tracts in the brain 
(Le Bihan et al. 2001; Hasan et al. 2010; de 
Figueiredo et al. 2011; Gerstner and Sorensen 
2011). More recently, a technique called MRI 
elastography has been developed to measure 
tissue stiffness (Venkatesh and Ehman 2015). 


Nuclear Medicine Imaging

In nuclear medicine imaging, the imaging 
approach is a reverse of the radiographic 
imaging: instead of the imaging beam being 
outside the subject and projecting into the 
subject, the imaging source is inside the subject
and projects out. Specifically, a radioactive
isotope is chemically attached to a biologically 
active compound (such as an analogue of glu- 
cose) and then is injected into the patient’s 
peripheral circulation. The compound col- 
lects in the specific body compartments or 
organs (such as metabolically-active tissues), 
where it is stored or processed by the body. 
The isotope emits radiation locally, and the 
radiation is measured using a special detector. 
The resultant nuclear-medicine image depicts 
the level of radioactivity that was measured at 
each spatial location of the patient. Because 
the counts are inherently quantized, digital 
images are produced. Multiple images also 
can be processed to obtain temporal dynamic 
information, such as the rate of arrival or of 
disappearance of isotope at particular body 
sites. 
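
The idea of deriving temporal information from multiple count images can be sketched as a time-activity curve over a region of interest. This is a minimal illustration using plain nested lists; the frames, the region, and the 10-second frame interval are hypothetical.

```python
def roi_counts(frame, roi):
    """Sum the detected counts over the (row, col) pixels in a region of interest."""
    return sum(frame[r][c] for r, c in roi)


def time_activity_curve(frames, roi):
    """Total ROI counts for each frame of a dynamic acquisition."""
    return [roi_counts(frame, roi) for frame in frames]


def rate_of_change(curve, frame_interval_s):
    """Counts per second gained (positive) or lost (negative) between frames."""
    return [(b - a) / frame_interval_s for a, b in zip(curve, curve[1:])]


# Three hypothetical 3x3 count images acquired 10 s apart.
FRAMES = [
    [[0, 1, 0], [1, 2, 1], [0, 1, 0]],
    [[0, 2, 0], [2, 8, 3], [0, 2, 1]],
    [[1, 1, 0], [1, 5, 2], [0, 1, 0]],
]
ROI = [(1, 1), (1, 2)]  # hypothetical region: two adjacent pixels

if __name__ == "__main__":
    curve = time_activity_curve(FRAMES, ROI)
    print(curve, rate_of_change(curve, 10.0))
```

A positive rate corresponds to arrival of isotope in the region and a negative rate to its disappearance.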

Nuclear medicine images, like radio- 
graphic images, are usually acquired as pro- 
jections—a large planar detector is positioned 
outside the patient and it collects a projected 
image of all the radioactivity emitted from the 
patient. The images are similar in appearance 
to radiographic projection images. However, 
since the photon flux is extremely low (to 
minimize the radiation dose to the patient), 
the spatial resolution of nuclear medicine 
images is low. On the other hand, since the 
only places where radioisotope accumulates 
will be places in the body that are targeted by 
the injected agent, nearly all the information 
in nuclear medicine images is functional infor- 
mation; thus nuclear imaging methods have 
high functional information and low spatial 
resolution (Fig. 10.2). Nuclear medicine 
techniques have recently attracted much atten- 
tion because of an explosion in novel imaging 
probes and targeting mechanisms to localize 
the imaging agent (Drude et al. 2017). 

In addition to projection images, a com- 
puted tomography-like method called single- 
photon emission computed tomography 
(SPECT) (Alberini et al. 2011) is often used. 
A camera rotates around the patient simi- 
lar to CT, producing a computed volumetric 
image that may be viewed and navigated in 
multiple planes. A technique called Positron 
Emission Tomography (PET) uses a special 
Fig. 10.7 A PET image of the body in a patient with
cancer in the left lung (same patient as in Fig. 10.4).
This is a projection image taken in the frontal plane
after injection of a radioactive isotope that accumulates
in cancers. A small black spot in the left upper lobe is
abnormal and indicates the cancer mass in the upper
lobe of the left lung


type of radioactive isotope that emits positrons.
When a positron encounters an electron, the
resulting annihilation event sends out
two gamma rays in opposite directions; these
are detected simultaneously on an annular
detector array and used to compute a
cross-sectional slice through the patient, similar to
CT and SPECT (Fig. 10.7). These volumetric
nuclear medicine imaging methods,
like the projection methods, have high functional
information and low spatial resolution.
However, a newer modality called
PET/CT integrates a PET scanner and CT
with image fusion (discussed below)
to get the best of both worlds:
functional information about lesions in the
PET image plus spatial localization of the
abnormality on the CT image (Figs. 10.2
and 10.8).

A subdomain of nuclear imaging called 
molecular imaging has emerged that embod- 
ies this work on molecularly-targeted imag- 
ing (and therapeutic) agents (Weissleder and 
Fig. 10.8 A PET/CT fused image. The axial slice
from the PET study (Fig. 10.7) and the
corresponding axial slice from the CT study
(Fig. 10.4) are combined into a single image that has
both good spatial resolution and functional information,
showing that the lung mass has abnormal uptake of
isotope, indicating it is metabolically active


Mahmood 2001; Massoud and Gambhir 
2003; Biswal et al. 2007; Hoffman and 
Gambhir 2007; Margolis et al. 2007; Ray 
and Gambhir 2007; Willmann et al. 2008; 
Pysz et al. 2010). Molecularly-tagged mol- 
ecules are increasingly being introduced into 
the living organism, and imaged with optical, 
radioactive, or magnetic energy sources, often 
using reconstruction techniques and often in 
3-D. It is becoming possible to combine gene 
sequence information, gene expression array 
data, and molecular imaging to determine 
not only which genes are expressed, but where 
they are expressed in the organism (Kang and 
Chung 2008; Min and Gambhir 2008; Singh 
et al. 2008; Lexe et al. 2009; Smith et al. 2009; 
Harney and Meade 2010). These capabilities 
will become increasingly important in the 
post-genomic era for determining exactly how 
genes generate both the structure and func- 
tion of the organism. 


10.2.4 Image Quality 


Characteristics of Image Quality

The imaging modalities described above are
complex devices with many parameters that
need to be specified in generating the image.
Most of these parameters can substantially
affect three key characteristics of the final
image: spatial resolution, contrast resolution,
and temporal resolution, each of which
influences the quality and diagnostic value
of the image. These characteristics provide an
objective means for comparing images formed
by digital imaging modalities.

• Spatial resolution is related to the sharpness 
of the image; it is a measure of how well 
the imaging modality can distinguish 
points on the object that are close together. 
For a digital image, spatial resolution is 
generally related to the number of pixels 
per image area. Spatial resolution is critical 
for detecting abnormalities in very small 
structures, such as microcalcifications on 
mammograms or diffuse lung diseases on 
chest radiographs. 

• Contrast resolution is a measure of the 
ability to distinguish small differences in 
intensity in different regions of the image, 
which in turn are related to differences in 
measurable parameters, such as X-ray 
attenuation. For digital images, the number 
of bits per pixel is related to the contrast 
resolution of an image. Contrast resolution 
is critical to image interpretation, since 
differences in contrast are the basis for an 
object (of sufficient size) to be appreciated 
by the human eye or by a computerized 
image detection algorithm. 

• Temporal resolution is a measure of the 
time needed to create an image. We consider 
an imaging procedure to be a real-time 
application if it can generate images 
concurrent with the physical process it is 
imaging. At a rate of at least 30 images per 
second, it is possible to produce unblurred 
images of the beating heart. 
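
The three characteristics above can each be tied to a simple number for a digital image. The sketch below is illustrative only: the function names and the acquisition values are hypothetical, and the quantities are upper bounds set by the digital representation rather than the resolution a modality actually achieves.

```python
def pixel_size_mm(field_of_view_mm, matrix_size):
    """Edge length of one pixel when a square field of view is sampled
    on a matrix_size x matrix_size grid (smaller pixels, more pixels per area)."""
    return field_of_view_mm / matrix_size


def gray_levels(bits_per_pixel):
    """Number of distinct intensity values a pixel can store."""
    return 2 ** bits_per_pixel


def frame_time_ms(frames_per_second):
    """Time available to create each image at a given frame rate."""
    return 1000.0 / frames_per_second


if __name__ == "__main__":
    # Hypothetical acquisition: 350 mm field of view, 512 x 512 matrix, 12 bits.
    print(f"pixel size: {pixel_size_mm(350.0, 512):.2f} mm")
    print(f"gray levels: {gray_levels(12)}")
    print(f"frame time at 30 images/s: {frame_time_ms(30):.1f} ms")
```
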
Other parameters that are specifically relevant 
to medical imaging are the degree of invasive- 
ness, the dosage of ionizing radiation, the 
degree of patient discomfort, the size (porta- 
bility) of the instrument, the ability to depict 
physiologic function as well as anatomic 
structure, and the availability and cost of the 
procedure at a specific location. 

A perfect imaging modality would pro- 
duce images with high spatial, contrast, and 
temporal resolution; it would be available, low 
in cost, portable, free of risk, painless, and 
noninvasive; it would use nonionizing radia- 
tion; and it would depict physiological func- 
tion as well as anatomic structure. As seen 
above, the different modalities differ in these 
characteristics and none is uniformly strong 
across all the parameters (Fig. 10.2). 


Contrast Agents

One of the major motivators for develop- 
ment of new imaging modalities is the desire 
to increase contrast resolution. A contrast 
agent is a substance introduced into the body 
to enhance the imaging contrast of structures 
or fluids in medical imaging. Contrast agents 
can be introduced in various ways, such as by 
injection, inspiration, ingestion, or enema. 
The chemical composition of contrast agents
varies with modality so that each agent is
optimally visible given the physical basis of
image formation. For example, iodinated contrast agents
are used in radiography and CT because
iodine has a high atomic number and therefore
strongly attenuates X-rays, greatly enhancing
image contrast in any tissues that accumulate
the contrast agent. Contrast agents for radiography
are referred to as “radiopaque” since
they absorb X-rays and obscure the beam.
Contrast agents in radiography are used to 
highlight the anatomic structures of interest 
(e.g., stomach, colon, urinary tract). In an 
imaging technique called angiography, a con- 
trast agent is injected into the blood vessels to 
opacify them on the images. In pathology, his- 
tological staining agents such as haematoxylin 
and eosin (H&E) have been used for years to 
enhance contrast in tissue sections, and magnetic
contrast agents such as gadolinium have 
been introduced to enhance contrast in MR 
images. 

Recently, contrast agents have been devel- 
oped for ultrasound to greatly enhance image 
contrast (Durot et al. 2018). Ultrasound con- 
trast agents generally comprise microbub- 
bles—bubbles in the blood that are too small 
to cause damage to tissues, but that in aggre- 
gate alter the impedance mismatch between 
blood and tissue to enhance image contrast. 

Although contrast agents have been very 
successful and they are commonly used, their 
enhancement tends to be non-specific in that 
any vascularized structure will be enhanced. 
In recent years, advances in molecular biol- 
ogy have led to the ability to design contrast 
agents that are highly specific for individual 
molecules. In addition to radioactively tagged 
molecules used in nuclear medicine, molecules 
are tagged for imaging by magnetic resonance 
and optical energy sources. Tagged molecules 
are imaged in 2-D or 3-D, often by applica- 
tion of reconstruction techniques developed 
for clinical imaging (Pysz et al. 2010; Jokerst 
and Gambhir 2011; Weissleder et al. 2016). 
Tagged molecules have been used for several 
years in vitro by such techniques as immuno- 
cytochemistry (binding of tagged antibodies 
to antigen) (Van Noorden 2002) and in situ 
hybridization (binding of tagged nucleotide 
sequences to DNA or RNA) (King et al. 
2000). More recently, methods have been 
developed to image these molecules in the liv- 
ing organism, thereby opening up entirely new 
avenues for understanding the functioning of 
the body at the molecular level (Biswal et al. 
2007; Hoffman and Gambhir 2007; Margolis 
et al. 2007; Ray and Gambhir 2007; Willmann 
et al. 2008; Pysz et al. 2010). Recent work on
altering the microbubbles of ultrasound contrast
agents to target them to particular tissues and
types of disease raises the exciting prospect
that ultrasound imaging will provide even greater
functional information about tissues in a
minimally invasive and cost-effective manner
(Deshpande et al. 2010; Abou-Elkacem et al.
2015; Zhang et al. 2017).

10.2.5 Imaging Methods in Other Medical Domains


Though radiology is a core domain and driver 
of many clinical problems and applications 
of medical imaging, several other medical 
domains are increasingly relying on imaging to 
provide key information for biomedical discov- 
ery and clinical insight. The methods of bio- 
medical informatics presented in this chapter, 
while focusing on radiology in our examples, 
are generalizable and applicable to these other 
domains. We briefly highlight these other 
domains and the role of imaging in them. 


Microscopic/cellular imaging

At the microscopic level, there is a rapid 
growth in cellular imaging (Larabell and 
Nugent 2010; Toomre and Bewersdorf 2010; 
Wessels et al. 2010), including use of com- 
putational methods to evaluate the features 
in cells (Carpenter et al. 2006). The confo- 
cal microscope uses electronic focusing to 
move a two-dimensional slice plane through 
a three-dimensional tissue slice placed in a 
microscope. The result is a three-dimensional 
voxel array of a microscopic, or even submi- 
croscopic, specimen (Wilson 1990; Paddock 
1994). Confocal endomicroscopy, in which 
high resolution microscopic imaging technol- 
ogy is integrated into endoscopes, is opening 
up exciting opportunities for real-time his- 
topatological evaluation in disease (Neumann 
et al. 2010). At the electron-microscopic level, 
electron tomography generates 3-D images 
from thick electron-microscopic sections 
using techniques similar to those used in CT 
(Perkins et al. 1997). 


Pathology/tissue imaging

The radiology department was revolution- 
ized by the introduction of digital imaging 
and Picture Archiving and Communication 
Systems (PACS). Pathology has likewise 
begun to shift from an analog to a digital 
workflow (Leong and Leong 2003; Gombas 
et al. 2004). Pathology informatics is a rap- 
idly emerging field (Becich 2000; Gabril and 
Yousef 2010), with goals and research problems
similar to those in radiology, such as 
managing huge images, improving efficiency 
of workflow, learning new knowledge by min- 
ing historical cases, identifying novel imag- 
ing features through correlative quantitative 
imaging analysis, and decision support. A 
particularly promising area is deriving novel 
quantitative image features from pathology 
images to improve characterization and clini- 
cal decision making (Giger and MacMahon 
1996; Nielsen et al. 2008; Armstrong 2010) 
or to improve detection of disease within the 
specimen (Nagarkar et al. 2016; Ehteshami 
Bejnordi et al. 2017). Given that pathology
and radiology both produce images that characterize
the phenotype of disease, there is tremendous
opportunity for information integration 
and linkage among pathology, radiology, and 
molecular data for discovery (Permuth et al. 
2016). 


Ophthalmologic imaging

Visualization of the retina is a core task of 
ophthalmology to diagnose disease and to 
monitor treatment response (Bennett and 
Barry 2009). Imaging modalities include 
retinal photography, autofluorescence, and 
fluorescein angiography. Recently, tomo- 
graphic-based imaging has been introduced 
through a technique called optical coherence 
tomography (OCT; Fig. 10.9) (Figurska 
et al. 2010). This modality is showing great 
progress in evaluating a variety of retinal dis- 
eases (Freton and Finger 2011; Schimel et al. 
2011; Sohrab et al. 2011; de Sisternes et al. 
2014). As with radiological imaging, a num- 
ber of quantitative and automated segmen- 
tation methods are being created to evaluate 
disease objectively (Cabrera Fernandez et al. 
2005; Baumann et al. 2010; Hu et al. 2010a, 
b; Niu et al. 2016; de Sisternes et al. 2017a, b). 
Likewise, image processing methods for image 
visualization and fusion are being developed, 
similar to those used in radiology. 


Dermatologic imaging

Imaging is becoming an important compo- 
nent of dermatology in the management of 
patients with skin lesions. Dermatologists 
Fig. 10.9 An OCT image of the retina. Like ultrasound,
OCT produces an image slice at any arbitrary
angle (depending on how the light beam can be oriented),
but it is limited to visualizing superficial structures
due to poor penetration by light. In this image, the
layered structure of the retina can be seen, as well as
abnormalities (drusen)


frequently take photographs of patients with 
skin abnormalities, and while initially this was 
done for clinical documentation, increasingly 
this is done to leverage imaging informat- 
ics methods for training, to improve clinical 
care, for consultation, for monitoring progres- 
sion or change in skin disease, and for image 
retrieval (Bittorf et al. 1997; Diepgen and 
Eysenbach 1998; Eysenbach et al. 1998; Lowe 
et al. 1998; Ribaric et al. 2001). As in radiology
and pathology, recent work analyzes
image content to enable decision
support (Seidenari et al. 2003; Esteva et al.
2017).


10.3 Image Content Representation


Image content comprises two components
of information: the visual content and
the knowledge content. The visual content is
the raw values of the image itself, the information
that a computer can access directly in a digital
image. The knowledge content arises
when an observer who has biomedical knowledge
about the image content views the visual
information in the image. For example, a radiologist
viewing a CT image of the upper abdomen
immediately recognizes that the image 
contains the liver, spleen, and stomach (ana- 
tomic entities), as well as image abnormalities 
such as a mass in the liver with rim enhance- 
ment (imaging observation entities). Unlike 
the visual content, the knowledge content of 
images is not directly accessible to comput- 
ers from the image itself. However, semantic 
methods are being developed to make this 
content machine-accessible (Sect. 10.3.2). 
In this section we describe imaging informat- 
ics methods for representing the visual and 
knowledge content of images. 


10.3.1 Representing Visual Content in Digital Images

The visual content of digital images typi- 
cally is represented in a computer by a two- 
dimensional array of numbers (a bit map). 
Each element of the array represents the 
intensity of a small square area of the picture, 
called a picture element (or pixel). Each pixel
corresponds to a volume element (or 
voxel) in the imaged subject that produced the 
pixel. If we consider the image of a volume, 
then a three-dimensional array of numbers is 
required. Another way of thinking of a vol- 
ume is that it is a stack of two-dimensional 
images. However, it is also important to be 
aware of the voxel dimensions that corre- 
spond to the pixels when doing this. In many 
2-D imaging applications, the in-plane resolu- 
tion (the size of the voxels in the x,y plane) is 
higher than the resolution in the z-axis (i.e., 
the slice thickness); this is often referred to as 
“non-isotropic voxels.” Non-isotropic voxels
create a problem when re-sampling the volume
data to create other projections, such as
coronal or sagittal, from primary axial image 
data. If the dimensions of the voxels (and 
pixels) are uniform in all dimensions, they are 
referred to as “isotropic.” Fortunately, nearly 
all modern volumetric imaging methods (e.g., 
CT and MRI) currently produce images with 
isotropic voxels. 
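
The volume-as-stack representation and the re-sampling problem can be sketched as follows. This is a minimal illustration using nested lists and linear interpolation between slices, assuming 1 mm in-plane pixels and 2 mm slice thickness; the function names and data are hypothetical, and a practical system would use an array library and a more careful interpolation scheme.

```python
def resample_z(volume, slice_thickness_mm, target_mm):
    """Linearly interpolate between axial slices so that slice spacing
    becomes target_mm. volume is a list of 2-D slices (lists of rows)."""
    n = len(volume)
    extent = (n - 1) * slice_thickness_mm        # distance from first to last slice
    n_out = int(extent / target_mm + 1e-9) + 1   # epsilon guards against rounding
    out = []
    for k in range(n_out):
        z = k * target_mm / slice_thickness_mm   # position in units of original slices
        lo = min(int(z), n - 1)
        hi = min(lo + 1, n - 1)
        w = z - lo                               # weight of the upper neighbor
        out.append([[(1 - w) * a + w * b for a, b in zip(row_lo, row_hi)]
                    for row_lo, row_hi in zip(volume[lo], volume[hi])])
    return out


def coronal_slice(volume, y):
    """Reformat: take row y from every axial slice to form a coronal image."""
    return [axial[y] for axial in volume]


# Hypothetical volume: three 2x2 axial slices, 2 mm apart.
VOLUME = [
    [[0, 0], [0, 0]],
    [[10, 10], [10, 10]],
    [[20, 20], [20, 20]],
]

if __name__ == "__main__":
    iso = resample_z(VOLUME, slice_thickness_mm=2.0, target_mm=1.0)
    print(len(iso), coronal_slice(iso, 0))
```

Without the re-sampling step, the coronal image built from the original stack would have pixels twice as tall as they are wide, which is the distortion that non-isotropic voxels introduce.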

We can store any image in a computer 
as a matrix of integers (or real-valued num- 
bers), either by converting it from an analog 
to a digital representation or by generating it 
directly in digital form. Once an image is in 
digital form, it can be handled just like all 
other data. It can be transmitted over com- 
munications networks, stored compactly in 
databases on magnetic or optical media, and 
displayed on graphics monitors. In addition, 
the use of computers has created an entirely 
new realm of capabilities for image generation 
and analysis: images can be computed rather 
than measured directly. Furthermore, digi- 
tal images can be manipulated for display or 
analysis in ways not possible with film-based 
images. 

In addition to the 2D (slice) and 3D (volume)
representations of image data, there can
be additional dimensions in representing the
visual content of images. It is often the case
that multi-modality data are required for
diagnosis; these can be a combination of different
modalities (e.g., CT and PET, CT and
MRI) or a combination of imaging
sequences within a modality (e.g., T1, T2, or
other sequences in MRI) (Fig. 10.10). The pixel
(or voxel) contents from each of the respective
acquisitions are combined in
what is known as a “feature vector” in the
multi-dimensional space. For example, a
3-dimensional intensity-based feature vector,
based on 3 MRI pulse sequences, can be
defined as a set of three values for each pixel
in the image, where the intensity of each pixel
in each of the three MRI images is extracted
and recorded (e.g., [Intensity(Sequence 1),
Intensity(Sequence 2), Intensity(Sequence 3)]).
Any imaging performed over time (e.g., cardiac
echo videos) can be represented by the set
of values at each time point; time is thus
added as an additional dimension to the
representation.
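
The per-pixel feature vector described above can be sketched in a few lines. The example assumes the images are already co-registered and of equal size, so that the same row and column index refers to the same anatomical location in each; the function name and intensity values are hypothetical.

```python
def feature_vectors(images):
    """Combine co-registered images of equal size into one feature vector
    per pixel: result[r][c] == [images[0][r][c], images[1][r][c], ...]."""
    rows, cols = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != rows or any(len(row) != cols for row in img):
            raise ValueError("all images must have the same dimensions")
    return [[[img[r][c] for img in images] for c in range(cols)]
            for r in range(rows)]


# Hypothetical 2x2 intensity images from three MRI pulse sequences.
SEQUENCE_1 = [[10, 20], [30, 40]]
SEQUENCE_2 = [[11, 21], [31, 41]]
SEQUENCE_3 = [[12, 22], [32, 42]]

if __name__ == "__main__":
    fv = feature_vectors([SEQUENCE_1, SEQUENCE_2, SEQUENCE_3])
    print(fv[0][1])  # feature vector of the pixel in row 0, column 1
```

A time series can be handled the same way by passing one image per time point, so that each pixel carries its sequence of values over time.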

Fig. 10.10 Multi-modality imaging. Images of the
brain from three modalities (T1 without contrast, T1
with contrast, and T2) are shown. The patient has a
lesion in the left occipital lobe that has distinctive image
features on each of these modalities, and the combination
of these different features on different modalities
establishes characteristic patterns useful in diagnosis

Finally, in addition to representing the
visual content, medical images also need
to represent certain information about that
visual content (referred to as image metadata).
Image metadata include such things as
the name of the patient, date the image was
acquired, the slice thickness, the modality that
was used to acquire the image, etc. Image
metadata are usually stored in the header of
the image file. Given that there are many different
types of equipment and software that
produce and consume images, standards are
crucial. For images, the Digital Imaging and
Communications in Medicine (DICOM)
standard supports distributing and viewing any
kind of medical image regardless of its origin
(Bidgood Jr. and Horii 1992). DICOM has
become pervasive throughout radiology and
is becoming a standard in other domains such
as pathology, ophthalmology, and dermatology.
In addition to specifying a standard file
syntax and metadata structure, DICOM specifies
a standard protocol for communicating
images among imaging devices.
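
To give a sense of what a standard file syntax and metadata structure means in practice, the sketch below checks for the fixed marker at the start of a DICOM file and lists the standard tags for the metadata items mentioned above. It is a minimal illustration that does not parse the header; the helper names are hypothetical, and a real application would rely on a DICOM toolkit.

```python
# Standard DICOM data element tags (group, element) for the metadata items
# named in the text.
TAGS = {
    "PatientName": (0x0010, 0x0010),
    "AcquisitionDate": (0x0008, 0x0022),
    "Modality": (0x0008, 0x0060),
    "SliceThickness": (0x0018, 0x0050),
}


def looks_like_dicom_file(data):
    """True if the bytes begin like a DICOM file: a 128-byte preamble
    followed by the 4-byte prefix b'DICM'."""
    return len(data) >= 132 and data[128:132] == b"DICM"


def format_tag(name):
    """Render a tag in the conventional (gggg,eeee) hexadecimal form."""
    group, element = TAGS[name]
    return f"({group:04X},{element:04X})"


if __name__ == "__main__":
    fake_header = bytes(128) + b"DICM"  # synthetic bytes, not a real image
    print(looks_like_dicom_file(fake_header))
    for name in TAGS:
        print(name, format_tag(name))
```

Because every conforming device writes the same tags for the same items, software from one vendor can read the slice thickness or modality recorded by equipment from another.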


10.3.2 Representing Knowledge Content in Digital Images


As noted above, the knowledge content related 
to images is not directly encoded in the images, 
but it is recognized by the observer of the 
images. This knowledge includes recognition 
of the anatomic entities in the image, imaging 
observations and characteristics of the observa- 
tions (sometimes called “findings”), and inter- 
pretations (probable diseases). Representing 
this knowledge in the imaging domain is similar 
to knowledge representation in other domains 
of biomedical informatics (see Chap. 24). 
Specifically, for representing the entities in the 
domain of discourse, we adopt terminologies 
or ontologies. To make specific statements 
about individuals (images), we use “informa- 
tion models” (described below) that reference 
ontological entities as necessary. As described 
below, different aspects of the knowledge content
of images are stored in different ways (which
is one of the challenges of leveraging this
information).


Knowledge Representation of Anatomy

Given segmented anatomical structures, 
whether at the macroscopic or microscopic 
level, and whether represented as 3-D surface 
meshes or extracted 3-D regions, it is often 
desirable to attach labels (names) to the struc- 
tures in images. If the names are drawn from a 
controlled terminology or ontology, they can 
be used as an index into a database of seg- 
mented structures, thereby providing a quali- 
tative means for comparing structures from 
multiple subjects or retrieving images contain- 
ing particular structures. 
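
Using controlled terms as an index into a collection of segmented structures can be sketched as follows. The vocabulary, image identifiers, and labels are hypothetical and serve only to show how synonyms collapse onto a single preferred term.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary: preferred term -> synonyms.
VOCABULARY = {
    "liver": {"hepar"},
    "spleen": {"lien"},
}


def normalize(label):
    """Map a free-text label to its preferred term, or None if unknown."""
    key = label.strip().lower()
    for preferred, synonyms in VOCABULARY.items():
        if key == preferred or key in synonyms:
            return preferred
    return None


def build_index(segmentations):
    """Index (image_id, label) pairs by preferred term."""
    index = defaultdict(list)
    for image_id, label in segmentations:
        term = normalize(label)
        if term is not None:
            index[term].append(image_id)
    return dict(index)


# Hypothetical labels attached to segmented structures.
SEGMENTATIONS = [
    ("img-001", "Liver"),
    ("img-002", "hepar"),
    ("img-002", "Spleen"),
    ("img-003", "kidney"),  # not in the vocabulary, so not indexed
]

if __name__ == "__main__":
    print(build_index(SEGMENTATIONS))
```

Retrieving all images that contain a structure then reduces to a lookup on the preferred term, regardless of which synonym the annotator used.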

If the terms in the vocabulary are orga- 
nized so as to assert relationships that are true 
for all instances (the case in “ontologies”), 
they can support systems that manipulate and 
retrieve image contents in intelligent ways. 
If anatomical ontologies are linked to other 
ontologies of physiology and pathology, they 
can provide increasingly sophisticated knowl- 
edge about the meaning of the various images 
and other data that are increasingly becom- 
ing available in online databases. This kind 
of knowledge (by the computer, as opposed 
to the scientist) will be required in order to 
achieve the seamless integration of all forms 
of imaging and non-imaging data. 

At the most fundamental level, Nomina 
Anatomica (International Anatomical 
Nomenclature Committee 1989) and its suc- 
cessor, Terminologia Anatomica (Federative 
Committee on Anatomical Terminology 1998) 
provide a classification of officially sanctioned 
terms that are associated with macroscopic 
and microscopic anatomical structures. This 
canonical term list, however, has been substan- 
tially expanded by synonyms that are current 
in various fields, and has also been augmented 
by a large number of new terms that desig- 
nate structures omitted from Terminologia 
Anatomica. Many of these additions are pres- 
ent in various controlled terminologies (e.g., 
MeSH (National Library of Medicine 1999), 
SNOMED (Spackman et al. 1997), Read Codes 
(Schultz et al. 1997), GALEN (Rector et al. 
1993)). Unlike Terminologia, these vocabularies
are entirely computer-based, and therefore
lend themselves to incorporation in
computer-based applications.

Classification and ontology projects to 
date have focused primarily on arranging the 
terms of a particular domain in hierarchies. 
As noted with respect to the evaluation of 
Terminologia Anatomica (Rosse 2000), insuf- 
ficient attention has been paid to the relation- 
ships among these terms. These relationships 
are named (e.g., “is-a” and “part-of”) to indi- 
cate how the entities connected by them are 
related (e.g., Left Lobe of Liver part-of Liver). 
Linking entities with relations encodes knowledge
and is used by computer reasoning applications
in making inferences. Terminologia and
the anatomy sections of the controlled
medical terminologies mix “is-a” and “part-of”
relationships in the anatomy segments of their
hierarchies. Although such heterogeneity does
not interfere with using these term lists for
keyword-based retrieval, the term lists
fail to support the higher-level knowledge
(reasoning) required for knowledge-based
applications. To fill this gap, the Foundational 
Model of Anatomy (FMA) was developed to 
define a comprehensive symbolic description 
of the structural organization of the body, 
including anatomical concepts, their preferred 
names and synonyms, definitions, attributes 
and relationships (Rosse et al. 1998a, b; Rosse 
and Mejino 2003). 

In the FMA, anatomical entities are 
arranged in class-subclass hierarchies, with 
inheritance of defining attributes along the 
is-a link, and other relationships (e.g., parts, 
branches, spatial adjacencies) represented 
as additional descriptors associated with 
the concept. The FMA currently consists of 
over 75,000 concepts, represented by about 
120,000 terms, and arranged in over 2.1 mil- 
lion links using 168 types of relationships. 
These concepts represent structures at all 
levels: macroscopic (to 1 mm resolution), cel- 
lular and macromolecular. Brain structures 
have been added by integrating NeuroNames 
with the FMA as a Foundational Model of 
Neuroanatomy (FMNA) (Martin et al. 2001). 
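
How named relationships support inference can be sketched with a small set of subject, relation, object triples. Only the first triple comes from the text; the others are illustrative assumptions rather than actual FMA content, and the sketch assumes that both relations are transitive.

```python
# Fragment of an anatomy ontology stored as (subject, relation, object) triples.
TRIPLES = [
    ("Left Lobe of Liver", "part-of", "Liver"),    # example from the text
    ("Liver", "part-of", "Abdomen"),               # illustrative assumption
    ("Liver", "is-a", "Organ"),                    # illustrative assumption
    ("Organ", "is-a", "Anatomical Structure"),     # illustrative assumption
]


def closure(entity, relation, triples=TRIPLES):
    """All entities reachable from `entity` by following `relation` links
    one or more times (transitive closure)."""
    found, frontier = [], [entity]
    while frontier:
        current = frontier.pop()
        for s, r, o in triples:
            if s == current and r == relation and o not in found:
                found.append(o)
                frontier.append(o)
    return found


if __name__ == "__main__":
    print(closure("Left Lobe of Liver", "part-of"))
    print(closure("Liver", "is-a"))
```

Keeping the two relations separate is what makes the inference sound: following part-of links never yields a claim that the left lobe is a kind of abdomen, which could happen if the two relation types were mixed in one hierarchy.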

The FMA can be useful for symboli- 
cally organizing and integrating biomedical 
information, particularly that obtained from 
images. But in order to answer non-trivial 
queries in neuroscience and other basic sci- 
ence areas, and to develop “smart tools” 
that rely on deep knowledge, additional 
ontologies must also be developed (e.g., for 
physiological functions mediated by neurotransmitters,
and pathological processes
and their clinical manifestations, as well as for
the radiological appearances with which
they correlate). The relationships that exist 
among these concepts and anatomical parts 
of the body must also be explicitly modeled. 
Next-generation informatics efforts that link 
the FMA and other anatomical ontologies 
with separately developed functional ontolo- 
gies will be needed in order to accomplish 
this type of integration. 


Knowledge Representation of Radiology Imaging Features

While FMA provides a comprehensive knowledge
representation for anatomy, it does not
cover other portions of the radiology domain.
As is discussed in Chap. 7, there are controlled
terminologies in other domains, such 
as MeSH, SNOMED, and related terminolo- 
gies in the UMLS (Cimino 1996; Bodenreider 
2008); however, these lack terminology spe- 
cific to radiology for describing the features 
seen in imaging. The Radiological Society 
of North America (RSNA) recently devel- 
oped RadLex, a controlled terminology 
for radiology (Langlotz 2006; Rubin 2008). 
The primary goal of RadLex is to provide a 
means for radiologists to communicate clear, 
concise, and orderly descriptions of imaging 
findings in understandable, unambiguous language.
A second goal is to promote an orderly
thought process, with logical assessments and
recommendations that follow from terminology-based
descriptions of observed imaging features,
and thereby to enable decision
support (Baker et al. 1995; Burnside et al.
2009). A third goal of RadLex is to enable 
radiology research; data mining is facilitated 
by the use of standard terms to code large col- 
lections of reports and images (Channin et al. 
2009a, b). 

RadLex includes thousands of descrip- 
tors of visual observations and characteris- 
tics for describing imaging abnormalities, as 
well as terms for naming anatomic structures, 
radiology imaging procedures, and diseases 
(Fig. 10.11). Each term in RadLex contains
a unique identifier as well as a variety 
of attributes such as definition, synonyms, 
and foreign language equivalents. In addition 
to a lexicon of standard terms, the RadLex 
ontology includes term relationships—links 
between terms to relate them in various ways 
to encode radiological knowledge. For exam- 
ple, the is-a relationship records subsumption. 
Other relationships include part-of, connec- 
tivity, and blood supply. These relationships 
are enabling computer-reasoning applications 
to process image-related data annotated with 
RadLex. 
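
The structure of a terminology entry described above (identifier, name, definition, synonyms, and relationships) can be sketched with a small record type. The identifiers, terms, and synonyms below are hypothetical placeholders, not actual RadLex content, and only the is-a relationship is modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One entry in a controlled terminology."""
    term_id: str
    name: str
    definition: str = ""
    synonyms: list = field(default_factory=list)  # includes foreign-language equivalents
    is_a: list = field(default_factory=list)      # term_ids of parent terms


# Hypothetical terms and identifiers for illustration.
TERMS = {
    "EX1": Term("EX1", "imaging observation"),
    "EX2": Term("EX2", "mass", synonyms=["masa"], is_a=["EX1"]),
    "EX3": Term("EX3", "spiculated mass", is_a=["EX2"]),
}


def find_term(text, terms=TERMS):
    """Resolve a name or synonym (case-insensitive) to its Term, or None."""
    key = text.strip().lower()
    for term in terms.values():
        if key == term.name.lower() or key in (s.lower() for s in term.synonyms):
            return term
    return None


def subsumes(ancestor_id, term_id, terms=TERMS):
    """True if ancestor_id equals term_id or is reachable from it via is-a links."""
    if term_id == ancestor_id:
        return True
    return any(subsumes(ancestor_id, parent, terms)
               for parent in terms[term_id].is_a)


if __name__ == "__main__":
    print(find_term("MASA").term_id, subsumes("EX1", "EX3"))
```

With subsumption available, a query for a general term can also return data annotated with any of its more specific descendants.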

Fig. 10.11 RadLex controlled terminology
(http://radlex.org). RadLex includes term hierarchies
for describing anatomy (“anatomical entity”), imaging
observations (“imaging observation”) and characteristics
(“imaging observation characteristic”), imaging
procedures and procedure steps (“procedure step”),
diseases (“pathophysiologic process”), treatments
(“treatment”), and components of radiology reports
(“report”). Each term includes definitions, preferred
name, image exemplars, and other term metadata and
relationships such as subsumption

RadLex has been used in several imaging
informatics applications, such as to
improve search for radiology information.
RadLex-based indexing of radiology journal
figure captions achieved very high precision
and recall, and significantly improved
image retrieval over keyword-based search
(Kahn and Rubin 2009). RadLex has been
used to index radiology reports (Marwede
et al. 2008). Work is underway to introduce
RadLex controlled terms into radiology
reports to reduce radiologist variation in
use of terms for describing images (Kahn Jr.
et al. 2009). Tools are beginning to appear
enabling radiologists to annotate and query
image databases using RadLex and other
controlled terminologies (Rubin et al. 2008b;
Channin et al. 2009a, b).

In addition to RadLex, there are other
important controlled terminologies for
radiology. The Breast Imaging Reporting
and Data System (BI-RADS) is a lexicon
of descriptors and a reporting structure
comprising assessment categories and management
recommendations created by the
American College of Radiology (D’Orsi and
Newell 2007). Terminologies are also being
created in other radiology imaging domains,
including the Fleischner Society Glossary
of terms for thoracic imaging (Hansell et al.
2008), the Nomenclature of Lumbar Disc
Pathology (Appel 2001), terminologies for
image guided tumor ablation (Goldberg
et al. 2009) and transcatheter therapy for
hepatic malignancy (Brown et al. 2009), and
the CT Colonography Reporting and Data
System (Zalis et al. 2005).


Knowledge Representation of Radiology Procedures

An important kind of knowledge to represent
about an image is the type of imaging procedure
that produced it. While RadLex contains
atomic terms for the various modalities, such 
as CT and MRI, it lacks the full spectrum of 
types of procedures that can be performed 
on a patient. The RadLex Playbook (Wang 
et al. 2017) is a project of the Radiological 
Society of North America that provides a 
standard system for naming radiology pro- 
cedures, based on atomic terms (usually from 
RadLex) that define an imaging procedure, 
such as “CT Head.” By providing standard 
names and codes for radiologic studies, the 
RadLex Playbook can facilitate a variety of 
operational and quality improvement efforts, 
including workflow optimization, chargemas- 
ter management, radiation dose tracking, 
enterprise integration and image exchange 
(Wang et al. 2017).

The RadLex Playbook grammar describes
how to create the pre-coordinated Playbook
terms across the defining name attributes.
Each such term is composed of several
RadLex atomic terms. The unique combination
of RadLex clinical terms defines a unique
Playbook term, which is given a unique iden- 
tifier (the RadLex Playbook ID, or RPID). 
Thus, for each RPID there is a corresponding 
set of RadLex IDs that link to the associated 
RadLex clinical terms. This knowledge rep- 
resentation can be very useful for retrieving 
particular types of images from systems that 
support Playbook, such as “retrieve all CT 
of the head,” which would include CT Head 
w/wo (with and without contrast agent), CT 
Head Angio w/wo, CT Orbits wo, CT Temporal
Bone w/wo, etc. 
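
The retrieval described above can be sketched as a subset test over the atomic terms attached to each procedure. This is only an illustration: the identifiers, the term strings, and the assumption that every head study carries an explicit head-region term are hypothetical and do not reproduce actual RPIDs or RadLex IDs.

```python
# Hypothetical Playbook-style entries: the IDs and atomic terms are made up.
PLAYBOOK = {
    "RPID-A": {"name": "CT Head w/wo",
               "terms": {"modality:CT", "region:head", "contrast:w/wo"}},
    "RPID-B": {"name": "CT Orbits wo",
               "terms": {"modality:CT", "region:head", "focus:orbits", "contrast:wo"}},
    "RPID-C": {"name": "MR Brain wo",
               "terms": {"modality:MR", "region:head", "contrast:wo"}},
    "RPID-D": {"name": "CT Chest w",
               "terms": {"modality:CT", "region:chest", "contrast:w"}},
}


def find_procedures(required_terms, playbook=PLAYBOOK):
    """Return names of procedures whose atomic terms include all required terms."""
    required = set(required_terms)
    return sorted(entry["name"] for entry in playbook.values()
                  if required <= entry["terms"])


# "Retrieve all CT of the head" becomes a query on two atomic terms.
ct_head_studies = find_procedures({"modality:CT", "region:head"})
```

Because each pre-coordinated name is decomposed into atomic terms, the query does not need to enumerate every procedure name that happens to involve CT of the head.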

The Logical Observation Identifiers, 
Names, and Codes (LOINC®) terminol- 
ogy includes radiology terms (Vreeman and 
McDonald 2005), and recently a unified
LOINC/Playbook model and terminology
for radiology procedure names has
been created that represents the attributes of
term names with an extensible set of values
and provides a LOINC code and display name
for each procedure (Vreeman et al. 2018).
There is also a single integrated governance 
process for managing the unified terminology. 


Semantic Representation of Image Contents

While ontologies and controlled terminolo- 
gies are useful for representing knowledge 
related to images, they do not provide a means 
to directly encode assertions for recording the 
semantic content in images. For example, we 
may wish to record the fact that “there is a 
mass 4x5 cm in size in the right lobe of the 
liver.” The representation of this seman- 
tic image content certainly will use ontolo- 
gies and terminologies to record the entities 
to which such assertions refer; however, an 
information model is required to provide the 
required grammar and syntax for recording 
such assertions. There are two approaches to
recording these assertions: using no formal
information model (narrative text) or using a
formal information model.


Narrative text

In the current workflow, nearly all seman- 
tic image content is recorded in narrative 
text (radiology reports). The advantage of 
text reports is that they are simple, quick to 
produce (the radiologist speaks freely into a 
microphone), and they can be expressive, cap- 
turing the subtle nuances (and ambiguities) 
that the English language provides. There 
are several downsides, however. First, text 
reports are unstructured; there is no adherence
to controlled terminology and no consistent
structure that would permit reliable
information extraction. Second, the reports
may be incomplete, vague, or contradictory.
Further, free text is challenging for computers
to process (see Chap. 8), which makes it difficult
to leverage free text in applications. Finally, 
radiology images and the corresponding radi- 
ologist report are currently disconnected; e.g., 
the report may describe a mass in an organ, 
and the image may contain a region of inter- 
est (ROI) measuring the lesion, but there is no 
information directly linking the description of 
the lesion in the report with the ROI in the 
image. Such linkage could enable applica- 
tions such as content-based image retrieval, 
as described below. 

Structured reporting of radiology results
has increasingly become the standard
process for generating reports, usually using
macros and templates (Weiss and Langlotz 
2008; Langlotz 2009; Schwartz et al. 2011). In 
structured reports, a variety of fields is pro- 
vided, such as a list of organs visualized on 
the image and a list of radiologist observa- 
tions about them. Though structured reports 
improve on the ability to recognize and extract 
particular types of information in reports, 
they generally do not use controlled terminol- 
ogies, and their content is usually recorded in 
narrative free text, so this method of record- 
ing semantic information about images is not 
much more computer-accessible than a fully 
narrative radiology report. 


Information model

An information model provides an explicit
specification of the types of data to be collected
and the syntax by which it will be saved.

[Fig. 10.12 image panels: radiologist interpretation text (“There is a hypodense mass measuring 4.5 x 3.5 cm in the right lobe of the liver, likely a metastasis.”); panel labels “Terminology” and “Semantic Annotation”]

Fig. 10.12 Semantic annotation of images. The
radiologist’s image annotation (left) and interpretation
(middle) associated with the annotation are not represented
in a form such that the detailed content is directly
accessible. The same information can be put into a
structured representation as a semantic annotation
(right), comprising terms from controlled terminologies
(Systematized Nomenclature of Medicine (SNOMED)
and RadLex) as well as numeric values (coordinates and
measurements). (Figure reprinted with permission from
(Rubin and Napel 2010). © Thieme)

So-called “semantic annotation” methods
are being developed to adapt the semantic
content about images that would have been
put into narrative text so that it can instead
be put in structured annotations compliant
with the information model. The information
model conveys the pertinent image information
explicitly and in a human-readable and
machine-accessible format. For example, a
semantic annotation might record the coordinates
of the tip of an arrow and indicate
the organ (anatomic location) and imaging
observations (e.g., mass) in that organ. These
annotations can be recorded in a standard,
searchable format, such as the Annotation
and Image Markup (AIM) schema, developed
by the National Cancer Imaging Program of
NCI for storing and sharing image metadata
(caBIG In-vivo Imaging Workspace 2008;
Rubin et al. 2008a, 2009a). AIM captures a
variety of information about image annotations,
e.g., regions of interest, a lesion identification
label, lesion location, measurements
of image regions, method of measurement,
radiologist observations, anatomic locations
of abnormalities, calculations, inferences,
and other qualitative and quantitative image
features (Channin et al. 2009a, b). The image
metadata also include information about the
image, such as the name of the imaging procedure
and how or when the image was acquired.
AIM supports controlled terminologies, 
enabling semantic interoperability. AIM has 
recently been incorporated into the DICOM 
Structured Report (DICOM SR) standard 
(DICOM Standards Committee - Working 
Group 8 — Structured Reporting 2017), with 
specifications for saving AIM in DICOM-SR 
(DICOM Standards Committee 2017). Given 
that DICOM is the international standard for 
specifying image data, it is hoped that there 
will be widespread adoption of the AIM/ 
DICOM-SR format to enable interoperability 
of image annotations across systems. 
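
As an illustration of the kind of record a semantic annotation carries, the sketch below builds a small XML structure that links an image identifier, a region of interest, controlled terms, and a measurement. The element and attribute names are hypothetical and chosen for readability; they are not the actual AIM schema or the DICOM-SR encoding.

```python
import xml.etree.ElementTree as ET


def build_annotation(image_uid, roi_points, location, observation, size_cm):
    """Build a toy semantic annotation (element names are illustrative only)."""
    root = ET.Element("Annotation", imageUID=image_uid)
    ET.SubElement(root, "AnatomicLocation", term=location)
    ET.SubElement(root, "ImagingObservation", term=observation)
    ET.SubElement(root, "Measurement", unit="cm",
                  longAxis=str(size_cm[0]), shortAxis=str(size_cm[1]))
    roi = ET.SubElement(root, "RegionOfInterest")
    for x, y in roi_points:
        ET.SubElement(roi, "Point", x=str(x), y=str(y))
    return root


def matches(annotation, location, observation):
    """Query helper: does the annotation assert this observation at this location?"""
    loc = annotation.find("AnatomicLocation")
    obs = annotation.find("ImagingObservation")
    return loc.get("term") == location and obs.get("term") == observation


example = build_annotation("1.2.3", [(10, 20), (30, 40)],
                           "right lobe of liver", "mass", (4.5, 3.5))
xml_text = ET.tostring(example, encoding="unicode")
```

Because the location and observation are stored as coded fields rather than prose, a query such as "all masses in the right lobe of the liver" reduces to a field comparison.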

The AIM information model includes use 
of controlled terms as semantic descriptors 
of lesions (e.g., RadLex). It also provides a 
syntax associating an ROI in an image with 
the aforementioned information, enabling raw 
image data to be linked with semantic infor- 
mation, and thus bridges the current discon- 
nect between semantic terms and the lesions 
in images being described. In conjunction 
with RadLex, the AIM information model 
provides a standard syntax (in XML schema) 
to create a structured representation of the 
semantic contents of images (Fig. 10.12).
Once the semantic contents are recorded in
AIM (as XML instances of the AIM XML
schema), applications can be developed for
image query and analysis.

AIM has been gaining traction in the
research community. A number of diverse
research projects have embraced and have
been enabled by AIM (Levy et al. 2009; Napel
et al. 2010; Gevaert et al. 2011; Gimenez et al.
2011a, b; Hoang et al. 2011; Levy and Rubin
2011a, b; Napel et al. 2011; Plevritis et al. 2011;
Gevaert et al. 2012a, b; Levy et al. 2012). An
increasing number of tools that support AIM
and facilitate creating semantic annotations on
images as part of the image viewing workflow
are being developed, including open source
projects such as Osirix (Rubin and Snyder
2011), ClearCanvas (Klinger 2010; National
Cancer Institute 2012), Slicer (Pieper et al.
2004; Fedorov 2012; Fedorov et al. 2012), and
ePAD (Rubin and Snyder 2011). There are
also several commercial applications using
AIM that are in development (Rubin et al.
2010; Zimmerman et al. 2011). Automated
semantic image annotation methods are
also being pursued (Carneiro et al. 2007;
Mechouche et al. 2008; Yu and Ip 2008) that
will ultimately make the process of generating
this structured information efficient.

[Figure panel label: Annotations linked to images]

Fig. 10.13 The electronic Imaging Physician Annotation
Device (ePAD). This tool creates structured
semantic annotations on images using a graphical interface
to minimize impact on image viewing workflow.
The user views the image and draws a region of interest
(left). ePAD incorporates ontologies so that users
can specify controlled terms as values in making their
annotations (pull down panel on right). As they make
their annotation, they receive feedback to ensure data
entries are complete and that there are no violations of
pre-specified annotation logic (panel on lower right).
The ePAD tool saves image annotations in the AIM
information model XML format

The electronic Imaging Physician Annotation
Device (ePAD; Rubin and Snyder 2011)
is a freely available web-based image viewing
and AIM-compliant annotation tool. ePAD
permits users to draw image annotations in
the manner to which they are accustomed while
viewing images, while simultaneously collecting
semantic information about the image and the
image region directly from the image itself as
well as from the user using a structured reporting
template (Fig. 10.13). The tool also features
a panel to provide feedback so as to ensure
complete and valid annotations. Image annota- 
tions are saved in the AIM XML format. 

Fig. 10.14 ePAD and AIM within the clinical/research
environment. (1) Images are acquired and
stored in the hospital PACS, (2) the radiologist uses
ePAD to review the images and to make measurements
on cancer lesions, (3) the measurements (saved as AIM
XML in ePAD) with links to the images are stored by
the AIM Data Service, and (4) a variety of software
applications can use the AIM Data Service to access the
image metadata for different purposes, such as listing
the lesion measurements or generating a summary of
patient response assessment

By making the semantic content of images
explicit and machine-accessible, these structured
annotations of images will help radiologists
analyze data in large databases of images.
Figure 10.14 shows how image annotations
in AIM can be integrated into routine
research and clinical workflows. Images
acquired from imaging devices flow into the
PACS and can be viewed on an imaging workstation.
If the imaging workstation supports
AIM (e.g., ePAD as shown in the figure), then
image annotations and radiologist observations
are saved in the AIM format and stored
in a database of AIM annotation files (an XML
database in the case of ePAD). The AIM
annotations have a pointer to their corresponding
images, and queries and analyses can
be done on the image annotations for clinical
applications, such as summarizing the change
in cancer lesion sizes over time for assessing
response to cancer treatment (Fig. 10.14).
Cancer patients often have many serial imaging
studies in which a set of lesions is evaluated
at each time point. Automated tools such
as ePAD can use semantic image annotations 
to identify the measurable lesions at each time 
point and produce a summary of, and auto- 
matically reason about, the total tumor bur- 
den over time, helping physicians to determine 
how well patients are responding to treatment 
(Levy and Rubin 2008). 
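
A minimal sketch of this kind of summary is given below. It assumes each annotation has already been reduced to a record with a time point, a measurability flag, and a longest diameter; the field names and the use of a simple sum of longest diameters as "tumor burden" are simplifying assumptions for illustration, not the method of any particular tool or response criterion.

```python
def tumor_burden(annotations):
    """Sum the longest diameter (mm) of measurable lesions at each time point."""
    totals = {}
    for a in annotations:
        if a["measurable"]:
            t = a["timepoint"]
            totals[t] = totals.get(t, 0.0) + a["longest_diameter_mm"]
    return dict(sorted(totals.items()))


def percent_change_from_baseline(totals):
    """Express each time point's burden as a percent change from the first one."""
    timepoints = list(totals)
    baseline = totals[timepoints[0]]
    return {t: 100.0 * (totals[t] - baseline) / baseline for t in timepoints}


# Hypothetical lesion records extracted from annotations of two serial studies.
records = [
    {"timepoint": 0, "lesion": "L1", "measurable": True, "longest_diameter_mm": 30.0},
    {"timepoint": 0, "lesion": "L2", "measurable": True, "longest_diameter_mm": 20.0},
    {"timepoint": 0, "lesion": "L3", "measurable": False, "longest_diameter_mm": 4.0},
    {"timepoint": 1, "lesion": "L1", "measurable": True, "longest_diameter_mm": 21.0},
    {"timepoint": 1, "lesion": "L2", "measurable": True, "longest_diameter_mm": 14.0},
]
burden = tumor_burden(records)
change = percent_change_from_baseline(burden)
```

In this made-up example the summed diameter falls from 50 mm to 35 mm, a 30% decrease from baseline; deciding what such a change means clinically is left to the applicable response criteria.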


Atlases

Spatial representations of anatomy, in the 
form of segmented regions on 2-D or 3-D 
images, or 3-D surfaces extracted from image 
volumes, are often combined with symbolic 
representations to form digital atlases. A digi- 
tal atlas (which for this chapter refers to an 
atlas created from 3-D image data taken from 
real subjects, as opposed to artists’ illustrations)
is generally created from a single individual,
which therefore serves as a “canonical”
instance of the species. Traditionally, atlases
have been primarily used for education, and
most digital atlases are used the same way.

As an example in 2-D, the Digital
Anatomist Interactive Atlases (Sundsten et al. )
were created by outlining ROIs
on 2-D images (many of which are snapshots
of 3-D scenes generated by reconstruction
from serial sections) and labeling the regions
with terminology from the FMA. The atlases,
which are available on the web, permit interactive
browsing, where the names of structures
are given in response to mouse clicks;
dynamic creation of “pin diagrams”, in which
selected labels are attached to regions on the
images; and dynamically-generated quizzes, in
which the user is asked to point to structures
on the image (Brinkley et al. ).

Fig. 10.15 The Digital Anatomist Dynamic Scene
Generator. The user can select a set of 3-D anatomical
models, in which each model is associated with an entity
from the Foundational Model of Anatomy (FMA) to
select a starting structure, load relations from the FMA
between that structure and related structures, and, if
3-D models are available, to add those models to the
scene. The evolving scene can be manipulated in real-time
on the web and can be saved as a standalone scene
that can be saved locally or accessed via URL, thus permitting
it to be embedded in other web apps. The scene
shown here was created by first requesting one of the
muscles of the thorax, then exploring FMA to find the
heart, aorta and branches, and thoracic vertebral column.
In the scene on the left, the first thoracic vertebra
was clicked, which caused it to be highlighted, and the
relations between it and other structures are then shown
in the panel on the right. The top pane of the right
panel shows the structures with which the first thoracic
vertebra articulates. (Figure used with permission
from Jim Brinkley)

As an example in 3-D, the Digital
Anatomist Dynamic Scene Generator (DSG,
Fig. 10.15) creates interactive 3-D atlases
“on-the-fly” for viewing and manipulation
over the web (Brinkley et al. ; Wong et al.
1999). An example of a 3-D brain atlas created
from the Visible Human is Voxelman
(Hohne et al. 1995), in which each voxel in the 
Visible Human head is labeled with the name 
of an anatomic structure in a “generalized 
voxel model” (Hohne et al. 1990), and highly- 
detailed 3-D scenes are dynamically gener- 
ated. Several other brain atlases have also 
been developed, primarily for educational 
use (Johnson and Becker 2001; Stensaas and 
Millhouse 2001). 

Atlases have also been developed for 
integrating functional data from multiple 
studies (Bloom and Young 1993; Toga et al. 
1994, 1995; Swanson 1999; Fougerousse 
et al. 2000; Rosen et al. 2000; Martin and 
Bowden 2001). In their original published 
form these atlases permit manual drawing 
of functional data, such as neurotransmit- 
ter distributions, onto hardcopy printouts 
of brain sections. Many of these atlases 
have been or are in the process of being 
converted to digital form. The Laboratory 
of Neuroimaging (LONI) at the University 
of California Los Angeles has been particu- 
larly active in the development and analy- 
sis of digital atlases (Toga 2001), and the 
California Institute of Technology Human 
Brain Project has released a web-accessible 
3-D mouse atlas acquired with micro-MR 
imaging (Dhenain et al. 2001). 

The most widely used human brain atlas 
is the Talairach atlas, based on post mortem 
sections from a 60-year-old woman (Talairach 
and Tournoux 1988). This atlas introduced a 
proportional coordinate system (often called 
“Talairach space”) which consists of 12 rect- 
angular regions of the target brain that are 
piecewise affine transformed to corresponding 
regions in the atlas. Using these transforms 
(or a simplified single affine transform based 
on the anterior and posterior commissures) 
a point in the target brain can be expressed 
in Talairach coordinates, and thereby related 
to similarly transformed points from other 
brains. Other human brain atlases have also 
been developed (Schaltenbrand and Warren 
1977; Hohne et al. 1992; Caviness et al. 1996; 
Drury and Van Essen 1997; Van Essen and 
Drury 1997). 
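
The simplified single affine transform mentioned above can be sketched as a 3x4 matrix applied to a point. The matrix values below are made up for illustration; in practice they would be derived from landmarks such as the anterior and posterior commissures, and the piecewise version would select one of 12 such matrices according to the region containing the point.

```python
def apply_affine(matrix, point):
    """Apply a 3x4 affine matrix [A | t] to an (x, y, z) point."""
    x, y, z = point
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix)


# Hypothetical transform: per-axis scaling plus a shift that maps the
# subject's landmark at (10, 20, 30) onto the atlas origin.
SUBJECT_TO_ATLAS = [
    [0.5, 0.0, 0.0, -5.0],
    [0.0, 1.0, 0.0, -20.0],
    [0.0, 0.0, 2.0, -60.0],
]

landmark_in_atlas = apply_affine(SUBJECT_TO_ATLAS, (10, 20, 30))
nearby_in_atlas = apply_affine(SUBJECT_TO_ATLAS, (12, 20, 31))
```

Once points from different brains are expressed in the same atlas coordinates, they can be compared directly.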

10.4 Image Processing


Image processing is a form of signal pro- 
cessing in which computational methods 
are applied to an input image to produce an 
output image or a set of characteristics or 
parameters related to the image. Most image 
processing techniques involve treating the 
image as a two-dimensional signal and ana- 
lyzing it using signal-processing techniques 
or a variety of other transformations or computations.
There is a broad variety of image
processing methods, including transformations
to enhance visualization, computations
to extract features, and systems to automate
detection or diagnosis of abnormalities in the
images. The last of these, referred to
as computer-assisted detection and diagnosis
(CAD), is discussed in Sect. 10.5.2. In this
section we discuss the first two groups of
methods, which are more elemental and
generic processing methods.

The rapidly increasing number and variety
of digital images have created many opportunities
for image processing, since one of the
great advantages of digital images is that they 
can be manipulated just like any other kind 
of data. This advantage was evident from the 
early days of computers, and success in pro- 
cessing satellite and spacecraft images gener- 
ated considerable interest in biomedical image 
processing, including automated image anal- 
ysis to improve radiological interpretation. 
Beginning in the 1960s, researchers devoted 
a large amount of work to this end, with the 
hope that eventually much of radiographic 
image analysis could be improved. One of the 
first areas to receive attention was automated 
interpretation of chest X-ray images, because, 
previously, most patients admitted to a hos- 
pital were subjected to routine chest X-ray 
examinations. (This practice is no longer 
considered cost effective except for selected 
subgroups of patients.) Recent research in
deep learning (discussed below), however,
has raised enthusiasm and hopes
for automating certain tasks of radiographic 
image interpretation, such as detection of 
pneumonia (Rajpurkar et al. 2017). While 
most of the emphasis of image processing
continues to be on systems that aid the user
in viewing and manipulating images, there is
a quickly growing body of work on completely
automated analysis of images, particularly
for lesion detection, image segmentation, and 
image classification (diagnosis). 

Medical image processing utilizes tools
similar to general image processing, but
medical imagery has unique features
that present different, and often more difficult,
challenges from those that exist in general
image processing tasks. To begin with, the
images analyzed all represent the 3D body;
thus, the information extracted (be it in 2D
or 3D) is based on a 3D volumetric object.
The images themselves are often taken from
multiple modalities (CT, MRI, PET), where each
modality has its own unique physical charac- 
teristics, leading to unique noise, contrast and 
other issues that need to be addressed. The 
fusion of information across several modali- 
ties is a challenge that needs to be addressed 
as well. 

When analyzing the data, it is often desir- 
able to segment and characterize specific 
organs. The organs of the human body, or various
tissues of interest within them, cannot be
described with simple geometrical rules, as 
opposed to objects and scenes in non-medi- 
cal images that usually can be described with 
such representations. This is mainly because 
the objects and free-form surfaces in the body 
cannot easily be decomposed into simple 
geometric primitives. There is thus very little 
use of geometric shape models that can be 
defined from a-priori knowledge. Moreover, 
when trying to model the shape of an organ 
or a region, one needs to keep in mind that 
there are large inter-person variations (e.g., 
in the shape and size of the heart, liver and 
so on), and, as we are frequently analyzing 
images of patients, there is a large spectrum 
of abnormal states that can greatly modify 
tissue properties or deform structures. Finally, 
especially in regions of interest that are close 
to the heart, complex motion patterns need to 
be accounted for as well. These issues make 
medical image processing a very challenging 
domain. 

The widespread availability of digital
images, combined with image management
systems such as PACS (Chap. 22)
and powerful workstations, has led to many 
applications of image processing techniques. 
In general, routine techniques are available 
on the manufacturer’s workstations (e.g., a 
vendor-provided console for an MR machine 
or an ultrasound machine), whereas more 
advanced image-processing algorithms are 
available as software packages that run on 
independent workstations. 

The primary uses of image processing 
in the clinical environment are for image 
enhancement, screening, and quantitation. 
Software for such image processing is pri- 
marily developed for use on independent 
workstations. Several journals are devoted 
to medical image processing (e.g., IEEE
Transactions on Medical Imaging, Journal of
Digital Imaging, Neuroimage), and the number
of journal articles is rapidly increasing
as digital images become more widely available.
Several books are devoted to the spectrum
of digital image processing methods
(Yoo 2004; Gonzalez et al. 2009), and the 
reader is referred to these for more detailed 
reading on these topics. We describe a few 
examples of image-processing techniques in 
the remainder of this section. 


10.4.1 Types of Image-Processing Methods


Fig. 10.16 Diagram of a typical image processing pipeline

Image processing methods are applied to representations
of image content (Sect. 10.3).
One may use the very low-level, pixel representation.
The computational effort is minimal in
the representation stage, with substantial effort
(computational cost) in further analysis stages
such as segmentation of the image, matching
between images, registration of images, etc.
A second option is to use a very high-level
image content representation, in which each
image is labeled according to its semantic
content (medical image categories such as
“abdomen vs chest”, “healthy vs pathology”).
In this scenario, a substantial computational
effort is needed in the representation
stage, including the use of automated image
segmentation methods to recognize ROIs as
well as advanced learning techniques to classify
the regions of image content. Further
analysis can utilize knowledge resources such 
as ontologies, linked to the images using cat- 
egory labels. A mid-level representation exists, 
that balances the above two options, in which 
a transition is made from pixels to semantic 
features. Feature vectors are used to represent 
the spectrum of image content compactly and 
subsequent analysis is done on the feature 
vector representation. 
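
As a minimal sketch of a mid-level representation, the function below summarizes a region of pixels as a short feature vector. The particular features (mean, standard deviation, and a coarse normalized histogram) and the 8-bit intensity range are assumptions chosen for illustration; real systems typically use richer texture and shape features.

```python
from statistics import mean, pstdev


def feature_vector(region, bins=4, max_value=255):
    """Summarize a 2-D region as [mean, std dev, normalized histogram bins...]."""
    pixels = [p for row in region for p in row]
    hist = [0] * bins
    for p in pixels:
        # Map the intensity to one of `bins` equal-width intervals.
        hist[min(p * bins // (max_value + 1), bins - 1)] += 1
    return [mean(pixels), pstdev(pixels)] + [h / len(pixels) for h in hist]


# A tiny hypothetical region: half dark, half bright pixels.
features = feature_vector([[0, 0], [255, 255]])
```

Subsequent steps such as classification or similarity search then operate on these few numbers rather than on the raw pixels.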

Image processing is the foundation for cre- 
ating image-based applications, such as image 
enhancement to facilitate human viewing, to 
show views not present in the original images, 
to flag suspicious areas for closer examination 
by the clinician, to quantify the size and shape 
of an organ, and to prepare the images for 
integration with other information. To create 
such applications, several types of image pro- 
cessing are generally performed sequentially 
in an image processing pipeline, although some 
processing steps may feed back to earlier ones, 
and the specific methods used in a pipeline
vary with the application. Most image processing
pipelines and applications generalize
from two-dimensional to three-dimensional 
images, though three-dimensional images 
pose unique image processing opportunities 
and challenges. Image processing pipelines are 
generally built using one or more of the fol- 
lowing fundamental image processing meth- 
ods: global processing, image enhancement, 
image rendering/visualization, image quanti- 
tation, image segmentation, image registra- 
tion, and image reasoning (e.g., classification). 
Those steps are shown in Fig. 10.16. In the
remainder of this section we describe these
methods, except for image reasoning, which is
discussed in Sect. 10.5.


10.4.2 Global Processing 


Global processing involves computations on 
the entire image, without regard to specific 
regional content. The purpose is generally to 
enhance an image for human visualization or 
for further analysis by the computer (“pre- 
processing”). A simple but important example 
of global image processing is gray-scale win- 
dowing of CT images. The CT scanner gener- 
ates pixel values (Hounsfield numbers, or CT 
numbers) in the range of -1000 to +3000.
Humans, however, cannot distinguish more 
than about 100 shades of gray. To appreciate 
the full precision available with a CT image, 
the operator can adjust the midpoint and 
range of the displayed CT values. By chang- 
ing the level and width (i.e., intercept and 
slope of the mapping between pixel value and 
displayed gray scale or, roughly, the bright- 
ness and contrast) of the display, radiologists 
enhance their ability to perceive small changes 
in contrast resolution within a subregion of 
interest. 
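
Gray-scale windowing can be sketched as a linear mapping that is clipped at both ends. The function name and the 256-level display are assumptions, and the level and width used in the example are illustrative values rather than a recommended clinical setting.

```python
def apply_window(hu, level, width, gray_levels=256):
    """Map a CT number to a display gray level for a given window level and width."""
    low = level - width / 2.0
    high = level + width / 2.0
    if hu <= low:
        return 0                      # everything below the window is black
    if hu >= high:
        return gray_levels - 1        # everything above the window is white
    # Inside the window, gray level rises linearly with the CT number.
    return int((hu - low) / width * (gray_levels - 1))


# Illustrative window: level 40, width 400 (covers CT numbers -160 to +240).
air = apply_window(-1000, level=40, width=400)
midpoint = apply_window(40, level=40, width=400)
dense_bone = apply_window(3000, level=40, width=400)
```

Narrowing the width steepens the mapping, so small differences in CT number inside the window are spread over more gray levels, at the cost of clipping everything outside it.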

Other types of global processing change 
the pixel values to produce an overall enhance- 
ment or desired effect on the image: histogram 
equalization, convolution, and filtering. In 
histogram equalization, the pixel values are 
changed, spreading out the most frequent 
intensity values to increase the global contrast 
of the image. It is most effective when the 
usable data of the image are represented by a
narrow range of contrast values. Through this
adjustment, the intensities can be better dis- 
tributed on the histogram, improving image 
contrast by allowing for areas of lower local 
contrast to gain a higher contrast. In convo- 
lution and filtering, mathematical functions 
are applied to the entire image for a variety of 
purposes, such as de-noising, edge enhance- 
ment, and contrast enhancement. 
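
Histogram equalization can be sketched with the cumulative histogram used as a lookup table. This is one common textbook formulation, written here in plain Python for an image stored as nested lists of integer gray levels; it is a sketch under those assumptions rather than the implementation used by any particular workstation.

```python
def equalize(image, levels=256):
    """Spread gray levels using the cumulative histogram as a lookup table."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution: number of pixels at or below each gray level.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:
        return [row[:] for row in image]   # constant image: nothing to spread
    lut = [round(max(c - cdf_min, 0) / (n - cdf_min) * (levels - 1)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# Four nearly identical gray levels are stretched across the full range.
stretched = equalize([[100, 101], [102, 103]])
```

The input values span only four adjacent gray levels, while the output spans the whole display range, which is the effect described above for images whose usable data occupy a narrow range of values.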


10.4.3 Image Enhancement 


Image enhancement uses global processing to 
improve the appearance of the image either 
for human use or for subsequent process- 
ing by computer. The consoles of all vendor 
image viewing platforms and independent 
image-processing workstations provide some 
form of image enhancement. We have already 
mentioned CT windowing. Another tech- 
nique is unsharp masking, in which a blurred, 
or “unsharp,” positive is created to be used as 
a “mask” that is combined with the original 
image, creating the illusion that the resulting 
image is sharper than the original. The tech- 
nique increases local contrast and enhances 
the visibility of fine-detail (high-frequency) 
structures. Histogram equalization spreads 
the image gray levels throughout the vis- 
ible range to maximize the visibility of those 
gray levels that are used frequently. Temporal 
subtraction subtracts a reference image from 
later images that are registered to the first. 
A common use of temporal subtraction is 
digital-subtraction angiography (DSA) in 
which a background image is subtracted from 
an image taken after the injection of contrast 
material. 
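
Unsharp masking can be sketched by blurring the image and adding back the difference between the original and the blurred copy. The 3x3 mean filter, the `amount` parameter, and the 8-bit clipping are assumptions made for this illustration; other blur kernels are equally valid.

```python
def box_blur(image):
    """3x3 mean filter; border pixels average over the neighbors that exist."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            vals = [image[rr][cc]
                    for rr in range(max(r - 1, 0), min(r + 2, h))
                    for cc in range(max(c - 1, 0), min(c + 2, w))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def unsharp_mask(image, amount=1.0, max_value=255):
    """Sharpen by adding amount * (original - blurred), clipped to the valid range."""
    blurred = box_blur(image)
    return [[min(max(round(p + amount * (p - b)), 0), max_value)
             for p, b in zip(row, blurred_row)]
            for row, blurred_row in zip(image, blurred)]


# A vertical edge between a dark and a bright region.
edge = [[10, 10, 200, 200] for _ in range(3)]
sharpened = unsharp_mask(edge)
```

Pixels next to the edge are pushed further apart (darker on the dark side, brighter on the bright side), while pixels in uniform areas are unchanged, which is why the result appears sharper.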


10.4.4 Image Rendering/Visualization


Image rendering and visualization refer to a
variety of techniques for creating image displays,
diagrams, or animations that present
images from a different perspective than
the raw images. Image volumes are composed
of a stack of 2-D images. If the voxels in each
image are isotropic, then a variety of arbitrary
projections can be derived from the volume,
such as a sagittal or coronal view, or even
curved planes. Maximum intensity projection
(MIP) and minimum intensity projection
(MinIP) images can also be created,
in which imaginary rays are cast through
the volume, recording the maximum or mini- 
mum intensity encountered along the ray 
path, respectively, and displaying the result as 
a 2-D image. 
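
A small sketch of MIP and MinIP follows, assuming the simplest geometry in which the rays run parallel to the slice axis of a volume stored as `volume[z][y][x]`. The function name and the storage layout are assumptions made for illustration.

```python
def intensity_projection(volume, mode="max"):
    """Project volume[z][y][x] along z, keeping one voxel value per ray."""
    if mode not in ("max", "min"):
        raise ValueError("mode must be 'max' (MIP) or 'min' (MinIP)")
    pick = max if mode == "max" else min
    depth = len(volume)
    rows, cols = len(volume[0]), len(volume[0][0])
    return [
        [pick(volume[z][y][x] for z in range(depth)) for x in range(cols)]
        for y in range(rows)
    ]


# Two 2x2 slices stacked into a tiny volume.
volume = [
    [[1, 5], [3, 0]],
    [[4, 2], [9, 7]],
]
mip = intensity_projection(volume, "max")
minip = intensity_projection(volume, "min")
```

Projections along oblique directions would first require resampling the volume, which is omitted here.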

In addition to these planar visualizations, 
the volume can be visualized directly in its 
entirety using volume rendering techniques 
(Lichtenbelt et al. 1998) (Fig. 10.17), which 
project a two-dimensional image directly from 
a three-dimensional voxel array by casting 
rays from the eye of the observer through the 
volume array to the image plane. Because each 
ray passes through many voxels, some form 
of segmentation (usually simple threshold- 
ing) often is used to remove obscuring struc- 
tures. As workstation memory and processing 
power have advanced, volume rendering has 
become widely used to display all sorts of 
three-dimensional voxel data—ranging from 
cell images produced by confocal microscopy, 
to three-dimensional ultrasound images, to 
brain images created from MRI or PET. 

Volume images can also be given as input to 
image-based techniques for warping the image 
volume of one structure to another. However, 
more commonly the image volume is processed 
in order to extract an explicit spatial (or 
quantitative) representation of anatomy (Sect. 
10.4.5). Such an explicit representation permits 
improved visualization, quantitative analysis 
of structure, comparison of anatomy across a 
population, and mapping of functional data. It 
is thus a component of most research involving 
3-D image processing. 


10.4.5 Image Quantitation 


Image quantitation is the process of extracting 
useful numerical parameters or deriving cal- 
culations from the image or from ROIs in the 
image (often as part of “radiomics” analyses, 
described below) (Scheckenbach et al. 2017; 
Lohmann et al. 2018; Valdora et al. 2018). 
These values are also referred to as “quantita- 
tive imaging features.” These parameters may 
themselves be informative—for example, the 
volume of the heart or the size of the fetus. 
They also may be used as input into an auto- 
mated classification procedure, which deter- 
mines the type of object found. For example, 
small round regions on chest X-ray images 
might be classified as tumors, depending on 
such features as intensity, perimeter, and area. 
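
To make the notion of quantitative imaging features concrete, the sketch below computes three of the features named above (area, perimeter, and mean intensity) for an ROI given as a binary mask. The perimeter definition used here, a count of pixel edges that face background, is one simple convention among several, and the function name is hypothetical.

```python
def roi_features(image, mask):
    """Return area, perimeter, and mean intensity of the masked region."""
    h, w = len(mask), len(mask[0])
    area, perimeter, intensity_sum = 0, 0, 0
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            area += 1
            intensity_sum += image[y][x]
            # Each side of the pixel that faces background (or the image
            # border) contributes one unit to the perimeter.
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < h and 0 <= nx < w)
                if outside or not mask[ny][nx]:
                    perimeter += 1
    if area == 0:
        raise ValueError("mask is empty")
    return {
        "area": area,
        "perimeter": perimeter,
        "mean_intensity": intensity_sum / area,
    }


image = [[0, 0, 0, 0],
         [0, 50, 50, 0],
         [0, 50, 50, 0],
         [0, 0, 0, 0]]
mask = [[v > 0 for v in row] for row in image]
features = roi_features(image, mask)
```

A feature dictionary of this kind could then be passed to a classifier, as described in the text.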

Mathematical models often are used in 
conjunction with image quantitation. In 
classic pattern-recognition applications, the 
mathematical model is a classifier (learned 
using some type of supervised machine 
learning) that assigns a label to the image, 
e.g., to indicate whether the image contains 
an abnormality or to indicate the diagnosis 
underlying an abnormality. 


= Quantitative Image Features 

[Figure caption, continued: isotropic voxels (the 
dimension of pixels in the x,y plane is the same as in 
the z dimension). The volume is rendered directly using 
volume-rendering techniques] 

Quantitation of images uses global processing 
and segmentation to characterize regions 
of interest in the image with numerical values. 
There are two kinds of quantitative image 
features: pre-defined image features and learned 
features. Pre-defined image features encode 
domain knowledge, since they are designed to 
capture specific characteristics of the image or 
image region, such as texture, shapes, lesion 
margin characteristics, and image noise. Pre- 
defined image features are generally designed 
to capture characteristics of the image that 
reflect underlying biology, tissue function, or 
disease. For example, heart size, shape, and 
motion are subtle indicators of heart function 
and of the response of the heart to therapy 
(Clarysse et al. 1997). Similarly, fetal head size 
and femur length, as measured on ultrasound 
images, are valuable indicators of fetal well- 
being (Brinkley 1993). 

Pre-defined image features are quantitative 
representations of visual signals contained in 
an image. Two types of pre-defined image fea- 
tures are photometric features, which exploit 
color and texture cues, derived directly from 
raw pixel intensities, and geometric features, 
which use shape-based cues. While color is 
one of the visual cues often used for content 
description (Hersh et al. 2009), most medi- 
cal images are grayscale. Texture features 
encode spatial organization of pixel values 
of an image region. Shape features describe 
in quantitative terms the contour of a lesion 
and complement the information captured by 
color or texture (Depeursinge et al. 2014). In 
addition, the histogram of pixel values within 
an ROI, or of transforms of those values, is 
commonly used to compute quantitative 
image features. 

Pre-defined image features are commonly 
represented by feature vectors in an 
N-dimensional space, where each dimension 
of the feature vector describes an aspect of 
the individual pixel (e.g., color, texture, etc.) 
(Haralick and Shapiro 1992). Image analysis 
tasks that use the quantitative features, such 
as segmentation and classification, are then 
approached in terms of distance measurements 
between points (samples) in the chosen 
N-dimensional feature space. 
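
The distance-based view of feature vectors can be illustrated with a nearest-neighbor rule. In this sketch the two-dimensional feature vectors, the class names, and the choice of Euclidean distance are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor_label(query, labeled_vectors):
    """Return the label of the stored feature vector closest to the query."""
    closest = min(labeled_vectors, key=lambda item: euclidean(query, item[0]))
    return closest[1]


# Hypothetical (texture score, mean intensity) vectors with made-up labels.
training = [
    ((0.2, 10.0), "class_a"),
    ((0.3, 12.0), "class_a"),
    ((0.9, 40.0), "class_b"),
]
predicted = nearest_neighbor_label((0.8, 38.0), training)
```

In practice the features are usually rescaled first so that no single dimension dominates the distance.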

In contrast to pre-defined image features, 
learned features are derived by computa- 
tional analysis of the image itself without 
incorporating any domain knowledge. Deep 
learning methods (described below) are a 
common and popular approach for deriving 
learned features. In deep learning, the goal 
is to learn a task directly from a large col- 
lection of images; the parameters of deep 
learning models reflect image features that 
are extracted during the learning of these 
models (Milletari et al. 2016; Yasaka et al. 
2018). 


= Image Patches 

In the last several years, “patch-based” repre- 
sentations and “bag-of-features” classification 
techniques have been proposed and used as an 
approach to processing image contents (Jurie 
and Triggs 2005; Nowak et al. 2006; Avni 
2009). An overview of the methodology is 
shown in Fig. 10.18; it represents one of 
the types of “feature learning” being used for 
automated computer analysis of images (the 
other type of feature learning, deep learning, is 
discussed below). In image patch approaches, 
the atomic entity of analysis shifts from the 
pixel to a “patch,” a small window centered on 
the pixel, so that region-based information is 
included. A very large set of patches is 
extracted from an image. Each small patch 
shows a localized “glimpse” of the image 
content; a collection of thousands or more such 
patches, randomly selected, has the capability 
to identify the entire image content (similar to 
a puzzle being formed from its pieces). 

Patch extraction approaches include using 
a regular sampling grid, a random selection 
of points, or the selection of points with 
high information content using salient point 
detectors, such as SIFT (Lowe 1999). Once 
patches are selected, the information content 
within a patch is extracted. It is possible to 
take the patch information as a collection of 
pixel values, or to shift the representation to 
a different set of features based on the 
pixels, such as SIFT features. A final step in 
the process is to learn a dictionary of words 
over a large collection of patches, extracted 
from a large set of images. The patches, each 
represented as a vector, are converted into 
“visual words” which form a representative 
“dictionary”. A visual word can be considered 
as a representative of several similar patches. 
A frequently used method is to perform K-means 
clustering (Bishop 1995) over the vectors of 
the initial collection, grouping them into K 
clusters in the feature space. The resultant 
cluster centers serve as a vocabulary of K 
visual words, with K often in the hundreds or 
thousands. 

Once a global dictionary is learned, each 
image is represented as a collection of words 
(also known as a “bag of words”, or “bag of 
features”), using an indexed histogram over 
the defined words. Various image processing 
tasks can then be undertaken, including 
categorizing the image content (giving the 
image a “high-level,” more semantic label), 
matching between images or between an 
image and an image class, and using patches for 
image segmentation and region-of-interest 
detection within an image. For these various 
tasks, images are compared using a distance 
measure between the representative histograms. 
In categorizing an image as belonging 
to a certain image class, well-known 
classifiers, such as k-nearest neighbor and 
support-vector machines (SVM) (Vapnik 2000), 
can be used. 
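
The patch-based pipeline described above (patch extraction, dictionary learning by K-means, and histogram representation) can be sketched as follows. This is a toy illustration: patches are taken on a regular grid, raw pixel values serve as the patch descriptor, and all function names and parameters are assumptions rather than a reference implementation.

```python
import random


def extract_patches(image, size, stride):
    """Flatten size x size patches sampled on a regular grid."""
    h, w = len(image), len(image[0])
    return [
        [image[y + dy][x + dx] for dy in range(size) for dx in range(size)]
        for y in range(0, h - size + 1, stride)
        for x in range(0, w - size + 1, stride)
    ]


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest(vec, centers):
    """Index of the visual word (cluster center) closest to vec."""
    return min(range(len(centers)), key=lambda k: sq_dist(vec, centers[k]))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd iterations; the resulting centers are the visual words."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centers)].append(v)
        for idx, group in enumerate(groups):
            if group:  # keep the old center if a cluster is empty
                centers[idx] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def bow_histogram(image, centers, size, stride):
    """Normalized histogram of visual-word occurrences in one image."""
    patches = extract_patches(image, size, stride)
    counts = [0] * len(centers)
    for patch in patches:
        counts[nearest(patch, centers)] += 1
    return [c / len(patches) for c in counts]


def histogram_distance(h1, h2):
    """L1 distance between two bag-of-words histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


# Toy image: dark left half, bright right half.
image = [[0, 0, 9, 9] for _ in range(4)]
patches = extract_patches(image, size=2, stride=2)
dictionary = kmeans(patches, k=2)
histogram = bow_histogram(image, dictionary, size=2, stride=2)
```

With a real archive, the dictionary would be learned once from patches pooled over many images, and each image's histogram would then be fed to a classifier such as k-nearest neighbor or an SVM.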

Using patches in a bag-of-visual-words 
(BoW) representation was shown to be 
successful in general scene and object recognition 
tasks (Fei-Fei and Perona 2003; Varma and 
Zisserman 2003; Sivic and Zisserman 2003; 
Nowak et al. 2006; Jiang et al. 2007). A few 
research studies were conducted in the medical 
domain as well. For example, Andre et al. 
(2009) used BoW as the representation of 
endomicroscopic images and achieved high 
accuracy in classifying the images as 
neoplastic (pathological) or benign. Bosch 
et al. (2006) presented an application to texture 
representation for mammography tissue 
classification and segmentation. The 
use of BoW techniques for large-scale 
radiograph archive categorization can be found in 
the ImageCLEF competition, in a task to 
classify over 12,000 X-ray images into 196 
different (organ-level) categories (Tommasi et al. 
2010). This competition provides an important 
benchmarking tool to assess different feature 
sets as well as classification schemes on large 
archives of radiographs. For several years, 
approaches based on local patch representation 
achieved the highest scores for categorization 
accuracy (Deselaers et al. 2006; Caputo 
et al. 2008; Greenspan et al. 2011). 

[Figure caption, continued: image representation. A 
radiographic image is shown with a set of patches 
indicated for processing the image data. Subsequent 
image processing is performed on each patch, and on the 
entire set of patches, rather than on individual pixels 
in the image. A dictionary of visual words is learned 
from a large set of images, and their respective patches. 
Further analysis of the image content can then be 
pursued based on a histogram across the dictionary 
words] 


= Radiomics, Machine Learning and Deep Learning 

Radiomics describes a broad set of com- 
putational methods that extract quantita- 
tive features from radiology images (though 
similar approaches can be applied to other 
image types like histopathology or ophthal- 
mology images) (Kumar et al. 2012; Lambin 
et al. 2012; Grossmann et al. 2017). The term 
“radiomics” has been used to mean a variety 
of concepts, being wide enough in scope to span 
several fields, including clinical radiology (e.g., 
imaging interpretation), computer vision (e.g., 
quantitative feature extraction), and machine 
learning (e.g., classifier evaluation). In recent 
years, radiomics has commonly been used to 
refer to the quantitation of image features in 
large collections of images, akin to the use of 
“omics” for large-scale collection and analysis 
of other types of biomedical data. It includes 
all the quantification techniques described in 
this chapter. A more distinctive use of the term 
views the quantitation that it represents as 
one that focuses on the identification of 
quantitative imaging indicators that predict 
important clinical outcomes, e.g., prognosis and 
response or resistance to a specific cancer 
treatment (Zhou et al. 2018). 

One motivation of radiomics is informa- 
tion integration—to merge image features 
with other known quantitative descriptors, 
including patient information and genomic 
data to generate a unique patient signature (or 
electronic phenotype). Initially, pre-defined 
image features were extracted, including, for 
example, a large number of quantifications 
defined from texture features, SIFT features, 
etc. (Napel et al. 2010). Once machine learning 
tools, and specifically deep learning tools, 
emerged, the latter began to take over several 
of the stages within the radiomics processing 
cycle, specifically the generation of large sets 
of automatically extracted features for the 
quantification of the data, since deep learning 
models learn image features as part of the 
training process (Kontos et al. 2017; Giger 
2018). 

Radiomics relies on computational tech- 
niques in computer vision to extract many 
quantitative features from radiologic images. 
The extracted quantitative features are typi- 
cally within a defined ROI that could include 
the whole tumor or specific regions within it. 
Computational image descriptors quantify 
visual characteristics at different scales from 
ROIs. For example, the scale-invariant feature 
transform (SIFT) (Lowe 1999) is computed 
through key point detection using a 
difference-of-Gaussians function and local image 
gradient measurement with radius and scale 
selections. This permits a quantitative 
measurement of the tumor shape so that subtle 
variations during treatment can be observed 
and quantified. Local-level feature extraction 
provides an image descriptor used to compare 
a pixel being tested with its immediate pixel 
neighborhood. This allows identification of a 
small area within an otherwise homogeneous, 
larger tumor region. This can be achieved, for 
example, with local binary patterns (LBP). 
These are local image descriptors sensitive to 
small monotonic gray-level differences (Ojala 
et al. 2002). Texture descriptors are very 
common in ROI descriptions. In addition to 
LBP, they include gray-level co-occurrence 
matrices (Haralick et al. 1973), which examine 
the spatial relationships of pixels through a 
series of statistical measures, and histogram of 
oriented gradients (HOG) features (Dalal and 
Triggs 2005), which quantify image-gradient 
statistics in multiple directions that are not 
obvious to radiologists. 
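
As an illustration of a local descriptor that compares a pixel with its immediate neighborhood, the sketch below computes the basic 8-neighbor LBP code and its histogram over an ROI. The neighbor ordering and the function names are assumptions; published LBP variants (for example, rotation-invariant or multi-radius forms) differ in these details.

```python
# Clockwise neighbor offsets (dy, dx), starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, y, x):
    """8-bit code: bit i is set when neighbor i is >= the center pixel."""
    center = image[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        if image[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels (a texture feature)."""
    hist = [0] * 256
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            hist[lbp_code(image, y, x)] += 1
    return hist


ramp = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
# The same pattern after a monotonic gray-level change (10 * p + 7).
rescaled = [[10 * p + 7 for p in row] for row in ramp]
code = lbp_code(ramp, 1, 1)
```

Because only the sign of each neighbor-minus-center difference is kept, the code is unchanged when the gray levels are rescaled monotonically, as the `rescaled` example shows.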

Machine Learning is commonly used for 
discovering predictive radiomics features. 
In machine learning, the parameter space 
is searched for an imaging feature statisti- 
cally associated with clinical outcome (Zhou 
et al. 2018). Before one evaluates machine- 
learning models, a specification for the medi- 
cal diagnostic task is needed so that models 
can be appropriately trained. For example, 
supervised, unsupervised, and semisuper- 
vised learning models are fundamental 
learning strategies used in accordance with 
the different levels of available clinical out- 
come labels. 

In supervised learning, the goal is to learn 
from a set of training samples 
with known class labels and to predict classes 
or numeric values for unknown patterns from 
large and noisy datasets. Conversely, 
unsupervised learning finds the natural structure 
in data without having any prior labels. As a 
hybrid setting, semisupervised learning needs 
only a small portion of labeled training data. 
The unlabeled data samples, instead of being 
discarded, are also used in the learning 
process. 

Deep learning is a new frontier in machine 
learning that is quickly rising as a primary 
approach to many image analysis problems, 
and it has advanced large-scale medical image 
analysis (Greenspan et al. 2016; Shen et al. 
2017). Excitement for using deep learning to 
attack many image analysis problems in 
medical imaging has grown quickly because 
these methods became the top-performing 
methods in the ImageNet classification 
challenge (Krizhevsky et al. 2012), and 
much of medical imaging analysis is an image 
classification problem. The development of 
deep learning, as part of the machine-learning 
field, provided a new approach in which the 
input data is automatically quantified while 
being analyzed. 

Deep learning was named one of the 
10 breakthrough technologies of 2013 
(MIT Technology Review 2013). It is an 
improvement of artificial neural networks, 
architectures of computational units 
(“neurons”) that are arranged in several (up to 
thousands of) layers (hence “deep”); it was 
found that more layers permit higher 
levels of abstraction and improved predictions 
from data (LeCun et al. 2015). To date, 
it is emerging as the leading machine-learning 
tool in the general imaging and computer 
vision domains. 

Fig. 10.19 Above is a simple 5-layer fully convolutional 
network. The first convolutional block is separated 
out into three components. The first is the actual 
2-dimensional convolution. Different convolutional 
weights act on the input tensor to create the output 
tensor, each weight acting as a feature extractor. This is 
followed by a nonlinearity. Common nonlinearities include 
the sigmoid, hyperbolic tangent, and the rectified linear 
unit, or ReLU. Finally, should we need to conserve some 
memory, we can use pooling operations, such as a 
MaxPool layer, to decrease the size of our tensors. This 
fundamental combination of convolutions, nonlinearities, 
and pooling operations has been used in every 
convolutional neural network since its rise to prominence 
in 2012. (Figure courtesy of Darvin Yi) 
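
The three building blocks named in the caption of Fig. 10.19 (convolution, nonlinearity, pooling) can be sketched without any deep learning library. The kernel below is fixed by hand purely for illustration; in a CNN the kernel weights would be learned from training data. As in most CNN software, the "convolution" here is computed without flipping the kernel. All names are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [sum(image[y + i][x + j] * kernel[i][j]
             for i in range(kh) for j in range(kw))
         for x in range(out_w)]
        for y in range(out_h)
    ]


def relu(feature_map):
    """Rectified linear unit: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    h, w = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[y + i][x + j]
             for i in range(size) for j in range(size))
         for x in range(0, w - size + 1, size)]
        for y in range(0, h - size + 1, size)
    ]


# 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright transitions

feature_map = conv2d_valid(image, edge_kernel)  # 4x4 responses
pooled = max_pool(relu(feature_map))            # 2x2 summary
```

The pooled output keeps the strong edge response while halving each spatial dimension, which is the data reduction role that pooling layers play.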


Deep learning methods have achieved 
record-breaking performances for numerous 
computer vision applications when the num- 
ber of available training samples is sufficiently 
large (Deng et al. 2009). 

Among the network architectures and 
models, convolutional neural networks 
(CNNs) have proven to be powerful tools for 
a broad range of computer vision tasks. The 
typical CNN architecture for image process- 
ing consists of a series of layers of convolu- 
tion filters, followed by or interspersed with 
a series of data reduction or pooling layers 
(Fig. 10.19). The convolution filters are 
applied to small patches of the input image. 
Like the low-level vision processing in the 
human brain, the convolution filters detect 
increasingly more relevant image features, for 
example lines or circles that may represent 
straight edges (such as for organ detection) or 
circles (such as for round objects like colonic 
polyps), and then higher order features like 
local and global shape and texture. The out- 
put of the CNN is typically one or more 
probabilities or class labels. The convolution 
filters are learned from training data. This is 
desirable because it reduces the need for 
time-consuming hand-crafting of features 
that would otherwise be required to pre-process 
the images with application-specific filters 
or by calculating computable features. 

Deep CNNs automatically learn mid-level 
and high-level abstractions obtained from raw 
data (e.g., images). Recent results indicate that 
the generic descriptors extracted from CNNs 
are extremely effective in object recognition 
and localization in natural images. In the 
medical imaging domain, in many detection, 
classification and segmentation tasks, deep 
learning has proved to be the state-of-the-art 
foundation, leading to improved accuracy. It 
has also opened new frontiers in data analysis 
with rates of progress not before experienced. 
For an overview on deep learning for 
medical image quantification and analysis see 
(Greenspan et al. 2016; Zhou et al. 2017a). 


10.4.6 Image Segmentation 


Segmentation of images involves automatically 
circumscribing regions within an image 
to generate ROIs in the image. The ROIs usually 
correspond to anatomically meaningful 
structures, such as organs or parts of organs, 
or they may be lesions or other types of regions 
in the image pertinent to the application. The 
structures may be delineated by their borders, 
in which case edge-detection techniques (such 
as edge-following algorithms) are used, or 
by their composition in the image, in which 
case region-detection techniques (such as 
texture analysis) are used (Haralick and Shapiro 
1992). Neither of these techniques has been 
completely successful as a fully automated 
image segmentation method; regions often 
have discontinuous borders or nondistinctive 
internal composition. Furthermore, contiguous 
regions often overlap. These and other 
complications make segmentation the most 
difficult subtask of the medical image processing 
problem. Because segmentation is difficult 
for a computer, it is usually performed 
either by hand or in a semi-automated manner 
with assistance by a human through 
operator-interactive approaches (Fig. 10.20). 
In both cases, segmentation is time intensive, 
and it therefore remains a major bottleneck 
that prevents more widespread application of 
image processing techniques. 

Fig. 10.20 Image segmentation. This figure illustrates 
the process of segmenting and labeling the chambers 
of the heart. On the left, a cross sectional atlas 
image of the heart has been segmented by hand and 
each chamber was labeled (RAA right atrial appendage, 
RA right atrium, LA left atrium, RV right ventricle, LV 
left ventricle). The boundary of each circumscribed 
anatomic region can be converted into a digital mask (right) 
which can be used in different applications where 
labeling anatomic structures in the image is needed 

A great deal of progress has been made 
in automated segmentation in the brain, par- 
tially because the anatomic structures tend 
to be reproducibly positioned across subjects 
and the contrast delineation among structures 
is often good. In addition, MR images 
of the brain tend to be of high quality. Several 
software packages are currently available 
for automatic segmentation, particularly for 
normal macroscopic brain anatomy in cortical 
and sub-cortical regions (Collins et al. 
1995; Friston et al. 1995; Subramaniam et al. 
1997; Dale et al. 1999; MacDonald et al. 
2000; Brain Innovation B.V. 2001; FMRIDB 
Image Analysis Group 2001; Van Essen et al. 
2001; Hinshaw et al. 2002). The Human 
Brain Project’s Internet Brain Segmentation 
Repository (Kennedy 2001) has been develop- 
ing a repository of segmented brain images to 
use in comparing these different methods. 

Popular segmentation techniques can be 
categorized as (1) region-based vs. edge-based 
methods, (2) knowledge-based vs. data-driven 
methods, and (3) combined methods. 


= Region-Based Vs. Edge-Based 

In region-based segmentation, voxels are 
grouped into contiguous regions based on 
characteristics such as intensity ranges, spa- 
tial statistics and similarity to neighboring 
voxels (Shapiro and Stockman 2001; Li et al. 
2011b). In brain MR images, a common 
class separation is into: gray matter, white 
matter, cerebrospinal fluid and background. 
One then uses these classifications as a basis 
for further segmentation (Choi et al. 1991; 
Zijdenbos et al. 1996). Another region-based 
approach is called region-growing, in which 
regions are grown from seed voxels manu- 
ally or automatically placed within candi- 
date regions (Davatzikos and Bryan 1996; 
Modayur et al. 1997). The regions found by 
any of these approaches are often further pro- 
cessed by mathematical morphology opera- 
tors (Haralick 1988) to remove unwanted 
connections and holes (Sandor and Leahy 
1997). Other well-known techniques include 
active contour and level set models (Li et al. 
2011b; Hoogi et al. 2017), graph-based models 
(Shattuck and Leahy 2001), and clustering-based 
methods (Li et al. 2011a). 
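
Region growing from a seed voxel can be sketched as a breadth-first flood fill. The acceptance rule used here (intensity within a fixed tolerance of the seed value, 4-connected neighbors, 2-D image) is one simple choice assumed for illustration; the cited methods use more elaborate criteria.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a region of pixels whose intensity is close to the seed's."""
    h, w = len(image), len(image[0])
    seed_y, seed_x = seed
    seed_value = image[seed_y][seed_x]
    mask = [[False] * w for _ in range(h)]
    mask[seed_y][seed_x] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w and not mask[ny][nx]
                    and abs(image[ny][nx] - seed_value) <= tolerance):
                mask[ny][nx] = True
                queue.append((ny, nx))
    return mask


image = [[10, 11, 50],
         [12, 10, 52],
         [51, 49, 50]]
region = region_grow(image, seed=(0, 0), tolerance=3)
region_size = sum(v for row in region for v in row)
```

The resulting mask could then be cleaned with morphological operators, as the text notes, to remove unwanted connections and holes.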

In contrast to region-based techniques, 
edge-based segmentation relies on detecting 
the gradients in the image. The detected 
gradients are taken to mark the organ boundary. 
However, edge-based techniques are very 
sensitive to image noise and to inconsistent or 
broken boundaries. 
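
A gradient-based edge map can be sketched with the Sobel operator followed by a threshold. The Sobel kernels are standard; the threshold value and the function names are assumptions chosen for this example.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def gradient_magnitude(image):
    """Sobel gradient magnitude; border pixels are left at zero."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[i][j] * image[y - 1 + i][x - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * image[y - 1 + i][x - 1 + j]
                     for i in range(3) for j in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out


def edge_map(image, threshold):
    """Mark pixels whose gradient magnitude reaches the threshold."""
    return [[g >= threshold for g in row]
            for row in gradient_magnitude(image)]


# Vertical boundary between a dark and a bright region.
image = [[0, 0, 0, 100, 100] for _ in range(5)]
edges = edge_map(image, threshold=200)
```

Adding noise to `image` would produce spurious edge pixels at this threshold, which illustrates the sensitivity mentioned above.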

Other techniques can be considered as 
hybrid frameworks, in which both region 
statistics and gradient information are included 
(Chakraborty et al. 1996; Shao et al. 2008). 
All the above techniques are essentially 
low-level techniques that only look at local or 
global regions in the image data. 


= Model-Based and Data-Driven 
Segmentation 

Segmentation methods are mostly divided 
into model-based and data-driven approaches. 
The former considers prior knowledge about 
the organ/medical images to be analyzed, 
while the latter is based only on the specific 
analyzed image data, with no prior examples 
or knowledge. Deformable models that 
are part of the model-based curve evolution 
approach are called “Snakes” (Kass et al. 
1987; Davatzikos and Bryan 1996; Dale et al. 
1999; MacDonald et al. 2000; Van Essen et al. 
2001). These models can include knowledge of 
the expected anatomy of the organ. For example, 
the cost function employed in the method 
developed by MacDonald et al. (2000) 
includes a term for the expected thickness 
of the brain cortex. Thus, these methods 
can become somewhat knowledge-based, 
where knowledge of anatomy is encoded in 
the cost function. Level set is another form 
of curve evolution technique but, in contrast 
to Snakes, level set is an implicit approach (Li 
et al. 2011b; Hoo-Chang et al. 2016; Hoogi 
et al. 2017). In both Snakes and level set, the 
contour is deformed according to a cost function 
that should be minimized and includes 
both intrinsic terms regarding the contour 
itself (e.g., contour smoothness) and extrinsic 
terms that depend on the image data. 
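
The structure of such a cost function can be illustrated with a discrete closed contour. In this sketch the intrinsic part penalizes stretching and bending of the contour, and the extrinsic part rewards contour points that lie on strong edges. The weights `alpha`, `beta`, and `gamma`, the `edge_strength` array, and the function name are assumptions in the spirit of the Snakes formulation, not the exact functional of any cited method.

```python
def snake_energy(contour, edge_strength, alpha=1.0, beta=1.0, gamma=1.0):
    """Energy of a closed contour given as a list of (x, y) pixel points."""
    n = len(contour)
    internal, external = 0.0, 0.0
    for i in range(n):
        x_prev, y_prev = contour[i - 1]
        x, y = contour[i]
        x_next, y_next = contour[(i + 1) % n]
        # Intrinsic terms: first difference (stretching) and
        # second difference (bending) of the contour.
        stretch = (x_next - x) ** 2 + (y_next - y) ** 2
        bend = (x_next - 2 * x + x_prev) ** 2 + (y_next - 2 * y + y_prev) ** 2
        internal += alpha * stretch + beta * bend
        # Extrinsic term: strong edges under the contour lower the energy.
        external -= gamma * edge_strength[y][x]
    return internal + external


square = [(1, 1), (3, 1), (3, 3), (1, 3)]
no_edges = [[0] * 5 for _ in range(5)]
strong_edges = [[1] * 5 for _ in range(5)]
energy_flat = snake_energy(square, no_edges)
energy_on_edges = snake_energy(square, strong_edges)
```

An actual Snake would iteratively move the contour points to reduce this energy; only the evaluation of the cost is shown here.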


= Clustering-Based Segmentation 

The core operation in a segmentation task is 
the division of the image into a finite set of 
clusters/regions with similar statistics, which 
are smooth and homogeneous in their content 
and their representation. When posed in this 
way, segmentation can be regarded as a prob- 
lem of finding clusters in a selected feature 
space. The segmentation task can be seen as 
a combination of two main processes: (a) The 
generation of an image representation over 
a selected feature space. This can be termed 
the modeling stage. The model components 
are often viewed as groups, or clusters in the 
high-dimensional space. (b) The assignment 
of pixels to one of the model components or 
segments. In order to be directly relevant for 
a segmentation task, the clusters in the model 
should represent homogeneous regions of the 
image. In general, the better the image mod- 
eling, the better the segmentation produced. 
Since the number of clusters in the feature 
space is often unknown, segmentation can be 
regarded as an unsupervised clustering task in 
the high-dimensional feature space. 

There are many works on clustering algo- 
rithms. We can categorize them into three 
broad classes: (a) deterministic algorithms, 
(b) probabilistic model-based algorithms, 
and (c) graph-theoretic algorithms. The sim- 
plest of these are the deterministic algorithms 
such as k-means (Bishop 1995), mean-shift 
(Comaniciu and Meer 2002), and agglomera- 
tive methods (Duda et al. 2001). For certain 
data distributions, i.e., distributions of pixel 
feature vectors in a feature space, such algo- 
rithms perform well. For example, k-means 
provides good results when the data are convex 
or blob-like and the agglomerative approach 
succeeds when clusters are dense and there is 
no noise. These algorithms, however, have a 
difficult time handling more complex struc- 
tures in the data. The probabilistic algorithms, 
on the other hand, model the distribution in 
the data using parametric models (McLachlan 
and Peel 2000). Such models include auto- 
regressive (AR) models, Gaussian mixture 
models (GMM), Markov random fields 
(MRF), conditional random fields, and oth- 
ers. Efficient ways of estimating these models 
are available using maximum likelihood algo- 
rithms such as the Expectation-maximization 
(EM) algorithm (Dempster et al. 1977). While 
probabilistic models offer a principled way to 
explain the structures present in the data, they 
could be restrictive when more complex struc- 
tures are present. 

Another type of clustering algorithm is 
non-parametric in that this class imposes no 
prior shape or structure on the data. Examples 
of these are graph-theoretic algorithms based 
on spectral factorization (e.g., Ng et al. 2001; 
Shi and Malik 2000). Here, the image data 
are modeled as a graph. The entire image 
data along with a global cost function are 
used to partition the graph, with each partition 
now becoming an image segment. In this 
approach, global considerations determine 
localized decisions. Moreover, such optimization 
procedures are often compute-intensive. 

Consider an example application in brain 
image segmentation using parametric modeling 
and clustering. The tissue and lesion 
segmentation problem in brain MRI is a 
well-studied topic of research. In such images, 
there is interest in three main tissue types: 
white matter (WM), gray matter (GM) and 
cerebro-spinal fluid (CSF). The volumetric 
analysis of such tissue types in various parts 
of the brain is useful in assessing the progress 
or remission of various diseases, such 
as Alzheimer’s disease, epilepsy, sclerosis 
and schizophrenia. A segmentation example 
is shown in Fig. 10.21. In this example, 
images from 3 MRI imaging sequences are 
input to the system, and the output is a seg- 
mentation map, with different colors repre- 
senting three different normal brain tissues, as 
well as a separate color to indicate regions of 
abnormality (multiple-sclerosis lesions). 

Various approaches to the segmenta- 
tion task are reviewed in (Pham et al. 2000). 
Among the approaches used are pixel-level 
intensity based clustering, such as K-means 
and Mixture of Gaussians modeling (e.g., 
(Kapur et al. 1996)). In this approach, the 
intensity feature is modeled by a mixture of 
Gaussians, where each Gaussian is assigned a 
semantic meaning, such as one of the tissue 
regions (or lesion). Using pattern recognition 
methods and learning, the Gaussians can be 
automatically extracted from the data, and 
once defined, the image can be segmented into 
the respective regions. 
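
Intensity-based mixture-of-Gaussians segmentation can be sketched with a one-dimensional EM fit. The synthetic intensities, the initial means, and the function names below are assumptions for illustration; real tissue segmentation would use actual voxel intensities and, as discussed next, usually needs spatial constraints as well.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(values, init_means, iterations=50, min_var=1e-3):
    """Fit a 1-D Gaussian mixture to pixel intensities with EM."""
    k = len(init_means)
    means = list(init_means)
    spread = (max(values) - min(values)) or 1.0
    variances = [(spread / k) ** 2] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each Gaussian for each intensity.
        resp = []
        for x in values:
            likes = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(likes) or 1e-300
            resp.append([like / total for like in likes])
        # M-step: re-estimate weight, mean, and variance of each Gaussian.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, values)) / nj
            variances[j] = max(var, min_var)
    return weights, means, variances


def classify(x, weights, means, variances):
    """Assign an intensity to the Gaussian with the highest posterior."""
    return max(range(len(means)),
               key=lambda j: weights[j] * gaussian_pdf(x, means[j], variances[j]))


# Synthetic intensities drawn around three hypothetical tissue levels.
values = [10, 11, 12, 9, 10, 50, 51, 49, 50, 52, 90, 91, 89, 90, 92]
weights, means, variances = fit_gmm_1d(values, init_means=[0.0, 45.0, 100.0])
labels = [classify(v, weights, means, variances) for v in (11, 50, 92)]
```

Each fitted Gaussian would then be given a semantic meaning (a tissue type or lesion), and every pixel labeled with its most probable component.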

Algorithms for tissue segmentation using 
pixel-level intensity-based classification often 
exhibit high sensitivity to various noise arti- 
facts, such as intra-tissue noise, inter-tissue 
intensity contrast reduction, partial-volume 
effects and others. Due to the artifacts pres- 
ent, classical voxel-wise intensity-based clas- 
sification methods, including the K-means 
modeling and Mixture of Gaussians model- 
ing, often give unrealistic results, with tissue 
class regions appearing granular, fragmented, 
or violating anatomical constraints. Specific 
works can be found addressing various aspects 
Fig. 10.21 Brain MRI segmentation example. Brain
slice from multiple acquisition sequences (with 9%
noise) was taken from BrainWEB (http://www.bic.mni.mcgill.ca/brainweb/).
From left to right: T1-, T2-,
and proton density (PD)-weighted image. Segmentation


of these concerns (e.g., partial-volume effect 
quantification (Dugas-Phocion et al. 2004)). 
One way to address the smoothness issue is 
to add spatial constraints. This is often done 
during a pre-processing phase by using a sta- 
tistical atlas, or as a post-processing step via 
Markov Random Field models. A statistical 
atlas provides the prior probability for each 
pixel to originate from a particular tissue class 
(e.g., (Van Leemput et al. 1999; Marroquin 
et al. 2002; Prastawa et al. 2004)). 

Algorithms exist that use the maximum- 
a-posteriori (MAP) criterion to augment 
intensity information with the atlas. However, 
registration between a given image and the 
atlas is required, which can be computation- 
ally prohibitive (Rohlfing and Maurer Jr. 
2003). Further, the quality of the registration 
result is strongly dependent on the physiologi- 
cal variability of the subject and may converge 
to an erroneous result in the case of a diseased 
or severely damaged brain. Finally, the regis- 
tration process is applicable only to complete 
volumes. A single slice cannot be registered
to the atlas and therefore cannot be segmented
using these state-of-the-art algorithms.
Greenspan et al. (2006) present a robust,
unsupervised, parametric method for segmenting
3-D (or 2-D) MR brain images with a high
degree of noise and low contrast.
A Constrained Gaussian Mixture Model 
(CGMM) framework is proposed, in which 
each tissue is modeled with multiple four- 
dimensional Gaussians, where each Gaussian 
represents a localized region (3 spatial fea- 
of the images is shown on the right: Blue: CSF; Green:
Gray matter (GM); Yellow: white matter (WM); Red: 
Multiple-sclerosis lesions (MSL). (Friefeld et al. 2009). 
(Reused with permission © Brainweb) 


tures) and the intensity characteristic per 
region (T1 intensity feature). Incorporating 
the spatial information within the feature 
space is novel, as is using a large number of 
Gaussians per brain tissue to capture the 
complicated spatial layout of the individual 
tissues. Two key features of the proposed
framework are: 1) global intensity modeling
is combined with localized spatial modeling,
as an alternative scheme to MRF modeling, and
2) segmentation is entirely unsupervised, thus
eliminating the need for atlas registration or
any intensity model standardization.

Segmentation can also be improved using 
a post-processing phase in which smoothness 
and immunity to noise can be achieved by 
modeling the interactions among neighbor- 
ing voxels. Such interactions can be modeled 
using a Markov Random Field (MRF), and 
thus this technique has been used to improve 
segmentation (Held et al. 1997; Van Leemput 
et al. 1999; Zhang et al. 2001). 
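
As a rough illustration of MRF-style post-processing, the sketch below applies iterated conditional modes (ICM) with a Potts smoothness prior to a label map. It is a deliberately simplified stand-in for the cited methods: the data term here only penalizes departing from the initial hard label, whereas a full model would use the intensity likelihood of each class. The function name `icm_smooth` and the weight `beta` are assumptions.

```python
def icm_smooth(labels, n_labels, beta=1.0, n_sweeps=5):
    """Smooth a 2-D label map with ICM under a Potts neighborhood prior."""
    h, w = len(labels), len(labels[0])
    observed = [row[:] for row in labels]   # initial labels act as the data term
    current = [row[:] for row in labels]
    for _ in range(n_sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                neighbors = [current[ny][nx]
                             for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                             if 0 <= ny < h and 0 <= nx < w]

                def energy(label):
                    # Cost of leaving the observed label plus cost of each
                    # 4-neighbor that disagrees with the candidate label.
                    data = 0.0 if label == observed[y][x] else 1.0
                    return data + beta * sum(1 for n in neighbors if n != label)

                best = min(range(n_labels), key=energy)
                if best != current[y][x]:
                    current[y][x] = best
                    changed = True
        if not changed:
            break
    return current

# An isolated misclassified voxel inside a uniform region is removed...
noisy = [[0] * 5 for _ in range(5)]
noisy[2][2] = 1
smoothed = icm_smooth(noisy, n_labels=2)

# ...while a genuine boundary between two large regions is preserved.
two_regions = [[0, 0, 1, 1] for _ in range(4)]
kept = icm_smooth(two_regions, n_labels=2)
```

Larger values of `beta` favor smoother maps at the risk of erasing small true structures, so the weight would need to be tuned for a given application.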


= Segmentation Using Deep Learning 

As noted earlier in this chapter, one of the
fastest-emerging research fields over the last
few years is deep learning. Deep learning can
outperform classical machine learning algorithms
because of its ability to learn latent variables
within the feature space, features that
the user can barely detect. On the other hand,
deep learning requires a large labeled training
set, which is not always available, and this
makes developing robust classification models
challenging. Several methods help overcome
this challenge, such as data augmentation
(Greenspan et al. 2016), transfer
learning (Hoo-Chang et al. 2016), and metric
learning (Yang et al. 2010). In
transfer learning, one takes a network that was
pre-trained on another set of images and uses
its weights to initialize the network that is
currently being trained. The initial weights are
then fine-tuned on the relevant dataset, which
should give better results than starting from
random weights.
In metric learning, one learns the metric that
best represents and separates the data,
instead of learning the classes themselves. In
that way, the network is taught how to
learn. In addition, metric learning is a kind of
“image ontology,” in that it sketches
the distances between different instances.
Therefore, a new test case that does not belong
to any of the training classes need not be
misclassified into one of those classes.
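
Of these strategies, data augmentation is the simplest to illustrate. The sketch below generates the eight flip and rotation variants of a small 2-D patch; the helper names are hypothetical, and whether a given transformation is appropriate is an assumption that must be checked per task (for example, a left-right flip may be invalid when laterality matters).

```python
def rotate90(img):
    """Rotate a 2-D list-of-lists image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def flip_horizontal(img):
    """Mirror a 2-D image left to right."""
    return [row[::-1] for row in img]

def augment(img):
    """Return the 8 flip/rotation variants of an image."""
    variants = []
    current = [list(row) for row in img]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants

patch = [[1, 2],
         [3, 4]]
variants = augment(patch)
```

Each variant keeps the original label, so one annotated patch yields several training examples.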

The core idea of deep learning is to convolve
the input image with different filters at
different scales (i.e., via pooling), so that
the network is able to detect both low-level and
high-level features. Many deep learning
architectures are considered patch-wise techniques.
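
The two building blocks named above, filtering and pooling, can be sketched without any deep learning library. In the minimal example below the filter is fixed by hand (a vertical-edge detector), which is an assumption made for illustration; in a trained network the filter weights are learned from data.

```python
def conv2d(img, kernel):
    """Valid-mode 2-D cross-correlation, the 'convolution' used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img), len(img[0])
    return [[sum(img[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(w - kw + 1)]
            for y in range(h - kh + 1)]

def max_pool2x2(fmap):
    """Downsample a feature map by taking the maximum of each 2x2 block."""
    h, w = len(fmap) // 2 * 2, len(fmap[0]) // 2 * 2
    return [[max(fmap[y][x], fmap[y][x + 1], fmap[y + 1][x], fmap[y + 1][x + 1])
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

# A 6x6 image with a vertical edge between columns 2 and 3.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-chosen filter that responds to left-to-right intensity increases.
edge_filter = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]
feature_map = conv2d(image, edge_filter)   # strong response along the edge
pooled = max_pool2x2(feature_map)          # coarser map, same edge evidence
```

Stacking such filter and pooling stages lets later layers respond to progressively larger image regions.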

Convolutional neural networks such as
U-Net (Ronneberger et al. 2015; Trebeschi
et al. 2017) and V-Net (Trebeschi et al. 2017)
were designed specifically to deal with the
typical challenges of the medical domain, such as
a small amount of labeled data. Autoencoders,
variational autoencoders, and stacked
autoencoders can be used for image denoising
and for unsupervised feature extraction
(Vincent et al. 2010; Bengio et al. 2013). Other
methods were designed to handle various
classification tasks, such as lesion detection
(Wang et al. 2016), segmentation (Kayalibay
et al. 2017; Trebeschi et al. 2017), and disease
classification (Esteva et al. 2017).


10.4.7 Image Registration 


The growing availability of 3-D and higher 
dimensionality structural and functional 
images leads to exciting opportunities for 
realistically observing the structure and func- 


tion of the body. These opportunities are par- 
ticularly widely exploited in brain imaging. 
Therefore, this section concentrates on 3-D 
brain imaging, with the recognition that many 
of the methods developed for the brain have 
been or will be applied to other areas as well. 

The basic 2-D image processing opera- 
tions of global processing, segmentation, fea- 
ture detection, and classification generalize 
to higher dimensions, and are usually part of 
any image processing application. However, 
3-D and higher dimensionality images give 
rise to additional informatics issues, which 
include image registration (which also occurs 
to a lesser extent in 2-D), spatial representa- 
tion of anatomy, symbolic representation of 
anatomy, integration of spatial and symbolic 
anatomic representations in atlases, anatomi- 
cal variation, and characterization of anatomy. 
All but the first of these issues deal primar- 
ily with anatomical structure, and therefore 
could be considered part of the field of struc- 
tural informatics. They could also be thought 
of as being part of imaging informatics and 
neuroinformatics. 

As noted previously, 3-D image volume 
data are represented in the computer by a 
3-D volume array, in which each voxel repre- 
sents the image intensity in a small volume of 
space. In order to depict anatomy accurately, 
the voxels must be accurately registered (or 
located) in the 3-D volume (voxel registra- 
tion), and separately acquired image volumes 
from the same subject must be registered with 
each other (volume registration). 


= Voxel Registration 
Imaging modalities such as CT, MRI, and con- 
focal microscopy (> Sects. 10.2.3 and 10.2.5) 
are inherently 3-D: the scanner generally out- 
puts a series of image slices that can easily be 
reformatted as a 3-D volume array, often fol- 
lowing alignment algorithms that compensate 
for any patient motion during the scanning 
procedure. For this reason, almost all CT and 
MR manufacturers’ consoles contain some 
form of three-dimensional reconstruction and 
visualization capabilities. 

As noted in » Sect. 10.4.4, two-dimensional
images can be converted to 3-D
volumes if they are closely spaced parallel
sections through a tissue or whole specimen
and contain isotropic voxels. In this case, the
problem is how to align the sections with
each other. For whole sections (either frozen
or fixed), the standard method is to embed a
set of thin rods or strings in the tissue prior
to sectioning, to manually indicate the location
of these fiducials on each section, and then to
linearly transform each slice so that the
corresponding fiducials line up in 3-D (Prothero
and Prothero 1986). An example of this tech- 
nique is the Visible Human, in which a series 
of transverse slices were acquired, then recon- 
structed to give a full 3-D volume (Spitzer and 
Whitlock 1998) (> Chap. 22). 
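
The fiducial alignment step can be illustrated with a closed-form least-squares fit. The sketch below assumes the simplest useful linear transform, a 2-D rotation plus translation, and assumes the fiducial coordinates have already been marked on both sections; the function names and the sample coordinates are hypothetical.

```python
import math

def fit_rigid_2d(src, dst):
    """Least-squares rotation and translation mapping src points onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n
    dy = sum(p[1] for p in dst) / n
    dot = cross = 0.0
    for (x, y), (u, v) in zip(src, dst):
        x, y, u, v = x - sx, y - sy, u - dx, v - dy   # center both point sets
        dot += x * u + y * v
        cross += x * v - y * u
    theta = math.atan2(cross, dot)                    # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    tx = dx - (c * sx - s * sy)                       # translation after rotation
    ty = dy - (s * sx + c * sy)
    return theta, tx, ty

def apply_rigid_2d(points, theta, tx, ty):
    """Apply the rotation and translation to a list of (x, y) points."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

# Fiducial positions on one section, and the same fiducials on the next
# section, which was rotated and shifted when it was mounted.
fiducials_a = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
fiducials_b = apply_rigid_2d(fiducials_a, 0.3, 5.0, -2.0)
theta, tx, ty = fit_rigid_2d(fiducials_a, fiducials_b)
aligned = apply_rigid_2d(fiducials_a, theta, tx, ty)
```

The recovered transform would then be applied to every pixel of the section, not only to the fiducials.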

It is difficult to embed fiducial markers 
at the microscopic level, so intrinsic tissue 
landmarks are often used as fiducials, but the 
basic principle is similar. However, in this case 
tissue distortion may be a problem, so non-linear
transformations may be required. For
example, Fiala and Harris (2001) developed
an interface that allows the
user to indicate, on electron microscopy sec- 
tions, corresponding centers of small organ- 
elles such as mitochondria. A non-linear 
transformation (warp) is then computed to 
bring the landmarks into registration. 

An approach being pursued (among 
other approaches) by the National Center for 
Microscopy and Imaging Research (http://ncmir.ucsd.edu/)
combines reconstruction
from thick serial sections with electron tomog- 
raphy (Soto et al. 1994). In this case the tomo- 
graphic technique is applied to each thick 
section to generate a 3-D digital slab, after 
which the slabs are aligned with each other 
to generate a 3-D volume. The advantages of 
this approach over the standard serial section 
method are that the sections do not need to be 
as thin, and fewer of them need be acquired. 

An alternative approach to 3-D voxel reg- 
istration from 2-D images is stereo-matching, 
a technique developed in computer vision that 
acquires multiple 2-D images from known 
angles, finds corresponding points on the 
images, and uses the correspondences and 
known camera angles to compute 3-D coor- 
dinates of pixels in the matched images. The 
technique is being applied to the reconstruc- 
tion of synapses from electron micrographs 
by a Human Brain Project collaboration
between computer scientists and biologists at 
the University of Maryland (Agrawal et al. 
2000). 
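
A worked example may help make the geometry of stereo-matching concrete. The sketch below is not the method of the cited project; it assumes an idealized case in which the specimen is imaged by parallel projection at two symmetric tilt angles (+a and -a about the y-axis), so that a point at (x, z) appears at x·cos(a) + z·sin(a) and x·cos(a) - z·sin(a) in the two images. Function names are hypothetical.

```python
import math

def project_tilt(x, z, tilt_deg):
    """Image x-position of a point (x, z) under parallel projection at a tilt."""
    a = math.radians(tilt_deg)
    return x * math.cos(a) + z * math.sin(a)

def reconstruct_from_tilt_pair(x_plus, x_minus, tilt_deg):
    """Recover (x, z) from matched positions in the +tilt and -tilt images."""
    a = math.radians(tilt_deg)
    x = (x_plus + x_minus) / (2 * math.cos(a))   # the sum cancels the depth term
    z = (x_plus - x_minus) / (2 * math.sin(a))   # the difference isolates depth
    return x, z

# A point at x = 3, depth z = 2, imaged at +10 and -10 degrees of tilt.
x_plus = project_tilt(3.0, 2.0, 10.0)
x_minus = project_tilt(3.0, 2.0, -10.0)
x_rec, z_rec = reconstruct_from_tilt_pair(x_plus, x_minus, 10.0)
```

The hard part in practice is finding the corresponding points, which this sketch takes as given.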


= Volume Registration 

A related problem to that of aligning individ- 
ual sections is the problem of aligning sepa- 
rate image volumes from the same subject, 
that is, intra-subject alignment. Because differ- 
ent image modalities provide complementary 
information, it is common to acquire more 
than one kind of image volume on the same 
individual. This approach has been particu- 
larly useful for brain imaging because each 
modality provides different information. For 
example, PET (» Sect. 10.2.3) provides use- 
ful information about function, but does not 
provide good localization with respect to the 
anatomy. Similarly, MRV and MRA (> Sect. 
10.2.3) show blood flow but do not pro- 
vide the detailed anatomy visible with stan- 
dard MRI. By combining images from these 
modalities with MRI, it is possible to show 
functional images in terms of the underlying 
anatomy, thereby providing a common neuro- 
anatomic framework. 

The primary problem to solve in multimo- 
dality image fusion is volume registration— 
that is, the alignment of separately acquired 
image volumes. In the simplest case, separate 
image volumes are acquired during a single 
sitting. The patient’s head may be immo- 
bilized, and the information in the image 
headers may be used to rotate and resample 
the image volumes until all the voxels corre- 
spond. However, if the patient moves, or if 
examinations are acquired at different times, 
other registration methods are needed. When 
intensity values are similar across modalities, 
registration can be performed automatically 
by intensity-based optimization methods 
(Woods et al. 1992; Collins et al. 1994). When 
intensity values are not similar (as is the case 
with MRA, MRV and MRI), images can be 
aligned to templates of the same modalities 
that are already aligned (Woods et al. 1993; 
Ashburner and Friston 1997). Alternatively, 
landmark-based methods can be used. The 
landmark-based methods are similar to those 
used to align serial sections (see earlier discussion
of voxel registration in this section),
but in this case the landmarks are 3-D points. 
The Montreal Register Program (MacDonald 
1993) is an example of such a program. 
Techniques and applications of volume regis- 
tration in other domains have been described 
(Pelizzari 1998; Ferrante and Paragios 2017). 
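
To illustrate the intensity-based case, the sketch below performs an exhaustive search over small integer translations and keeps the shift that minimizes the mean squared intensity difference in the overlapping region. It is a toy version under strong assumptions: 2-D images, translation only, and similar intensities in both images. Practical methods also optimize rotation and scaling and use more efficient optimizers. All names are hypothetical.

```python
def best_shift(fixed, moving, max_shift=3):
    """Return (dy, dx) such that moving[y + dy][x + dx] best matches fixed[y][x]."""
    h, w = len(fixed), len(fixed[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0.0, 0
            for y in range(h):
                for x in range(w):
                    my, mx = y + dy, x + dx
                    if 0 <= my < h and 0 <= mx < w:   # compare the overlap only
                        total += (fixed[y][x] - moving[my][mx]) ** 2
                        count += 1
            cost = total / count
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2]

def make_image(top, left, size=8):
    """Blank image containing a small textured patch at (top, left)."""
    patch = [[5, 9], [7, 3]]
    img = [[0] * size for _ in range(size)]
    for i in range(2):
        for j in range(2):
            img[top + i][left + j] = patch[i][j]
    return img

fixed = make_image(2, 2)
moving = make_image(3, 4)   # same content displaced by 1 row and 2 columns
shift = best_shift(fixed, moving)
```

Resampling the moving image by the recovered shift brings the two images into voxel-to-voxel correspondence.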


10.5 Image Interpretation 
and Computer Reasoning 


The preceding sections of this chapter as well 
as > Chap. 22 describe informatics aspects of 
image generation, storage, manipulation, and 
display. Rendering an interpretation
is a crucial final stage in the chain of activi- 
ties related to imaging. Image interpretation 
is this final stage in which the physician has 
direct impact on the clinical care process, by 
rendering a professional opinion as to whether 
abnormalities are present in the image and the 
likely significance of those abnormalities. The 
process of image interpretation requires “rea- 
soning”— drawing inferences from facts; the 
facts are the image abnormalities detected and 
the known clinical history, and the inferred 
information is the diagnosis and management 
decision (what to do next, such as another test 
or surgery, etc.). Such reasoning usually entails 
uncertainty, and optimally would be carried 
out using probabilistic approaches (> Chap. 
3), unless certain classic imaging patterns are 
recognized. In reality, radiology practice is 
usually carried out without formal probabi- 
listic models that relate imaging observations 
to the likelihood of diseases. However, varia- 
tion in practice is a known problem in image 
interpretation (Robinson 1997), and methods 
to improve this process are desirable. 
Informatics methods can enhance radio- 
logical interpretation of images in two major 
ways: (1) image retrieval systems and (2) 
computer-based inference systems. The con- 
cept of image retrieval is similar to that of 
information retrieval (see » Chap. 23), in 
which the user retrieves a set of documents 
pertinent to a question or information need. 
The information being sought when doing 


image retrieval is images with specific con- 
tent—typically to find images that are simi- 
lar in some ways to a query image (e.g., to 
find images in the PACS containing similar- 
appearing abnormalities to that in an image 
being interpreted). Finding images containing 
similar content is referred to as content-based
image retrieval (CBIR). By retrieving similar 
images and then looking at the diagnosis of 
those patients, the radiologist can gain greater 
confidence in interpreting the images from 
patients whose diagnosis is not yet known. 

As with the task of medical diagnosis 
(> Chap. 24), radiological diagnosis can be 
enhanced using computer-based inference 
systems, the commonest type of which is deci- 
sion support systems, which assist the physi- 
cian in making clinical decisions. In computer 
inference (also referred to as “reasoning”), the 
machine takes in the available data (the images 
and possibly other clinical information), per- 
forms a variety of image processing methods 
(> Sect. 10.4), and uses one or more types of 
knowledge resources and/or mathematical 
models to render an output comprising either 
a decision or a ranked list of possible choices 
(e.g., diagnoses or locations on the image sus- 
pected of being abnormal). 

In this section we describe informatics 
methods for image retrieval and computer 
inference with images. 


10.5.1 Content-Based Image
Retrieval


Since a key aspect of radiological inter- 
pretation is recognizing characteristic patt- 
erns in the imaging features which suggest 
the diagnosis, searching databases for simi- 
lar images with known diagnoses could be 
an effective strategy for improving diagnostic
accuracy. CBIR is the process of performing a 
match between images using their visual con- 
tent. A query image can be presented as input 
to the system (or a combination of a query 
image and the patient’s clinical record), and 
the system searches for similar cases in large 
archive settings (such as PACS) and returns 
a ranked list of such similar data (images). 
This task requires an informative representation
for the image data, along with similarity
measures across image data. CBIR methods 
are already useful in non-medical applica- 
tions such as consumer imaging and on the 
Web (Wang et al. 1997; Smeulders et al. 2000; 
Datta et al. 2008). 

There has also been ongoing work to 
develop CBIR methods in radiology and sev- 
eral reviews on this subject have been pub- 
lished (Akgul et al. 2011; Endo et al. 2012; 
Kumar et al. 2013; Muramatsu 2018). The 
approach generally is based on deriving quan- 
titative characteristics from the images (e.g., 
pixel statistics, spatial frequency content, 
etc.; » Sect. 10.4.5), followed by applica- 
tion of similarity metrics to search databases 
for similar images (Lehmann et al. 2004; 
Muller et al. 2004; Greenspan and Pinhas 
2007; Datta et al. 2008; Deserno et al. 2009; 
Napel et al. 2010; Faruque et al. 2013, 2015). 
The focus of the current work is on entire 
images, describing them with sets of numeri- 
cal features, with the goal of retrieving similar 
images from medical collections (Hersh et al. 
2009; Napel et al. 2010; Faruque et al. 2013, 
2015) that provide benchmarks for image 
retrieval. However, in many cases only a par- 
ticular region of the image is of interest when 
seeking similar images (e.g., finding images 
containing similar-appearing lesions to those 
in the query image). More recently, “local- 
ized” CBIR methods are being developed in 
which a part of the image containing a region 
of interest is analyzed (Deselaers et al. 2007; 
Rahmani et al. 2008; Napel et al. 2010). 
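
The general CBIR recipe described above (derive numerical features, then rank by a similarity metric) can be sketched as follows. The feature used here, a coarse intensity histogram, and the L1 distance are assumptions chosen for brevity; the cited systems use much richer descriptors. The tiny "database" and its image names are invented for the illustration.

```python
def histogram_feature(img, n_bins=4, max_val=256):
    """Describe an image by its normalized intensity histogram."""
    counts = [0] * n_bins
    n = 0
    for row in img:
        for v in row:
            counts[min(v * n_bins // max_val, n_bins - 1)] += 1
            n += 1
    return [c / n for c in counts]

def rank_by_similarity(query_feature, database):
    """Return database keys ordered from most to least similar (L1 distance)."""
    def distance(feature):
        return sum(abs(a - b) for a, b in zip(query_feature, feature))
    return sorted(database, key=lambda name: distance(database[name]))

# Hypothetical archive of three tiny images, indexed by their features.
archive = {
    "dark": [[10, 20], [30, 40]],
    "bright": [[200, 220], [240, 250]],
    "mixed": [[10, 20], [200, 220]],
}
database = {name: histogram_feature(img) for name, img in archive.items()}

query = [[15, 25], [35, 45]]
ranking = rank_by_similarity(histogram_feature(query), database)
```

A localized variant would compute the same kind of feature over a region of interest rather than the whole image.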

There are several unsolved challenges in 
CBIR. First, CBIR has been largely focused 
on query based on single 2-D images; methods 
need to be developed for 3D retrieval in which 
a volume is the query “image.” A second chal- 
lenge is the need to integrate images with non- 
image clinical data to permit retrieval based 
on entire patient cases and not single images 
(e.g., the CBIR method should take into con- 
sideration the clinical history in addition to 
the image appearance in retrieving a similar 
“case”). 

Another limitation of current CBIR is that 
image semantics is not routinely included. 
The qualitative image features reported by the
radiologist (“semantic features”) are comple- 
mentary to the quantitative data contained 
in image pixels. One approach to capturing 
image semantics is analyzing and processing 
“visual words” in images, captured as image 
patches or codebooks (» Sect. 10.4.5). These 
techniques have been shown to perform well 
in CBIR applications (Qiu 2002). Another 
approach to capture image semantics is to 
use the radiologist’s imaging observations as 
image features. Several studies have found 
that combining the semantic information 
obtained from radiologists’ imaging reports 
or annotations with the pixel-level features 
can enhance performance of CBIR systems 
(Ruiz 2006; Zhenyu et al. 2009; Napel et al. 
2010). The knowledge representation meth- 
ods described in » Sects. 10.3.2 and 10.4.5 
make it possible to combine these types of 
information. 


10.5.2 Computer-Based Inference 


Though image retrieval described above (and 
information retrieval in general) can be help- 
ful to a practitioner interpreting images, it 
does not directly answer a specific question at 
hand, such as, “what is the diagnosis in this 
patient” or “what imaging test should I order 
next?” Answering such questions requires 
inference, either by the physician with all 
the available data, or by a computer, using 
physician inputs and the images. As the use 
of imaging proliferates and the number of 
images being produced by imaging modali- 
ties explodes, it is becoming a major challenge 
for practicing radiologists to integrate the 
multitude of imaging data, clinical data, and 
soon molecular data, to formulate an accu- 
rate diagnosis and management plan for the 
patient. Computer-based inference systems,
specifically decision support systems, can
help radiologists understand the biomedical
import of this information and can provide
guidance (Hudson and Cohen 2009).

There are two major approaches to
computer-based inference using images: (1)
using quantitative image features only
(quantitative imaging computer inference systems),
and (2) using knowledge associated with the
images (knowledge-based computer inference
systems).


= Quantitative Imaging Computer Inference 
Systems 
The process of deriving quantitative image 
features was described in » Sect. 10.4.5. 
Quantitative imaging applications such as 
CAD and CADx use these quantifiable fea- 
tures extracted from medical images for a vari- 
ety of decision support applications, such as 
the assessment of an abnormality to suggest 
a diagnosis, or to evaluate the severity, degree 
of change, or status of a disease, injury, or 
chronic condition. In general, the quantitative 
imaging computer reasoning systems apply a 
mathematical model (e.g., a classifier) or other 
machine learning methods to obtain a deci- 
sion output based on the imaging inputs. 
There are three types of systems that make
inferences using quantitative imaging data:
computer-assisted detection (CAD), computer-assisted
diagnosis (CADx), and computerized
prediction systems. In CAD, the computer
locates ROIs in the image where abnormalities
are suspected, and the radiologist must evaluate
their medical significance. This is generally
accomplished using quantitative image analysis
methods (» Sect. 10.4.5). In CADx, the
computer is given an ROI corresponding to a
suspected abnormality (possibly with associated
clinical information) and it outputs the
likely diagnoses and possibly management
recommendations. Ideally, the system also
provides its confidence in the diagnosis, as
well as an explanation facility that lets the
user understand how that diagnosis was
determined from the facts. These systems generally
use both quantitative imaging methods
(> Sect. 10.4.5) as well as computer reasoning
methods that leverage knowledge associated
with the image (> Sect. 10.3.2). In computerized
prediction, a computational model
based on analysis of the images (potentially
integrated with other data) makes a clinical
prediction about the patient.


= CAD

In CAD applications, the goal is detection of 
abnormalities that are visible in the image, to 
scan the image and identify suspicious regions 
that may represent regions of disease in the 
patient. A common use for CAD is screening, 
the task of reviewing many images and iden- 
tifying those that are suspicious and require 
closer scrutiny by a radiologist (e.g., mam- 
mography interpretation). Most CAD appli- 
cations comprise an image processing pipeline 
(> Sect. 10.4) that uses global processing, 
segmentation, image quantitation with fea- 
ture extraction, and classification to deter- 
mine whether an image should be flagged for 
careful review by a radiologist or pathologist. 
In CAD and in screening in general, the goal
is to detect disease; thus, the tradeoff favors
accepting false positives over missing true
abnormalities (false negatives). CAD systems
therefore tend to flag a reasonable number of
normal images (false positives) while missing
very few abnormal images (false negatives). If
the number of flagged images is small compared
with the total number of images, then automated
screening procedures can be economically
viable. On the other hand, too many false
positives are time-consuming to review and
lessen user confidence in the CAD system;
thus, for CAD systems to be viable, they must
minimize the number of false positives as well
as false negatives.
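
The tradeoff can be made concrete with a small calculation. The sketch below assumes that a CAD classifier outputs a suspicion score per image and that images scoring at or above a threshold are flagged; it then finds the highest threshold that still reaches a target sensitivity and reports the resulting false positive rate. The scores and labels are invented for illustration, and the function names are hypothetical.

```python
def sensitivity_and_fpr(scores, labels, threshold):
    """Sensitivity and false positive rate when flagging scores >= threshold."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < threshold)
    return tp / (tp + fn), fp / (fp + tn)   # assumes both classes are present

def threshold_for_sensitivity(scores, labels, target):
    """Highest threshold (fewest flagged images) meeting the target sensitivity."""
    for t in sorted(set(scores), reverse=True):
        sens, fpr = sensitivity_and_fpr(scores, labels, t)
        if sens >= target:
            return t, sens, fpr
    raise ValueError("target sensitivity cannot be reached")

# Invented scores for 10 images; label 1 = abnormal, 0 = normal.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

# Missing no abnormal image forces a low threshold and more false positives.
strict = threshold_for_sensitivity(scores, labels, target=1.0)
# Accepting one missed abnormal image allows a higher threshold.
relaxed = threshold_for_sensitivity(scores, labels, target=0.75)
```

Comparing the two operating points shows how raising sensitivity increases the number of normal images sent for review.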

CAD techniques for screening have been 
applied successfully to many different types 
of images (Doi 2007), including mammogra- 
phy images for identifying mass lesions and 
clusters of microcalcifications, chest X-rays 
and CT of the chest to detect small cancer- 
ous nodules, and volumetric CT images of the 
colon (“virtual colonoscopy”) to detect polyps.
In addition, CAD methods have been applied 
to Papanicolaou (Pap) smears for cancerous 
or precancerous cells (Giger and MacMahon 
1996), as well as to many other types of non- 
radiologic images. 

As noted in » Sect. 10.4.5, detection 
tasks in images can be accomplished using 
pre-defined image features or using learned 
features (deep learning), which is becoming a 
very popular approach to CAD with encouraging
results (Firmino et al. 2014; Shin et al.
2016). 


= CADx 

In CADx applications, a suspicious region in 
the image has already been identified (by the 
radiologist or a CAD application), and the
goal is to evaluate it to render a diagnosis or 
differential diagnosis. CADx systems usually 
need to be provided an ROI, or they need to 
segment the image to locate specific organs 
and lesions in order to perform analysis of 
quantitative image features that are extracted 
from the ROI and use that to render a diag- 
nosis. However, recently, CADx systems have 
begun to be developed using deep learning, 
which generally does not require any ROI, 
since the models are built using the raw image 
data (Al-Antari et al. 2018; Ishioka et al. 2018; 
Lee et al. 2018; Nishio et al. 2018). 

In general, a mathematical model is cre- 
ated to relate the quantitative (or semantic) 
features to the likely diagnoses. These mod- 
els are built either using pre-defined image 
features or using learned features (> Sect. 
10.4.5). Most of the historical CADx systems 
have been built using pre-defined image fea- 
tures (Doi 2007), but if a sufficient number 
of labeled cases is available for training, deep 
learning appears to hold much promise for 
developing CADx systems (Chen et al. 2017; 
Hosny et al. 2018). 

A particularly important emerging role for 
CADx systems is in the diagnosis of infection 
by the SARS-CoV-2 virus (COVID-19). The 
COVID-19 pandemic created extremely rapid 
and widespread person-to-person transmis- 
sion of the disease (World Health Organization 
2020). Definitive diagnosis is made using the 
reverse transcriptase polymerase chain reaction
(RT-PCR) test. Since the test can take up to
2 days to complete, and given the shortage of
RT-PCR test kits, there was an urgent need
for alternative and rapid methods to identify
COVID-19 patients. Imaging (radiography or
CT) is commonly used to identify pneumonia,
but imaging has not, to date, been used
to establish a diagnosis of COVID-19 since
imaging findings are not specific. Thus, routine
screening CT for the identification of COVID-19
pneumonia is currently not recommended
by most radiology societies (Simpson et al.
2020). A number of recent works have been
undertaken to detect COVID-19 pneumonia
in patients using deep learning (Wang and
Wong 2020; Greenspan et al. 2020). Systems
that integrate clinical data with images may
be particularly promising (Mei et al. 2020).
Nonetheless, given that current CADx systems
for COVID-19 likely have insufficient accuracy,
and that new diagnostic tests are being
developed that are processed
in a shorter time (Billingsley 2020), imaging
will not likely have a sustained major role in
diagnosis. Other applications for CADx systems
in COVID-19 were reviewed in (Kumar
et al. 2020). Perhaps the most exciting role for
computerized systems in this disease will be
for making clinical predictions, such as need
for intensive care, ventilator support,
and ultimate survival (Liang et al. 2020;
Liu et al. 2020; Luo et al. 2020; Sperrin et al. 
2020; Wynants et al. 2020; Yang et al. 2020; 
Yuan et al. 2020). 

A limitation of using only pre-defined 
image features or unsupervised learned image 
features is that these models do not encode 
domain knowledge that may be critical to the 
accuracy of a CADx system; the presump- 
tion of using only image features is that all 
the knowledge needed for the diagnostic clas- 
sification task is represented in image data 
itself. However, in some cases, it is very useful
to encode knowledge in a CADx system.
Probabilistic models provide a strategy for 
incorporating domain knowledge and have 
been shown to be effective (Burnside et al. 
2000, 2004a, b, 2006, 2007; Lee et al. 2009; 
Liu et al. 2009; Liu et al. 2011). Image features 
are generated based on the underlying disease, 
so there is a probabilistic dependence between the
disease and the quantitative and perceived 
imaging features. In fact, it can be argued that 
radiological interpretation is fundamentally 
a Bayesian task (Lusted 1960; Ledley and 
Lusted 1991; Donovan and Manning 2007) 
(see > Chaps. 3 and 22), and thus decision- 
support strategies based on Bayesian models 
may be quite effective. 
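
A small worked example shows the Bayesian update at the heart of such models. The sketch below assumes, for simplicity, that findings are conditionally independent given the disease (a naive Bayes model, which is simpler than a general Bayesian network), and every probability in it is invented for illustration rather than taken from clinical data. The function name `posterior` is hypothetical.

```python
def posterior(prior, likelihoods, findings):
    """Posterior over diseases given which imaging findings are present.

    prior:       disease -> P(disease)
    likelihoods: disease -> finding -> P(finding present | disease)
    findings:    finding -> True if observed present, False if absent
    """
    unnormalized = {}
    for disease, p in prior.items():
        for finding, present in findings.items():
            p_present = likelihoods[disease][finding]
            p *= p_present if present else (1.0 - p_present)
        unnormalized[disease] = p
    total = sum(unnormalized.values())
    return {disease: p / total for disease, p in unnormalized.items()}

# Illustrative numbers only (not clinical values).
prior = {"malignant": 0.01, "benign": 0.99}
likelihoods = {
    "malignant": {"spiculated_margin": 0.80},
    "benign": {"spiculated_margin": 0.05},
}
result = posterior(prior, likelihoods, {"spiculated_margin": True})
```

With these assumed numbers, observing the finding raises the probability of malignancy from 1% to roughly 14%: the finding is strong evidence, yet the low prior keeps the posterior modest.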

CADx can be very effective in practice, 
reducing variation and improving positive 
predictive value of radiologists (Burnside 
Fig. 10.22 Bayesian network-based system for decision
support in mammography CADx. The radiologist
interpreting the image enters the radiology observations 
and clinical information (patient history) in a structured 
reporting Web-based data capture form to render the 
report. This form is sent to a server which inputs the 
observations into the Bayesian network to calculate pos- 


et al. 2006). Deploying CADx systems, however,
can be challenging. Since the inputs
to CADx generally need to be structured 
(semantic features from the radiologist and/ 
or quantitative features from the image), a 
means of capturing the structured image 
information as part of the routine clinical 
workflow is required. A promising approach 
is to combine structured reporting with 
CADx (Fig. 10.22); the radiologist records
the imaging observations with a data capture 
form, which provides the structured image 
content required to the CADx system. Ideally 
the output would be presented immediately 
to the radiologist as the report is generated 
so that the output of decision support can be 
incorporated into the radiology report. Such 
implementations will be greatly facilitated by 
informatics methods to extract and record 
the image information in structured and 


terior probabilities of disease. A list of diseases, ranked 
by the probability of each disease, is returned to the user
who can make a decision based on a threshold of prob- 
ability of malignancy, or based on shared decision mak- 
ing with the patient. (Figure reprinted with permission 
from (Rubin 2011). © Radiological Society of North 
America) 


standard formats and with controlled termi- 
nologies (> Sect. 10.3.2). 


= Computerized Prediction 

The goal of computerized prediction using 
images is to analyze characteristics of the 
disease manifest in the image and use that 
(with or without additional clinical data)
to make predictions about the disease (e.g., 
life expectancy of the patient, whether or 
not the patient’s disease will respond to a 
particular treatment, or whether the disease 
will recur or progress at some time in the 
future). Many methods have been developed 
to predict such future event, using both pre- 
defined image features and unsupervised fea- 
ture learning (> Sect. 10.4.5) (Huang et al. 
2016; Jun et al. 2016; Liet al. 2016; Nie et al. 
2016; Bogowicz et al. 2017; Fave et al. 2017; 
van Timmeren et al. 2017; Wu et al. 2017; 
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Zhou et al. 20175; Betancur et al. 2018; Cha 
et al. 2018; Gastounioti et al. 2018; Shietal. 
2018). 


Knowledge-Based Image Inference Systems

The CAD and CADx systems do not require 
processing radiological knowledge (e.g., ana- 
tomic knowledge) in order to carry out their 
tasks; they are based on quantitative modeling 
of relationships of image features to diagnoses.
However, not all image-based reasoning
problems are amenable to this approach. In 
particular, knowledge-based tasks such as rea- 
soning about anatomy, physiology, and pathol- 
ogy—tasks that entail symbolic manipulations 
of biomedical knowledge and application of 
logic—are best handled using different meth- 
ods, such as ontologies and logical inference 
(see Chap. 24).

Knowledge-based computer reasoning 
applications use knowledge representations, 
generally ontologies, in conjunction with 
rules of logic to deduce information from 
asserted facts (e.g., from observations in the 
image). For example, an anatomy ontology 
may express the knowledge that “if a seg- 
ment of a coronary artery is severed, then 
branches distal to the severed branch will not 
receive blood,” and “the anterior and lateral 
portions of the right ventricle are supplied 
by branches of the right coronary artery, 
with little or no collateral supply from the 
left coronary artery.” Using this knowledge, 
and recognition via image processing that the 
right coronary artery is severed in an injury, a 
computer reasoning application could deduce 
that the anterior and lateral portions of the 
right ventricle will become ischemic (among 
other regions; Fig. 10.23). In performing
this reasoning task, the application uses the 
knowledge to draw correct conclusions by 
manipulating the anatomical concepts and 
relationships using the rules of logical infer- 
ence during the reasoning process. 

Computer reasoning with ontologies is 
performed by one of two methods: (1) rea- 
soning by ontology query and (2) reasoning by 
logical inference. In reasoning by ontology 
query, the application traverses relationships
that link particular entities in the ontology
to directly answer particular questions about
how those entities relate to each other. For 
example, by traversing the part-of relationship 
in an anatomy ontology, a reasoning applica- 
tion can infer that the left ventricle and right 
ventricle are part-of the chest (given that the 
ontology asserts they are each part of the 
heart, and that the heart is part of the chest), 
without our needing to specify this fact explic- 
itly in the ontology. 
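Reasoning by ontology query, as in the part-of example above, amounts to a graph traversal. The following is a minimal sketch under stated assumptions: the ontology is reduced to a Python dictionary of asserted part-of links, and the entity names are illustrative rather than identifiers from a published anatomy ontology.

```python
# Asserted part-of links only; "ventricle is part of chest" is NOT asserted.
PART_OF = {
    "left ventricle": ["heart"],
    "right ventricle": ["heart"],
    "heart": ["chest"],
}


def is_part_of(part, whole, part_of=PART_OF):
    """Return True if `whole` is reachable from `part` via part-of links."""
    stack = list(part_of.get(part, []))
    seen = set()
    while stack:
        current = stack.pop()
        if current == whole:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(part_of.get(current, []))
    return False


if __name__ == "__main__":
    print(is_part_of("left ventricle", "chest"))
    print(is_part_of("chest", "heart"))
```

The first query succeeds even though no direct link between the ventricle and the chest was asserted, because the traversal follows the relationship transitively.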

In reasoning by logical inference, ontolo- 
gies that encode sufficient information 
(“explicit semantics”) to apply generic rea- 
soning engines are used. The Web Ontology 
Language (OWL) (Bechhofer et al. 2004; Smith 
et al. 2004; Motik et al. 2008) is an ontology 
language recommended by the World Wide 
Web Consortium (W3C) as a standard lan- 
guage for the Semantic Web (WorldWideWeb 
Consortium W3C Recommendation 10 Feb 
2004). OWL is similar to other ontology lan- 
guages in that it can capture knowledge by 
representing the entities (“classes”) and their 
attributes (“properties”). In addition, OWL 
provides the capability of defining “formal 
semantics” or meaning of the entities in the 
ontology. Entities are defined using logic state- 
ments that provide assertions about entities 
(“class axioms”) using description logics (DL) 
(Grau et al. 2008). DLs provide a formalism 
enabling developers to define precise seman- 
tics of knowledge in ontologies and to per- 
form automated deductive reasoning (Baader 
et al. 2003). For example, an anatomy ontol- 
ogy in OWL could provide precise semantics 
for “hemopericardium,” by defining it as a 
pericardial cavity that contains blood. 
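In description-logic notation, such a definition could be written as the class axiom below. The class and property names (PericardialCavity, contains, Blood) are illustrative assumptions, not identifiers taken from a published ontology.

```latex
\[
\mathit{Hemopericardium} \;\equiv\; \mathit{PericardialCavity} \;\sqcap\; \exists\, \mathit{contains}.\mathit{Blood}
\]
```

Because the axiom is an equivalence rather than a one-way implication, a reasoner can classify any pericardial cavity that is asserted to contain blood as a hemopericardium, without that classification being stated explicitly.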

Highly optimized computer reasoning 
engines (“reasoners”) have been developed for 
OWL, helping developers to incorporate rea- 
soning efficiently and effectively in their appli- 
cations (Tsarkov and Horrocks 2006; Motik 
et al. 2009). These reasoners work with OWL 
ontologies by evaluating the asserted logical 
statements about classes and their proper- 
ties in the original ontology (the “asserted 
ontology”), and they create a new ontology 
structure that is deduced from the asserted 
knowledge (the “inferred ontology”). This 
reasoning process is referred to as “automatic
classification.” The inferences from the reasoning
process are obtained by querying the inferred ontology
and looking for classes (or individuals) that have been
assigned to classes of interest in the ontology. For
example, an application was created to infer the
consequences of cardiac injury in this manner
(Fig. 10.23).

Fig. 10.23 Knowledge-based reasoning with images in a task to predict
the portions of the heart that will become ischemic after a penetrating
injury that injures particular anatomic structures. The application
allows the user to draw a trajectory of penetrating injury on the
image, a 3-D rendering of the heart obtained from segmented CT images.
The reasoning application automatically carries out two tasks. (a) The
application first deduces the anatomic structures that will be injured
consequent to the trajectory (arrow, right) by interrogating semantic
annotations on the image based on the trajectory of injury (injured
anatomic structures shown in bold in the left panel). (b) The anatomic
structures that are predicted to be initially injured are displayed in
the volume rendering (dark gray = total ischemia; light gray = partial
ischemia). In this example, the right coronary artery was injured, and
the reasoning application correctly inferred there will be total
ischemia of the anterior and lateral wall of the right ventricle and
partial ischemia of the posterior wall of the left ventricle.
Panel labels: Trajectory of injury (hitting coronary artery); Totally
ischemic myocardium.
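The relationship between an asserted and an inferred ontology can be illustrated with a toy sketch. It is not a description-logic reasoner: it handles only definitions of the form "parent class with at least one property value from a given class", makes a single pass, and treats property values as class names. All class, property, and individual names are hypothetical.

```python
# Defined classes: name -> (parent class, property, required filler class).
DEFINITIONS = {
    "Hemopericardium": ("PericardialCavity", "contains", "Blood"),
}

# Asserted knowledge about two hypothetical individuals.
ASSERTED = {
    "cavity_1": {"types": {"PericardialCavity"}, "contains": {"Blood"}},
    "cavity_2": {"types": {"PericardialCavity"}, "contains": {"SerousFluid"}},
}


def classify(asserted, definitions):
    """Build the inferred types of each individual from the asserted facts."""
    inferred = {}
    for name, facts in asserted.items():
        types = set(facts["types"])  # copy, so asserted facts stay unchanged
        for cls, (parent, prop, filler) in definitions.items():
            if parent in types and filler in facts.get(prop, set()):
                types.add(cls)
        inferred[name] = types
    return inferred


def instances_of(cls, inferred):
    """Query the inferred structure for members of a class of interest."""
    return sorted(name for name, types in inferred.items() if cls in types)


if __name__ == "__main__":
    inferred = classify(ASSERTED, DEFINITIONS)
    print(instances_of("Hemopericardium", inferred))
```

As in the text, the answer is obtained by querying the inferred structure rather than the asserted one; a production system would delegate the classification step to an OWL reasoner.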

Several knowledge-based image reasoning 
systems have been developed that use ontolo- 
gies as the knowledge source to process the 
image content and derive inferences from 
them. These include: (1) reasoning about the 
anatomic consequences of penetrating injury, 
(2) inferring and simulating the physiologi- 
cal changes that will occur given anatomic 
abnormalities seen in images, (3) automated 
disease grading/staging to infer the grade and/ 
or stage of disease based on imaging features 
of disease in the body, (4) surgical planning by
deducing the functional significance of dis- 
ruption of white matter tracts in the brain, 
(5) inferring the types of information users 
seek based on analyzing query logs of image 
searches, and (6) inferring the response of dis- 
ease in patients to treatment based on analysis 
of serial imaging studies. We briefly describe 
these applications. 


Reasoning about anatomic consequences of penetrating injury

In this system, images were segmented and 
semantic annotations applied to identify car- 
diac structures. An ontology of cardiac anat- 
omy in OWL was used to encode knowledge 
about anatomic structures and the portions 
of them that are supplied by different arte- 
rial branches. Using knowledge about part-of 
relationships and connectivity, the applica- 
tion uses the anatomy ontology to infer the 
anatomic consequences of injury that are recognized
on the input images (Fig. 10.23)
(Rubin et al. 2004, 2005, 2006a). 
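The inference described above can be sketched as two lookups: find every arterial branch distal to the injured vessel, then find the myocardial regions whose supply is wholly or partly lost. The sketch below replaces the OWL ontology with plain dictionaries, and the arterial tree and supply table are deliberately simplified, hypothetical stand-ins for the encoded anatomic knowledge.

```python
# Hypothetical, highly simplified arterial tree: artery -> distal branches.
BRANCHES = {
    "right coronary artery": ["acute marginal artery", "posterior descending artery"],
    "left coronary artery": ["left anterior descending artery", "circumflex artery"],
}

# Hypothetical supply knowledge: myocardial region -> arteries supplying it.
SUPPLIED_BY = {
    "anterior wall of right ventricle": {"acute marginal artery"},
    "lateral wall of right ventricle": {"acute marginal artery"},
    "posterior wall of left ventricle": {"posterior descending artery", "circumflex artery"},
    "anterior wall of left ventricle": {"left anterior descending artery"},
}


def distal_to(artery, branches=BRANCHES):
    """Return the severed artery together with every branch downstream of it."""
    lost, stack = set(), [artery]
    while stack:
        current = stack.pop()
        if current not in lost:
            lost.add(current)
            stack.extend(branches.get(current, []))
    return lost


def ischemia(severed, supplied_by=SUPPLIED_BY):
    """Map each affected region to 'total' or 'partial' ischemia."""
    lost = distal_to(severed)
    result = {}
    for region, suppliers in supplied_by.items():
        if suppliers <= lost:      # every supplying artery is cut off
            result[region] = "total"
        elif suppliers & lost:     # some collateral supply remains
            result[region] = "partial"
    return result


if __name__ == "__main__":
    for region, degree in ischemia("right coronary artery").items():
        print(f"{region}: {degree}")
```

In the system described in the text, the injured vessel would come from the semantic annotations intersected by the injury trajectory rather than from a hard-coded string.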


Inferring and simulating the physiological changes

Morphological changes in anatomy have
physiological consequences. For example,
if a hole appears in the septum dividing
the atria or ventricles of the heart (a septal
defect), then blood will flow abnormally
between the heart chambers and will produce
abnormal physiological blood flow.
The simulation community has created 
mathematical models to predict the physi- 
ological signals, such as time-varying pres- 
sure and flow, given particular parameters 
in the model such as capacitance, resistance, 
etc. The knowledge in these mathematical 
models can be represented ontologically, in 
which the entities correspond to nodes in 
the simulation model; the advantage is that 
a graphical representation of the ontology, 
corresponding to a graphical representation 
of the mathematical model, can be created. 
Morphological alterations seen in images 
can be directly translated into alterations in 
the ontological representation of the ana- 
tomic structures, and simultaneously can 
update the simulation model appropriately 
to simulate the physiological consequences 
of the morphological anatomic alteration 
(Rubin et al. 2006b). Such knowledge-based 
image reasoning methods could greatly 
enable functional evaluation of the static 
abnormalities seen in medical imaging. 
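To make the idea of a parameterized simulation model concrete, the following sketch integrates a two-element Windkessel model, C dP/dt = Q(t) - P/R, a standard lumped-parameter description of pressure P in a compliant compartment with inflow Q, resistance R, and capacitance C. The source does not state which model the cited work used, so the model choice, the parameter values, and the arbitrary units are all assumptions made for illustration.

```python
def simulate_pressure(inflow, resistance, capacitance,
                      p0=0.0, dt=0.001, duration=20.0):
    """Integrate C * dP/dt = Q(t) - P / R with forward Euler.

    `inflow` is a function of time returning Q(t). Returns the list of
    pressures at every time step, starting with the initial pressure p0.
    """
    steps = int(round(duration / dt))
    pressure = p0
    pressures = [pressure]
    for i in range(steps):
        q = inflow(i * dt)
        pressure += dt * (q - pressure / resistance) / capacitance
        pressures.append(pressure)
    return pressures


if __name__ == "__main__":
    constant_inflow = lambda t: 5.0  # hypothetical steady inflow

    # Baseline parameters (hypothetical).
    baseline = simulate_pressure(constant_inflow, resistance=2.0, capacitance=1.5)

    # A morphological change seen on imaging is represented here, purely for
    # illustration, as a lower resistance in the corresponding model node.
    altered = simulate_pressure(constant_inflow, resistance=1.0, capacitance=1.5)

    print(f"baseline final pressure: {baseline[-1]:.2f}")
    print(f"altered final pressure:  {altered[-1]:.2f}")
```

With a constant inflow the pressure settles toward Q times R, so changing one parameter in the model node shifts the simulated signal; this is the kind of update that an ontology-linked simulation would perform when the anatomic representation changes.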


Automated disease grading/staging

A great deal of image-based knowledge is 
encoded in the literature and not readily avail- 
able to clinicians needing to apply it. A good 
example of this is the criteria used to grade 
and stage disease based on imaging crite- 
ria. For example, there are detailed criteria 
specified for staging tumors and grading the 
severity of disease. This knowledge has been 
encoded in OWL ontologies and used to 
automate grading of brain gliomas (Marquet 
et al. 2007) and staging of cancer (Dameron 
et al. 2006) based on the imaging features 
detected by radiologists. This ontology-based 
paradigm could provide a good model for 
delivering current biomedical knowledge 
to practitioners “just-in-time” to help them 
grade and stage disease as they view images 
and record their observations. 


Surgical planning

Understanding complex anatomic relationships
and their functional significance in the patient
is crucial in surgical planning, particularly
for brain surgery, since there are many
surgical approaches possible, and some will 
have less severe consequences to patients than 
others. It can be challenging to be aware of all 
these relationships and functional dependen- 
cies; thus, surgical planning is an opportune 
area to develop knowledge-based image rea- 
soning systems. The anatomic and functional 
knowledge can be encoded in an ontology and 
used by an application to plan the optimal 
surgical approach. In recent work, such an 
ontological model was developed to assess the 
functional sequelae of disruptions of motor 
pathways in the brain, which could be used 
in the future to guide surgical interventions 
(Talos et al. 2008; Rubin et al. 2009b). 


Inferring types of information users seek from images

Knowledge-based reasoning approaches 
have been used to evaluate image search 
logs on Web sites that host image databases 
to ascertain the types of queries users submit.
RadLex (Sect. 10.3.2) was used as the
ontology, and by mapping the queries to leaf 
classes in RadLex and then traversing the sub- 
sumption relations, the types of queries could 
be deduced by interrogating the higher-level 
classes in RadLex (such as “visual observa- 
tion” and “anatomic entity”) (Rubin et al. 
2011). 
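The query-log analysis described above can be sketched as a lookup followed by an upward walk along subsumption (is-a) links. The hierarchy fragment and the term index below are hypothetical and do not reproduce the actual structure of RadLex; only the two top-level class names are taken from the text.

```python
from collections import Counter

# Hypothetical is-a fragment: child class -> parent class.
IS_A = {
    "nodule": "lesion",
    "lesion": "visual observation",
    "opacity": "visual observation",
    "liver": "abdominal organ",
    "abdominal organ": "anatomic entity",
}

# Hypothetical index from normalized query text to a leaf class.
TERM_INDEX = {"nodule": "nodule", "opacity": "opacity", "liver": "liver"}


def top_level_class(term, is_a=IS_A):
    """Follow is-a links upward until a class with no parent is reached."""
    while term in is_a:
        term = is_a[term]
    return term


def categorize(queries, term_index=TERM_INDEX, is_a=IS_A):
    """Count queries by the top-level class of the leaf they map to."""
    counts = Counter()
    for query in queries:
        leaf = term_index.get(query.strip().lower())
        category = top_level_class(leaf, is_a) if leaf else "unmapped"
        counts[category] += 1
    return counts


if __name__ == "__main__":
    log = ["Liver", "nodule ", "opacity", "xyz"]
    print(categorize(log))
```

Real query logs would need a more forgiving mapping step (synonyms, multi-word phrases) before the traversal, which this sketch omits.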


Inferring the response of disease to treatment

As mentioned above, the complex knowl- 
edge required to grade and stage disease can 
be represented using an ontology. Similarly, 
the criteria used to assess the response of
patients to treatment are also complex, evolving,
and dependent on numerous aspects of
image information. The knowledge needed to
apply criteria of disease response assessment
has been encoded ontologically, specifically
in OWL, and used to determine automatically
the degree of cancer response to treatment
in patients (Levy et al. 2009; Levy and
Rubin 2011). The inputs to the computerized
reasoning method are the quantitative
information about lesions seen in the images,
recorded as semantic annotations using the
AIM information model (Sect. 10.3.2).


This application demonstrates the poten- 
tial for a streamlined workflow of radiology 
image interpretation and lesion measurement 
automatically feeding into decision support 
to guide patient care. 
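A minimal sketch of such a workflow follows. The text does not name the response criteria that were encoded, so the sketch assumes the target-lesion thresholds of RECIST 1.1, and it replaces the OWL ontology and reasoner with plain Python rules. It ignores non-target lesions, new lesions, and the lymph-node short-axis rule, and the lesion records are hypothetical stand-ins for AIM annotations.

```python
def sum_of_diameters(lesions):
    """Sum the longest diameters (mm) of the annotated target lesions."""
    return sum(lesion["longest_diameter_mm"] for lesion in lesions)


def target_response(baseline_sum, nadir_sum, current_sum):
    """Classify target-lesion response from sums of diameters (mm).

    Assumed RECIST 1.1 thresholds:
      CR: all target lesions have disappeared
      PD: at least 20% and at least 5 mm increase over the smallest sum so far
      PR: at least 30% decrease from the baseline sum
      SD: none of the above
    """
    if current_sum == 0:
        return "CR"
    if current_sum - nadir_sum >= 5 and current_sum >= 1.2 * nadir_sum:
        return "PD"
    if current_sum <= 0.7 * baseline_sum:
        return "PR"
    return "SD"


if __name__ == "__main__":
    # Hypothetical lesion measurements at two time points.
    baseline = [{"longest_diameter_mm": 60}, {"longest_diameter_mm": 40}]
    follow_up = [{"longest_diameter_mm": 40}, {"longest_diameter_mm": 25}]

    baseline_sum = sum_of_diameters(baseline)
    current_sum = sum_of_diameters(follow_up)
    nadir_sum = min(baseline_sum, current_sum)
    print(target_response(baseline_sum, nadir_sum, current_sum))
```

Progression is tested before partial response because, under the assumed criteria, growth relative to the smallest sum recorded so far takes precedence over shrinkage relative to baseline.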


10.6 Conclusions 


This chapter focuses on methods for com- 
putational representation and for processing 
images in biomedicine, with an emphasis on 
radiological imaging and the extraction and 
characterization of anatomical structure and 
abnormalities. It has been emphasized that 
the content of images is complex—compris- 
ing both quantitative and semantic informa- 
tion. Methods of making that content explicit 
and computationally-accessible have been 
described, and they are crucial to enable com- 
puter applications to access the “biomedical 
meaning” in images; presently, the vast archives 
of images are poorly utilized because the image 
content is not explicit and accessible. As the 
methods to extract quantitative and semantic 
image information become more widespread, 
image databases will be as useful to the discov- 
ery process as the biological databases (they 
will even likely become linked), and an era of 
“data-driven” and “high-throughput imaging” 
will be enabled, analogous to modern “high- 
throughput” biology. In addition, the computa- 
tional imaging methods will lead to applications 
that leverage the image content, such as CAD/ 
CADx and knowledge-based image reasoning 
that use image content to improve physicians’ 
capability to care for patients. 

Though this chapter has focused on radi- 
ology, we stress that the biomedical imaging 
informatics methods presented are generaliz- 
able and either have been or will be applied 
to other domains in which visualization and 
imaging are becoming increasingly impor- 
tant, such as microscopy, pathology, ophthal- 
mology, and dermatology. As new imaging 
modalities increasingly become available 
for imaging other and more detailed body 
regions, the techniques presented in this chap- 
ter will increasingly be applied in all areas of 
biomedicine. For example, the development
of molecular imaging methods is analogous
to functional brain imaging, since functional 
data, in this case from gene expression rather 
than cognitive activity, can be mapped to an 
anatomical substrate. 

Thus, the general biomedical imaging 
informatics methods described here will 
increasingly be applied to diverse areas of bio- 
medicine. As they are applied, and as imaging 
modalities continue to proliferate, a growing 
demand will be placed on leveraging the con- 
tent in these images to characterize the clinical 
phenotype of disease and relate it to genotype 
and clinical data from patients to enhance 
research and clinical care.


Questions for Discussion

1. How might you create an image 
processing pipeline to build an 
image-analysis program looking for 
abnormal cells in a PAP smear? How 
would you collect and incorporate 
semantic features into the program? 

2. Why is segmentation so difficult to per- 
form? Give two examples of ways by 
which current systems avoid the prob- 
lem of automatic segmentation. 

3. How might you build a decision-support 
system that is based on searching the 
hospital image archive for similar images 
and returning the diagnosis associated
with the most similar images? How
might you make use of the semantic
information in images to
improve the accuracy of retrieval?

4. What are limitations of deep learning 
compared with other machine learning 
methods for creating image interpreta- 
tion systems? When might a classical 
machine learning approach be better 
than deep learning? 

5. Give an example of how knowledge 
about the problem to be solved (e.g., 
local anatomy in the image) could be 
used in future systems to aid in auto- 
matic segmentation. 

6. Both images and free text share the 
characteristic that they are unstructured 
information; image processing methods 
to make the biomedical content in 
images explicit are very similar to 
related problems in natural language 
processing (NLP; Chap. 8). How are
image processing methods and NLP 
similar in terms of (1) computer 
representation of the raw content? (2) 
representation of the semantic content? 
(3) processing of the content (e.g., what 
is the NLP equivalent of segmentation, 
or the image processing equivalent of 
named entity recognition)? 


References 


Abou-Elkacem, L., Bachawal, S. V., & Willmann, J. K. 
(2015). Ultrasound molecular imaging: Moving 
toward clinical translation. European Journal of 
Radiology, 84(9), 1685-1693. 

Agrawal, M., Harwood, D., Duraiswami, R., Davis, 
L. S., & Luther, P. W. (2000). Three-dimensional 
ultrastructure from transmission electron microscope
tilt series. Proceedings, Second Indian
Conference on Vision, Graphics and Image
Processing. Bangalore, India.

Aine, C. J. (1995). A conceptual overview and critique of 
functional neuroimaging techniques in humans 
I. MRI/fMRI and PET. Critical Reviews in 
Neurobiology, 9, 229-309. 

Akgul, C. B., Rubin, D. L., Napel, S., Beaulieu, C. F., 
Greenspan, H., & Acar, B. (2011). Content-based 
image retrieval in radiology: Current status and 
future directions. Journal of Digital Imaging, 24(2), 
208-222. 

Al-Antari, M. A., Al-Masni, M. A., Choi, M. T., Han,
S. M., & Kim, T. S. (2018). A fully integrated
computer-aided diagnosis system for digital X-ray
mammograms via deep learning detection, segmentation,
and classification. International Journal of
Medical Informatics, 117, 44-54.

Alberini, J. L., Edeline, V., Giraudet, A. L., Champion, 
L., Paulmier, B., Madar, O., Poinsignon, A., Bellet, 
D., & Pecking, A. P. (2011). Single photon emission 
tomography/computed tomography (SPET/CT) and 
positron emission tomography/computed tomogra- 
phy (PET/CT) to image cancer. Journal of Surgical 
Oncology, 103(6), 602-606.

André, B., Vercauteren, T., Perchant, A., Buchner, 
A. M., Wallace, M. B., & Ayache, N. (2009). 
Introducing space and time in local feature-based 
Endomicroscopic image retrieval. Medical content- 
based retrieval for clinical decision support. 
B. Caputo, H. Müller, T. Syeda-Mahmood et al.
Berlin, Heidelberg, Springer. Lecture Notes in 
Computer Science, 5853, 18-30. 

Appel, B. (2001). Nomenclature and classification of 
lumbar disc pathology. Neuroradiology, 43(12), 
1124-1125. 

Armstrong, R. A. (2010). Review paper. Quantitative 
methods in neuropathology. Folia Neuropathologica, 
48(4), 217-230. 

Ashburner, J., & Friston, K. J. (1997). Multimodal 
image coregistration and partitioning - a unified 
framework. NeuroImage, 6(3), 209-217.

Avni. (2009). Addressing the ImageClef 2009 challenge
using a patch-based visual words representation.
Working notes CLEF2009.
http://www.clef-campaign.org/2009/working_notes/avni-paperCLEF2009.pdf

Baader, F. E., McGuinness, D. E., Nardi, D. E., 
Schneider, P. P. E., & Calvanese, D. E. (Eds.). (2003). 
The description logic handbook: Theory, implementa- 
tion and applications. Cambridge University Press: 
New York. 

Baker, J. A., Kornguth, P. J., Lo, J. Y., Williford, 
M. E., & Floyd, C. E., Jr. (1995). Breast cancer: 
Prediction with artificial neural network based on 
BI-RADS standardized lexicon. Radiology, 
196(3), 817-822. 

Baumann, B., Gotzinger, E., Pircher, M., Sattmann, H., 
Schuutze, C., Schlanitz, F., Ahlers, C., Schmidt- 
Erfurth, U., & Hitzenberger, C. K. (2010). 
Segmentation and quantification of retinal lesions 
in age-related macular degeneration using 
polarization-sensitive optical coherence tomogra- 
phy. Journal of Biomedical Optics, 15(6), 061704. 

Bechhofer, S., van Harmelen, F., Hendler, J., Horrocks, 
I., McGuinness, D. L., Patel-Schneider, P. F., & 
Stein, L. A. (2004). OWL web ontology language 
reference. Technical report REC-owl-ref-20040210,
the World Wide Web Consortium. Available from
http://www.w3.org/TR/2004/REC-owl-ref-20040210/

Becich, M. J. (2000). The role of the pathologist as tissue 
refiner and data miner: The impact of functional 
genomics on the modern pathology laboratory and 
the critical roles of pathology informatics and bio- 
informatics. Molecular Diagnosis, 5(4), 287-299. 




Bengio, Y., Yao, L., Alain, G., & Vincent, P. (2013). 
Generalized Denoising auto-encoders as generative 
models. In C. J. C. Burges, L. Bottou, M. Welling, 
Z. Ghahramani, & K. Q. Weinberger (Eds.), 
Advances in neural information processing systems 26 
(pp. 899-907). Curran Associates, Inc. 

Bennett, T. J., & Barry, C. J. (2009). Ophthalmic imaging 
today: An ophthalmic photographer’s viewpoint — 
A review. Clinical & Experimental Ophthalmology, 
37(1), 2-13. 

Betancur, J., Commandeur, F., Motlagh, M., Sharir, T., 
Einstein, A. J., Bokhari, S., Fish, M. B., Ruddy, 
T. D., Kaufmann, P., Sinusas, A. J., Miller, E. J., 
Bateman, T. M., Dorbala, S., Di Carli, M., 
Germano, G., Otaki, Y., Tamarappoo, B. K., Dey, 
D., Berman, D. S., & Slomka, P. J. (2018). Deep 
learning for prediction of obstructive disease from 
fast myocardial perfusion SPECT: A multicenter 
study. JACC: Cardiovascular Imaging. 

Bidgood, W. D., Jr., & Horii, S. C. (1992). Introduction 
to the ACR-NEMA DICOM standard. 
Radiographics, 12(2), 345-355. 

Billingsley, A. (2020). The Latest in Coronavirus 
(COVID-19) Testing Methods and Availability. 
Retrieved 5/17/2020, from
https://www.goodrx.com/blog/coronavirus-covid-19-testing-updates-methods-cost-availability/

Bishop, C. M. (1995). Neural networks for pattern recog- 
nition. Oxford/New York: Clarendon Press; Oxford 
University Press. 

Biswal, S., Resnick, D. L., Hoffman, J. M., & Gambhir, 
S. S. (2007). Molecular imaging: Integration of 
molecular imaging into the musculoskeletal imaging 
practice. Radiology, 244(3), 651-671.

Bittorf, A., Bauer, J., Simon, M., & Diepgen, T. L. 
(1997). Web-based training modules in dermatol- 
ogy. MD Computing, 14(5), 371-376. 381. 

Bloom, F. E., & Young, W. G. (1993). Brain browser. 
New York: Academic. 

Bodenreider, O. (2008). Biomedical ontologies in action: 
Role in knowledge management, data integration 
and decision support. Yearbook of Medical 
Informatics, 67-79. 

Bogowicz, M., Riesterer, O., Stark, L. S., Studer, G., 
Unkelbach, J., Guckenberger, M., & Tanadini- 
Lang, S. (2017). Comparison of PET and CT 
radiomics for prediction of local tumor control in 
head and neck squamous cell carcinoma. Acta 
Oncologica, 56(11), 1531-1536. 

Bosch, A., Munoz, X., Oliver, A., & Marti, J. (2006). 
Modeling and classifying breast tissue density in 
mammograms. Computer Vision and Pattern 
Recognition, IEEE Computer Society Conference on, 
2, 1552-1558. 

Brain Innovation B.V. (2001). BrainVoyager. From 
http://www.BrainVoyager.de/ 

Brinkley, J. F. (1993). The potential for three-dimen- 
sional ultrasound. In F. A. Chervenak, G. C. 
Isaacson, & S. Campbell (Eds.), Ultrasound in 
obstetrics and gynecology. Boston: Little, Brown 
and Company. 




Brinkley, J. F., Bradley, S. W., Sundsten, J. W., & Rosse, 
C. (1997). The digital anatomist information system 
and its use in the generation and delivery of web- 
based anatomy atlases. Computers and Biomedical 
Research, 30, 472-503. 

Brinkley, J. F., Wong, B. A., Hinshaw, K. P., & Rosse, C. 
(1999). Design of an anatomy information system. 
Computer Graphics and Applications, 19(3), 38-48. 

Brown, D. B., Gould, J. E., Gervais, D. A., Goldberg, 
S. N., Murthy, R., Millward, S. F., Rilling, W. S., 
Geschwind, J. F. S., Salem, R., Vedantham, S., 
Cardella, J. F., Soulen, M. C., Techn, S. I. R., & 
Tumor, I. W. G. 1.-G. (2009). Transcatheter therapy 
for hepatic malignancy: Standardization of termi- 
nology and reporting criteria (reprinted from J Vasc 
Interv Radiol, vol 18, pg 1469-1478, 2007). Journal 
of Vascular and Interventional Radiology, 20(7), 
S425-S434.

Burnside, E., Rubin, D., & Shachter, R. (2000). A 
Bayesian network for mammography. Proceedings 
of the AMIA Symposium, 106-110. 

Burnside, E. S., Rubin, D. L., & Shachter, R. D. (2004a). 
Using a Bayesian network to predict the probability 
and type of breast cancer represented by microcalci- 
fications on mammography. Studies in Health 
Technology and Informatics, 107(Pt 1), 13-17. 

Burnside, E. S., Rubin, D. L., Shachter, R. D., Sohlich, 
R. E., & Sickles, E. A. (2004b). A probabilistic 
expert system that provides automated 
mammographic-histologic correlation: Initial expe- 
rience. AJR. American Journal of Roentgenology, 
182(2), 481-488.

Burnside, E. S., Rubin, D. L., Fine, J. P., Shachter, R. D., 
Sisney, G. A., & Leung, W. K. (2006). Bayesian net- 
work to predict breast cancer risk of mammo- 
graphic microcalcifications and reduce number of 
benign biopsy results: Initial experience. Radiology, 
240(3), 666-673. 

Burnside, E. S., Ochsner, J. E., Fowler, K. J., Fine, J. P., 
Salkowski, L. R., Rubin, D. L., & Sisney, G. A. 
(2007). Use of microcalcification descriptors in 
BI-RADS 4th edition to stratify risk of malignancy. 
Radiology, 242(2), 388-395. 

Burnside, E. S., Davis, J., Chhatwal, J., Alagoz, O., 
Lindstrom, M. J., Geller, B. M., Littenberg, B., 
Shaffer, K. A., Kahn, C. E., Jr., & Page, C. D. 
(2009). Probabilistic computer model developed 
from clinical data in national mammography data- 
base format to classify mammographic findings. 
Radiology, 251(3), 663-672. 

Buxton, R. B. (2009). Introduction to functional magnetic
resonance imaging: Principles and techniques.
Cambridge, UK; New York: Cambridge
University Press.

caBIG In-vivo Imaging Workspace. (2008). Annotation 
and Image Markup (AIM). Retrieved December 26, 
2008, from https://cabig.nci.nih.gov/tools/AIM 

Cabrera Fernandez, D., Salinas, H. M., & Puliafito, 
C. A. (2005). Automated detection of retinal layer 
structures on optical coherence tomography images. 
Optics Express, 13(25), 10200-10216. 




Caputo, B., Tommasi, T., & Orabona, F. (2008).
Discriminative cue integration for medical image 
annotation. Pattern Recognition Letters, 29(15), 
1996-2002. 

Carneiro, G., Chan, A. B., Moreno, P. J., & Vasconcelos, 
N. (2007). Supervised learning of semantic classes 
for image annotation and retrieval. IEEE
Transactions on Pattern Analysis and Machine 
Intelligence, 29(3), 394-410. 

Carpenter, A. E., Jones, T. R., Lamprecht, M. R., 
Clarke, C., Kang, I. H., Friman, O., Guertin, D. A., 
Chang, J. H., Lindquist, R. A., Moffat, J., Golland, 
P., & Sabatini, D. M. (2006). CellProfiler: Image
analysis software for identifying and quantifying 
cell phenotypes. Genome Biology, 7(10), R100. 

Caviness, V. S., Meyer, J., Makris, N., & Kennedy, D. N.
(1996). MRI-based topographic parcellation of 
human neocortex: An anatomically specified 
method with estimate of reliability. Journal of 
Cognitive Neuroscience, 8(6), 566-587. 

Cha, Y. J., Jang, W. I., Kim, M. S., Yoo, H. J., Paik, 
E. K., Jeong, H. K., & Youn, S. M. (2018). Prediction 
of response to stereotactic radiosurgery for Brain 
metastases using convolutional neural networks. 
Anticancer Research, 38(9), 5437-5445. 

Chakraborty, A., Staib, L. H., & Duncan, J. S. (1996). 
Deformable boundary finding in medical images by 
integrating gradient and region information. IEEE 
Transactions on Medical Imaging, 15(6), 
859-870. 

Channin, D. S., Mongkolwat, P., Kleper, V., & Rubin, 
D. L. (2009a). Computing human image annota- 
tion. Conference Proceedings: Annual International 
Conference of the IEEE Engineering in Medicine and 
Biology Society, 1, 7065-7068. 

Channin, D. S., Mongkolwat, P., Kleper, V., Sepukar, 
K., & Rubin, D. L. (2009b). The caBIG annotation 
and image markup project. Journal of Digital 
Imaging. 

Chen, S., Liu, W., Qin, J., Chen, L., Bin, G., Zhou, Y., 
Wang, T., & Huang, B. (2017). Research progress of 
computer-aided diagnosis in cancer based on deep 
learning and medical imaging. Sheng Wu Yi Xue 
Gong Cheng Xue Za Zhi, 34(2), 314-319. 

Choi, H. S., Haynor, D. R., & Kim, Y. (1991). Partial 
volume tissue classification of multichannel magnetic
resonance images - a mixel model. IEEE
Transactions on Medical Imaging, 10(3), 395-407. 

Cimino, J. J. (1996). Review paper: Coding systems in 
health care. Methods of Information in Medicine, 
35(4-5), 273-284. 

Clarysse, P., Friboulet, D., & Magnin, I. E. (1997). 
Tracking geometrical descriptors on 3-D deform- 
able surfaces: Application to the left-ventricular 
surface of the heart. IEEE Transactions on Medical 
Imaging, 16(4), 392-404.

Cohen, J. D. (2001). FisWidgets. 2001, from
http://neurocog.lrdc.pitt.edu/fiswidgets/

Collins, D. L., Neelin, P., Peters, T. M., & Evans, A. C.
(1994). Automatic 3-D intersubject registration of
MR volumetric data in standardized Talairach
space. Journal of Computer Assisted Tomography,
18(2), 192-205.

Collins, D. L., Holmes, D. J., Peters, T. M., & Evans, 
A. C. (1995). Automatic 3-D model-based neuro- 
anatomical segmentation. Human Brain Mapping, 3, 
190-208. 

Comaniciu, D., & Meer, P. (2002). Mean shift: A robust 
approach toward feature space analysis. IEEE
Transactions on Pattern Analysis and Machine 
Intelligence, 24(5), 603-619. 

Corina, D. P., Poliakov, A. V., Steury, K., Martin, R. F., 
Brinkley, J. F., Mulligan, K. A., & Ojemann, G. A. 
(2000). Correspondences between language cortex 
identified by cortical stimulation mapping and 
fMRI. Neuroimage (Human Brain Mapping Annual 
Meeting, June 12-16), 11(5), S295. 

Cox, R. W. (1996). AFNI: Software for analysis and 
visualization of functional magnetic resonance neu- 
roimages. Computers and Biomedical Research, 29, 
162-173. 

Dalal, N., & Triggs, B. (2005). Histograms of oriented 
gradients for human detection. 2005 IEEE Computer 
Society Conference on Computer Vision And Pattern 
Recognition (CVPR'05). 

Dale, A. M., Fischl, B., & Sereno, M. I. (1999). Cortical 
surface-based analysis. I. Segmentation and surface 
reconstruction. NeuroImage, 9(2), 179-194.

Dameron, O., Roques, E., Rubin, D. L., Marquet, G., & 
Burgun, A. (2006). Grading lung tumors using 
OWL-DL based reasoning. 9th International 
Protégé Conference. Stanford, CA. 

Datta, R., Joshi, D., Li, J., & Wang, J. Z. (2008). Image 
retrieval: Ideas, influences, and trends of the new 
age. ACM Computing Surveys, 40(2). 

Davatzikos, C., & Bryan, R. N. (1996). Using a deform- 
able surface model to obtain a shape representation 
of the cortex. IEEE Transactions on Medical
Imaging, 15(6), 785-795. 

de Figueiredo, E. H., Borgonovi, A. F., & Doring, T. M. 
(2011). Basic concepts of MR imaging, diffusion 
MR imaging, and diffusion tensor imaging. 
Magnetic Resonance Imaging Clinics of North 
America, 19(1), 1-22. 

de Sisternes, L., Simon, N., Tibshirani, R., Leng, T., & 
Rubin, D. L. (2014). Quantitative SD-OCT imaging 
biomarkers as indicators of age-related macular 
degeneration progression. Investigative 
Ophthalmology & Visual Science, 55(11), 7093-7103. 

de Sisternes, L., Jonna, G., Greven, M. A., Chen, Q., 
Leng, T., & Rubin, D. L. (2017a). Individual Drusen 
segmentation and repeatability and reproducibility 
of their automated quantification in optical coher- 
ence tomography images. Translational Vision 
Science & Technology, 6(1), 12. 

de Sisternes, L., Jonna, G., Moss, J., Marmor, M. F., 
Leng, T., & Rubin, D. L. (2017b). Automated intra- 
retinal segmentation of SD-OCT images in normal 
and age-related macular degeneration eyes. 
Biomedical Optics Express, 8(3), 1926-1949. 

Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). 
Maximum likelihood from incomplete data via the 
EM algorithm. Journal of the Royal Statistical 
Society Series B, 39, 1-38. 

Deng, J., Dong, W., Socher, R., Li, L., Kai, L., & Li, 
F.-F. (2009). ImageNet: A large-scale hierarchical 
image database. 2009 IEEE Conference on Computer 
Vision and Pattern Recognition. 

Depeursinge, A., Foncubierta-Rodriguez, A., Van De 
Ville, D., & Muller, H. (2014). Three-dimensional 
solid texture analysis in biomedical imaging: Review 
and opportunities. Medical Image Analysis, 18(1), 
176-196. 

Deselaers, T., Hegerath, A., Keysers, D., & Ney, H. 
(2006). Sparse patch-histograms for object classifi- 
cation in cluttered images. In: DAGM 2006, Pattern 
Recognition, 27th DAGM Symposium, Lecture Notes 
in Computer Science, pp. 202-211. 

Deselaers, T., Muller, H., Clough, P., Ney, H., & 
Lehmann, T. M. (2007). The CLEF 2005 automatic 
medical image annotation task. International 
Journal of Computer Vision, 74(1), 51-58. 

Deserno, T. M., Antani, S., & Long, R. (2009). Ontology 
of gaps in content-based image retrieval. Journal of 
Digital Imaging, 22(2), 202-215. 

Deshpande, N., Needles, A., & Willmann, J. K. (2010). 
Molecular ultrasound imaging: Current status and 
future directions. Clinical Radiology, 65(7), 567- 
581. 

Dhenain, M., Ruffins, S. W., & Jacobs, R. E. (2001). 
Three-dimensional digital mouse atlas using high- 
resolution MRI. Developmental Biology, 232(2), 
458-470. 

DICOM Standards Committee. (2017). DICOM PS3.21 
2017e — Transformations between DICOM and 
other representations; A.6 AIM v4 to DICOM TID 
1500 mapping. From http://dicom.nema.org/medical/Dicom/2017e/output/chtml/part21/sect_A.6.html 

DICOM Standards Committee — Working Group 8 — 
Structured Reporting. (2017). Digital Imaging and 
Communications in Medicine (DICOM); sup 200 — 
Transformation of NCI annotation and image 
markup (AIM) and DICOM SR measurement templates. 
From ftp://medical.nema.org/medical/dicom/Supps/LB/sup200_lb_AIM_DICOMSRTID1500.pdf 

Diepgen, T. L., & Eysenbach, G. (1998). Digital images 
in dermatology and the dermatology online atlas on 
the world wide web. The Journal of Dermatology, 
25(12), 782-787. 

Doi, K. (2007). Computer-aided diagnosis in medical 
imaging: Historical review, current status and future 
potential. Computerized Medical Imaging and 
Graphics, 31(4-5), 198-211. 

Donovan, T., & Manning, D. J. (2007). The radiology 
task: Bayesian theory and perception. The British 
Journal of Radiology, 80(954), 389-391. 

D'Orsi, C. J., & Newell, M. S. (2007). BI-RADS decoded: 
Detailed guidance on potentially confusing issues. 
Radiologic Clinics of North America, 45(5), 751-763. 

Drude, N., Tienken, L., & Mottaghy, F. M. (2017). 
Theranostic and nanotheranostic probes in nuclear 
medicine. Methods, 130, 14-22. 




Drury, H. A., & Van Essen, D. C. (1997). Analysis of 
functional specialization in human cerebral cortex 
using the visible man surface based atlas. Human 
Brain Mapping, 5, 233-237. 

Duda, R. O., Hart, P. E., & Stork, D. G. (2001). Pattern 
classification. New York: Wiley. 

Dugas-Phocion, G., Ballester, M. A. G., Malandain, G., 
Lebrun, C., & Ayache, N. (2004). Improved 
EM-based tissue segmentation and partial volume 
effect quantification in multi-sequence brain MRI. 
Medical Image Computing and Computer-Assisted 
Intervention - MICCAI 2004, Pt 1, Proceedings, 3216, 
26-33. 

Durot, I., Wilson, S. R., & Willmann, J. K. (2018). 
Contrast-enhanced ultrasound of malignant liver 
lesions. Abdominal Radiology (NY), 43(4), 819-847. 

Ehteshami Bejnordi, B., Veta, M., Johannes van Diest, P., 
van Ginneken, B., Karssemeijer, N., Litjens, G., van 
der Laak, J., The, C. C., Hermsen, M., Manson, Q. F., 
Balkenhol, M., Geessink, O., Stathonikos, N., van 
Dijk, M. C., Bult, P., Beca, F., Beck, A. H., Wang, D., 
Khosla, A., Gargeya, R., Irshad, H., Zhong, A., Dou, 
Q., Li, Q., Chen, H., Lin, H. J., Heng, P. A., Hass, C., 
Bruni, E., Wong, Q., Halici, U., Oner, M. U., Cetin- 
Atalay, R., Berseth, M., Khvatkov, V., Vylegzhanin, 
A., Kraus, O., Shaban, M., Rajpoot, N., Awan, R., 
Sirinukunwattana, K., Qaiser, T., Tsang, Y. W., Tellez, 
D., Annuscheit, J., Hufnagl, P., Valkonen, M., 
Kartasalo, K., Latonen, L., Ruusuvuori, P., 
Liimatainen, K., Albarqouni, S., Mungal, B., George, 
A., Demirci, S., Navab, N., Watanabe, S., Seno, S., 
Takenaka, Y., Matsuda, H., Phoulady, H. A., Kovalev, 
V., Kalinovsky, A., Liauchuk, V., Bueno, G., 
Fernandez-Carrobles, M. M., Serrano, I., Deniz, O., 
Racoceanu, D., & Venancio, R. (2017). Diagnostic 
assessment of deep learning algorithms for detection 
of lymph node metastases in women with breast can- 
cer. JAMA, 318(22), 2199-2210. 

Endo, M., Aramaki, T., Asakura, K., Moriguchi, M., 
Akimaru, M., Osawa, A., Hisanaga, R., Moriya, Y., 
Shimura, K., Furukawa, H., & Yamaguchi, K. 
(2012). Content-based image-retrieval system in 
chest computed tomography for a solitary pulmo- 
nary nodule: Method and preliminary experiments. 
International Journal of Computer Assisted 
Radiology and Surgery, 7(2), 331-338. 

Esteva, A., Kuprel, B., Novoa, R. A., Ko, J., Swetter, 
S. M., Blau, H. M., & Thrun, S. (2017). 
Dermatologist-level classification of skin cancer with 
deep neural networks. Nature, 542(7639), 115-118. 

Eysenbach, G., Bauer, J., Sager, A., Bittorf, A., Simon, 
M., & Diepgen, T. (1998). An international derma- 
tological image atlas on the WWW: Practical use for 
undergraduate and continuing medical education, 
patient education and epidemiological research. 
Studies in Health Technology and Informatics, 52 (Pt 
2), 788-792. 

Faruque, J., Rubin, D. L., Beaulieu, C. F., & Napel, S. 
(2013). Modeling perceptual similarity measures in 
CT images of focal liver lesions. Journal of Digital 
Imaging, 26(4), 714-720. 




Faruque, J., Beaulieu, C. F., Rosenberg, J., Rubin, D. L., 
Yao, D., & Napel, S. (2015). Content-based image 
retrieval in radiology: Analysis of variability in 
human perception of similarity. Journal of Medical 
Imaging (Bellingham), 2(2), 025501. 

Fave, X., Zhang, L., Yang, J., Mackin, D., Balter, P., 
Gomez, D., Followill, D., Jones, A. K., Stingo, F., 
Liao, Z., Mohan, R., & Court, L. (2017). Delta- 
radiomics features for the prediction of patient out- 
comes in non-small cell lung cancer. Scientific 
Reports, 7(1), 588. 

Federative Committee on Anatomical Terminology 
(1998). Terminologia Anatomica. Stuttgart, Thieme. 

Fedorov, A. (2012). 3D Slicer Annotation Image 
Markup. Accessed 11/11/2012, from 
http://www.na-mic.org/Wiki/index.php/Projects:QIN:3D_Slicer_Annotation_Image_Markup 

Fedorov, A., Beichel, R., Kalpathy-Cramer, J., Finet, J., 
Fillion-Robin, J. C., Pujol, S., Bauer, C., Jennings, 
D., Fennessy, F., Sonka, M., Buatti, J., Aylward, S., 
Miller, J. V., Pieper, S., & Kikinis, R. (2012). 3D 
Slicer as an image computing platform for the quan- 
titative imaging network. Magnetic Resonance 
Imaging, 30(9), 1323-1341. 

Fei-Fei, L., & Perona, P. (2003). A Bayesian hierarchical 
model for learning natural scene categories. Proc. of 
IEEE Computer Vision and Pattern Recognition, 
pp. 524-531. 

Ferrante, E., & Paragios, N. (2017). Slice-to-volume 
medical image registration: A survey. Medical Image 
Analysis, 39, 101-123. 

Fiala, J. C., & Harris, K. M. (2001). Extending unbiased 
stereology of brain ultrastructure to three- 
dimensional volumes. Journal of the American 
Medical Informatics Association, 8(1), 1-16. 

Figurska, M., Robaszkiewicz, J., & Wierzbowska, J. 
(2010). Optical coherence tomography in imaging 
of macular diseases. Klinika Oczna, 112(4-6), 
138-146. 

Firmino, M., Morais, A. H., Mendoca, R. M., Dantas, 
M. R., Hekis, H. R., & Valentim, R. (2014). 
Computer-aided detection system for lung cancer in 
computed tomography scans: Review and future 
prospects. Biomedical Engineering Online, 13, 41. 

FMRIB Image Analysis Group. (2001). FSL - The 
FMRIB Software Library. From 
http://www.fmrib.ox.ac.uk/fsl/index.html 

Fougerousse, F., Bullen, P., Herasse, M., Lindsay, S., 
Richard, I., Wilson, D., Suel, L., Durand, M., Robson, 
S., Abitbol, M., Beckmann, J. S., & Strachan, T. (2000). 
Human-mouse differences in the embryonic expression 
of developmental control genes and disease genes. 
Human Molecular Genetics, 9(2), 165-173. 

Fox, P. T. (Ed.). (2001). Human brain mapping. 
New York: John Wiley & Sons. 

Frackowiak, R. S. J., Friston, K. J., Frith, C. D., Dolan, 
R. J., & Mazziotta, J. C. (Eds.). (1997). Human brain 
function. Academic Press: New York. 

Freton, A., & Finger, P. T. (2011). Spectral domain- 
optical coherence tomography analysis of choroidal 
osteoma. The British Journal of Ophthalmology. 


Friefeld, O., Greenspan, H., & Jacob, G. (2009). 
Multiple sclerosis lesion detection using constrained 
GMM and curve evolution. Journal of Biomedical 
Imaging, 2009, 1-13. 

Friston, K. J., Holmes, A. P., Worsley, K. J., Poline, J. P., 
Frith, C. D., & Frackowiak, R. S. J. (1995). Statistical 
parametric maps in functional imaging: A general 
linear approach. Human Brain Mapping, 2, 189-210. 

Gabril, M. Y., & Yousef, G. M. (2010). Informatics for 
practicing anatomical pathologists: Marking a new 
era in pathology practice. Modern Pathology, 23(3), 
349-358. 

Gastounioti, A., Oustimov, A., Hsieh, M. K., Pantalone, 
L., Conant, E. F., & Kontos, D. (2018). Using con- 
volutional neural networks for enhanced capture of 
breast parenchymal complexity patterns associated 
with breast cancer risk. Academic Radiology, 25(8), 
977-984. 

George, J. S., Aine, C. J., Mosher, J. C., Schmidt, D. M., 
Ranken, D. M., Schlitz, H. A., Wood, C. C., Lewine, 
J. D., Sanders, J. A., & Belliveau, J. W. (1995). 
Mapping function in human brain with magnetoen- 
cephalography, anatomical magnetic resonance 
imaging, and functional magnetic resonance imag- 
ing. Journal of Clinical Neurophysiology, 12(5), 406- 
431. 

Gerstner, E. R., & Sorensen, A. G. (2011). Diffusion and 
diffusion tensor imaging in brain cancer. Seminars 
in Radiation Oncology, 21(2), 141-146. 

Gevaert, O., Xu, J., Hoang, C., Leung, A., Quon, A., 
Rubin, D. L., Napel, S., & Plevritis, S. (2011). 
Integrating medical images and transcriptomic data 
in non-small cell lung cancer. AACR 102nd annual 
meeting. Orlando, FL. 

Gevaert, O., Mitchell, L. A., Xu, J., Yu, C., Rubin, D., 
Zaharchuk, G., Napel, S., & Plevritis, S. (2012a). 
Radiogenomic analysis indicates MR images are 
potentially predictive of EGFR mutation status in 
glioblastoma multiforme. AACR 103rd annual 
meeting. Chicago, IL. 

Gevaert, O., Hoang, C. D., Leung, A. N., Xu, J., Quon, 
A., Rubin, D. L., Napel, S., & Plevritis, S. (2012b). 
Non-small cell lung cancer: identifying prognostic 
imaging biomarkers by leveraging public gene 
expression microarray data—methods and prelimi- 
nary results. Radiology, 264:387-396. 
PMID:22723499. PMCID:PMC3401348. 

Giger, M. L. (2018). Machine learning in medical imag- 
ing. Journal of the American College of Radiology, 
15(3 Pt B), 512-520. 

Giger, M., & MacMahon, H. (1996). Image processing 
and computer-aided diagnosis. The Radiologic 
Clinics of North America, 34(3), 565-596. 

Gimenez, F., Xu, J., Liu, T. T., Beaulieu, C., Rubin, 
D. L., Napel, S., & Liu, Y. (2011a). Prediction of 
radiologist observations using computational image 
features: Method and preliminary results. 
Ninety-seventh annual scientific meeting of the RSNA, 
Chicago, IL. 

Gimenez, F., Xu, J., Liu, Y., Liu, T. T., Beaulieu, C., 
Rubin, D. L., & Napel, S. (2011b). On the Feasibility 
of Predicting Radiological Observations from 
Computational Imaging Features of Liver Lesions 
in CT scans. First IEEE Conference on Healthcare 
Informatics, Imaging, and Systems Biology (HISB), 
IEEE Computer Society, San Jose, CA. 

Goldberg, S. N., Grassi, C. J., Cardella, J. F., 
Charboneau, J. W., Dodd, G. D., 3rd, Dupuy, D. E., 
Gervais, D. A., Gillams, A. R., Kane, R. A., Lee, 
F. T., Jr., Livraghi, T., McGahan, J., Phillips, D. A., 
Rhim, H., Silverman, S. G., Solbiati, L., Vogl, T. J., 
Wood, B. J., Vedantham, S., & Sacks, D. (2009). 
Image-guided tumor ablation: Standardization of 
terminology and reporting criteria. Journal of 
Vascular and Interventional Radiology, 20(7 Suppl), 
S377-S390. 

Gombas, P., Skepper, J. N., Krenacs, T., Molnar, B., & 
Hegyi, L. (2004). Past, present and future of digital 
pathology. Orvosi Hetilap, 145(8), 433-443. 

Gonzalez, R. C., Woods, R. E., & Eddins, S. L. (2009). 
Digital image processing using MATLAB. S.l., 
Gatesmark Publishing. 

Grau, B., Horrocks, I., Motik, B., Parsia, B., 
Patel-Schneider, P., & Sattler, U. (2008). Chapter 3: 
Description logics. In B. Porter, V. Lifschitz, & 
F. Van Harmelen (Eds.), Handbook of knowledge 
representation (p. 1005). Amsterdam/Boston: 
Elsevier: xxviii. 

Greenspan, H., & Pinhas, A. T. (2007). Medical image 
categorization and retrieval for PACS using the 
GMM-KL framework. IEEE Transactions on 
Information Technology in Biomedicine, 11(2), 190- 
202. 

Greenspan, H., Ruf, A., & Goldberger, J. (2006). 
Constrained Gaussian mixture model framework 
for automatic segmentation of MR brain images. 
IEEE Trans Med Imaging, 25(9): 1233-1245. 

Greenspan, H., Avni, U., Konen, E., Sharon, M., & 
Goldberger, J. (2011). X-ray categorization and 
retrieval on the organ and pathology level, using 
patch-based visual words. IEEE Transactions on 
Medical Imaging, 30(3), 733-746. 

Greenspan, H., Ginneken, B. V., & Summers, R. M. 
(2016). Guest editorial deep learning in medical 
imaging: Overview and future promise of an excit- 
ing new technique. IEEE Transactions on Medical 
Imaging, 35(5), 1153-1159. 

Greenspan, H., Estepar, R. S., Niessen, W. J., Siegel, E., 
& Nielsen, M. (2020). Position paper on COVID-19 
imaging and AI: From the clinical needs and tech- 
nological challenges to initial AI solutions at the lab 
and national level towards a new era for AI in 
healthcare. Medical Image Analysis, 66. 

Grossmann, P., Stringfield, O., El-Hachem, N., Bui, 
M. M., Rios Velazquez, E., Parmar, C., Leijenaar, 
R. T., Haibe-Kains, B., Lambin, P., Gillies, R. J., & 
Aerts, H. J. (2017). Defining the biological basis of 
radiomic phenotypes in lung cancer. eLife, 6. 

Hansell, D. M., Bankier, A. A., MacMahon, H., 
McLoud, T. C., Muller, N. L., & Remy, J. (2008). 
Fleischner society: Glossary of terms for thoracic 
imaging. Radiology, 246(3), 697-722. 




Hansen, L. K., Nielsen, F. A., Toft, P., Liptrot, M. G., 
Goutte, C., Strother, S. C., Lange, N., Gade, A., 
Rottenberg, D. A., & Paulson, O. B. (1999). 
Lyngby - Modeler's Matlab toolbox for spatio- 
temporal analysis of functional neuroimages. 
NeuroImage, 9(6), S241. 

Haralick, R. M. (1988). Mathematical morphology. 
University of Washington. 

Haralick, R. M., & Shapiro, L. G. (1992). Computer and 
robot vision. Reading: Addison-Wesley. 

Haralick, R. M., Shanmugam, K., & Dinstein, I. (1973). 
Textural features for image classification. IEEE 
Transactions on Systems, Man, and Cybernetics, 
SMC-3(6), 610-621. 

Harney, A. S., & Meade, T. J. (2010). Molecular imaging 
of in vivo gene expression. Future Medicinal 
Chemistry, 2(3), 503-519. 

Hasan, K. M., Walimuni, I. S., Abid, H., & Hahn, K. R. 
(2010). A review of diffusion tensor magnetic reso- 
nance imaging computational methods and soft- 
ware tools. Computers in Biology and Medicine. 

Heiss, W. D., & Phelps, M. E. (Eds.). (1983). Positron 
emission tomography of the brain. Berlin/New York: 
Springer. 

Held, K., Rota Kops, E., Krause, B. J., Wells, W. M., 3rd, 
Kikinis, R., & Muller-Gartner, H. W. (1997). 
Markov random field segmentation of brain MR 
images. IEEE Transactions on Medical Imaging, 
16(6), 878-886. 

Hersh, W., Muller, H., & Kalpathy-Cramer, J. (2009). 
The ImageCLEFmed medical image retrieval task 
test collection. Journal of Digital Imaging, 22(6), 
648-655. 

Hinshaw, K. P., Poliakov, A. V., Martin, R. F., Moore, 
E. B., Shapiro, L. G., & Brinkley, J. F. (2002). 
Shape-based cortical surface segmentation for visu- 
alization brain mapping. NeuroImage, 16(2), 295-316. 

Hoang, C., Napel, S., Gevaert, O., Xu, J., Rubin, D. L., 
Leung, A., Merritt, R., Whyte, R., Shrager, J., & 
Plevritis, S. (2011). NSCLC gene profiles correlate 
with specific CT characteristics: Image-omics. 
Philadelphia: American Association for Thoracic 
Surgery (AATS). 

Hoffman, J. M., & Gambhir, S. S. (2007). Molecular 
imaging: The vision and opportunity for radiology 
in the future. Radiology, 244(1), 39-47. 

Hohne, K., Bomans, M., Pommert, A., Riemer, M., 
Schiers, C., Tiede, U., & Wiebecke, G. (1990). 3-D 
visualization of tomographic volume data using the 
generalized voxel model. The Visual Computer, 6(1), 
28-36. 

Hohne, K. H., Bomans, M., Riemer, M., Schubert, R., 
Tiede, U., & Lierse, W. (1992). A volume-based ana- 
tomical atlas. IEEE Computer Graphics and 
Applications, 72-78. 

Hohne, K. H., Pflesser, B., Riemer, M., Schiemann, T., 
Schubert, R., & Tiede, U. (1995). A new representa- 
tion of knowledge concerning human anatomy and 
function. Nature Medicine, 1(6), 
506-510. 




Hoo-Chang, S., Roth, H. R., Gao, M., Lu, L., Xu, Z., 
Nogues, I., Yao, J., Mollura, D., & Summers, 
R. M. (2016). Deep convolutional neural net- 
works for computer-aided detection: CNN archi- 
tectures, dataset characteristics and transfer 
learning. IEEE Transactions on Medical Imaging, 
35(5), 1285-1298. 

Hoogi, A., Subramaniam, A., Veerapaneni, R., & 
Rubin, D. L. (2017). Adaptive estimation of active 
contour parameters using convolutional neural net- 
works and texture analysis. IEEE Transactions on 
Medical Imaging, 36(3), 781-791. 

Hosny, A., Parmar, C., Quackenbush, J., Schwartz, 
L. H., & Aerts, H. (2018). Artificial intelligence in 
radiology. Nature Reviews. Cancer, 18(8), 500-510. 

Horii, S. C. (1996). Image acquisition: Sites, technologies, 
and approaches. Radiologic Clinics of North America, 
34, 469-494. PMID:8657867. 

Hu, Z., Abramoff, M. D., Kwon, Y. H., Lee, K., & 
Garvin, M. K. (2010a). Automated segmentation of 
neural canal opening and optic cup in 3D spectral 
optical coherence tomography volumes of the optic 
nerve head. Investigative Ophthalmology & Visual 
Science, 51(11), 5708-5717. 

Hu, Z., Niemeijer, M., Abramoff, M. D., Lee, K., & 
Garvin, M. K. (2010b). Automated segmentation of 
3-D spectral OCT retinal blood vessels by neural 
canal opening false positive suppression. Medical 
Image Computing and Computer-Assisted 
Intervention, 13(Pt 3), 33-40. 

Huang, Y. Q., Liang, C. H., He, L., Tian, J., Liang, C. S., 
Chen, X., Ma, Z. L., & Liu, Z. Y. (2016). 
Development and validation of a radiomics nomo- 
gram for preoperative prediction of lymph node 
metastasis in colorectal cancer. Journal of Clinical 
Oncology, 34(18), 2157-2164. 

Hudson, D. L., & Cohen, M. E. (2009). Multidimensional 
medical decision making. Conference Proceedings: 
Annual International Conference of the IEEE 
Engineering in Medicine and Biology Society, 1, 
3405-3408. 

International Anatomical Nomenclature Committee. 
(1989). Nomina Anatomica. Edinburgh: Churchill 
Livingstone. 

Ishioka, J., Matsuoka, Y., Uehara, S., Yasuda, Y., 
Kijima, T., Yoshida, S., Yokoyama, M., Saito, K., 
Kihara, K., Numao, N., Kimura, T., Kudo, K., 
Kumazawa, I., & Fujii, Y. (2018). Computer-aided 
diagnosis of prostate cancer on magnetic reso- 
nance imaging using a convolutional neural net- 
work algorithm. BJU International, 122(3), 
411-417. 

Jiang, Y.-G., Ngo, C.-W., & Yang, J. (2007). Towards opti- 
mal bag-of-features for object categorization and 
semantic video retrieval. Proceedings of the 6th ACM 
international conference on Image and video retrieval. 
Amsterdam, The Netherlands, ACM, pp. 
494-501. 

Johnson, K. A., & Becker, J. A. (2001). The Whole 
Brain Atlas. 2001, from 
http://www.med.harvard.edu/AANLIB/home.html 


Jokerst, J. V., & Gambhir, S. S. (2011). Molecular imag- 
ing with theranostic nanoparticles. Accounts of 
Chemical Research, 44(10), 1050-1060. 

Jun, W., Xia, L., Di, D., Jiangdian, S., Min, X., Yali, Z., 
& Jie, T. (2016). Prediction of malignant and benign 
of lung tumor using a quantitative radiomic 
method. Conference Proceedings: Annual 
International Conference of the IEEE Engineering in 
Medicine and Biology Society, 2016, 1272-1275. 

Jurie, F., & Triggs, B. (2005). Creating efficient code- 
books for visual recognition. Proceedings of the 
Tenth IEEE International Conference on Computer 
Vision (ICCV'05), Volume 1, IEEE 
Computer Society, pp. 604-610. 

Kahn, C. E., Jr., Langlotz, C. P., Burnside, E. S., 
Carrino, J. A., Channin, D. S., Hovsepian, D. M., & 
Rubin, D. L. (2009). Toward best practices in radiol- 
ogy reporting. Radiology, 252(3), 852-856. 

Kahn, C. E., & Rubin, D. L. (2009). Automated seman- 
tic indexing of figure captions to improve radiology 
image retrieval. Journal of the American Medical 
Informatics Association, 16(3), 380-386. 

Kang, J. H., & Chung, J. K. (2008). Molecular-genetic 
imaging based on reporter gene expression. Journal 
of Nuclear Medicine, 49(Suppl 2), 164S-179S. 

Kapur, T., Grimson, W. E., Wells, W. M., 3rd, & Kikinis, 
R. (1996). Segmentation of brain tissue from mag- 
netic resonance images. Medical Image Analysis, 
1(2), 109-127. 

Kass, M., Witkin, A., & Terzopoulos, D. (1987). Snakes: 
Active contour models. International Journal of 
Computer Vision, 1(4), 321-331. 

Katragadda, C., Finnane, A., Soyer, H. P., Marghoob, 
A. A., Halpern, A., Malvehy, J., Kittler, H., 
Hofmann-Wellenhof, R., Da Silva, D., Abraham, I., 
Curiel-Lewandrowski, C., & G. International 
Society of Digital Imaging of the Skin -International 
Skin Imaging Collaboration. (2016). Technique 
standards for skin lesion imaging: A Delphi consen- 
sus statement. JAMA Dermatology. 

Kayalibay, B., Jensen, G., & van der Smagt, P. (2017). 
CNN-based segmentation of medical imaging data. 
arXiv:1701.03056 [cs]. 

Kennedy, D. (2001). Internet brain segmentation repository. 
2001, from http://neuro-www.mgh.harvard.edu/cma/ibsr 

Kevles, B. (1997). Naked to the bone: Medical imaging in 
the twentieth century. New Brunswick: Rutgers 
University Press. 

Kimberg, D. Y., & Aguirre, G. K. (2002). A flexible 
architecture for neuroimaging data analysis and presentation. 
From http://www.nimh.nih.gov/neuroinformatics/kimberg.cfm 

King, W., Proffitt, J., Morrison, L., Piper, J., Lane, D., & 
Seelig, S. (2000). The role of fluorescence in situ 
hybridization technologies in molecular diagnostics 
and disease management. Molecular Diagnosis, 5(4), 
309-319. 

Klinger, C. (2010). AIM on ClearCanvas Workstation 
Documentation. Accessed 11/11/2012, from 
https://wiki.nci.nih.gov/display/AIM/AIM+on+ClearCanvas+Workstation+Documentation 

Kontos, D., Summers, R. M., & Giger, M. (2017). 
Special section guest editorial: Radiomics and deep 
learning. Journal of Medical Imaging (Bellingham), 
4(4), 041301. 

Korner, M., Weber, C. H., Wirth, S., Pfeifer, K. J., Reiser, 
M. F., & Treitl, M. (2007). Advances in digital radi- 
ography: Physical principles and system overview. 
Radiographics, 27(3), 675-686. 

Koslow, S. H., & Huerta, M. F. (Eds.). (1997). 
Neuroinformatics: An overview of the human Brain 
project. Mahwah: Lawrence Erlbaum. 

Kremkau, F. W. (2006). Diagnostic ultrasound principles 
and instruments. St. Louis: Saunders Elsevier. 

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). 
ImageNet classification with deep convolutional 
neural networks. Advances in Neural Information 
Processing Systems 25, Curran Associates, Inc., 
pp. 1097-1105. 

Kulikowski, C. A. (1997). Medical imaging informatics: 
Challenges of definition and integration. Journal of 
the American Medical Informatics Association, 4(3), 
252-253. 

Kumar, V., Gu, Y., Basu, S., Berglund, A., Eschrich, 
S. A., Schabath, M. B., Forster, K., Aerts, H. J., 
Dekker, A., Fenstermacher, D., Goldgof, D. B., 
Hall, L. O., Lambin, P., Balagurunathan, Y., 
Gatenby, R. A., & Gillies, R. J. (2012). Radiomics: 
The process and the challenges. Magnetic Resonance 
Imaging, 30(9), 1234-1248. 

Kumar, A., Kim, J., Cai, W., Fulham, M., & Feng, D. 
(2013). Content-based medical image retrieval: A 
survey of applications to multidimensional and 
multimodality data. Journal of Digital Imaging, 
26(6), 1025-1039. 

Kumar, A., Gupta, P. K., & Srivastava, A. (2020). A 
review of modern technologies for tackling COVID-19 
pandemic. Diabetes and Metabolic Syndrome: Clinical 
Research and Reviews, 14(4), 569-573. 

Lambin, P., Rios-Velazquez, E., Leijenaar, R., Carvalho, 
S., van Stiphout, R. G., Granton, P., Zegers, C. M., 
Gillies, R., Boellard, R., Dekker, A., & Aerts, H. J. 
(2012). Radiomics: Extracting more information 
from medical images using advanced feature analysis. 
European Journal of Cancer, 48(4), 441-446. 

Langlotz, C. P. (2006). RadLex: A new method for 
indexing online educational materials. 
Radiographics, 26(6), 1595-1597. 

Langlotz, C. P. (2009). Structured radiology reporting: 
Are we there yet? Radiology, 253(1), 23-25. 

Larabell, C. A., & Nugent, K. A. (2010). Imaging cellu- 
lar architecture with X-rays. Current Opinion in 
Structural Biology, 20(5), 623-631. 

Le Bihan, D., Mangin, J. F., Poupon, C., Clark, C. A., 
Pappata, S., Molko, N., & Chabriat, H. (2001). 
Diffusion tensor imaging: Concepts and applica- 
tions. Journal of Magnetic Resonance Imaging, 
13(4), 534-546. 

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learn- 
ing. Nature, 521(7553), 436-444. 




Ledley, R. S., & Lusted, L. B. (1991). Reasoning founda- 
tions of medical diagnosis. MD Computing, 8(5), 
300-315. 

Lee, D. H. (2003). Magnetic resonance angiography. 
Advances in Neurology, 92, 43-52. 

Lee, J. K. T. (2006). Computed body tomography with 
MRI correlation. Philadelphia: Lippincott Williams 
& Wilkins. 

Lee, Y., Kim, N., Cho, K. S., Kang, S. H., Kim, D. Y., 
Jung, Y. Y., & Kim, J. K. (2009). Bayesian classifier 
for predicting malignant renal cysts on MDCT: 
Early clinical experience. AJR. American Journal of 
Roentgenology, 193(2), W106-W111. 

Lee, J. H., Baek, J. H., Kim, J. H., Shim, W. H., Chung, 
S. R., Choi, Y. J., & Lee, J. H. (2018). Deep learning- 
based computer-aided diagnosis system for localiza- 
tion and diagnosis of metastatic lymph nodes on 
ultrasound: A pilot study. Thyroid, 28(10), 1332- 
1338. 

Lehmann, T. M., Guld, M. O., Thies, C., Fischer, B., 
Spitzer, K., Keysers, D., Ney, H., Kohnen, M., 
Schubert, H., & Wein, B. B. (2004). Content-based 
image retrieval in medical applications. Methods of 
Information in Medicine, 43(4), 354-361. 

Leong, F. J., & Leong, A. S. (2003). Digital imaging 
applications in anatomic pathology. Advances in 
Anatomic Pathology, 10(2), 88-95. 

Levy, M. A., & Rubin, D. L. (2008). Tool support to 
enable evaluation of the clinical response to treat- 
ment. American Medical Informatics Association 
Annual Symposium Proceedings, 399-403. 

Levy, M. A., & Rubin, D. L. (2011a). Computational 
approaches to assist in the evaluation of cancer 
treatment response. Imaging in Medicine, 3(2), 233- 
246. 

Levy, M. A., & Rubin, D. L. (2011b). Current and future 
trends in imaging informatics for oncology. Cancer 
Journal, 17(4), 203-210. 

Levy, M. A., O'Connor, M. J., & Rubin, D. L. (2009). 
Semantic reasoning with image annotations for 
tumor assessment. American Medical Informatics 
Association Annual Symposium Proceedings, 2009, 
359-363. 

Levy, M. A., Freymann, J. B., Kirby, J. S., Fedorov, A., 
Fennessy, F. M., Eschrich, S. A., Berglund, A. E., 
Fenstermacher, D. A., Tan, Y., Guo, X., Casavant, 
T. L., Brown, B. J., Braun, T. A., Dekker, A., 
Roelofs, E., Mountz, J. M., Boada, F., Laymon, C., 
Oborski, M., & Rubin, D. L. (2012). Informatics 
methods to enable sharing of quantitative imaging 
research data. Magnetic Resonance Imaging, 30(9), 
1249-1256. 

Lexe, G., Monaco, J., Doyle, S., Basavanhally, A., Reddy, 
A., Seiler, M., Ganesan, S., Bhanot, G., & 
Madabhushi, A. (2009). Towards improved cancer 
diagnosis and prognosis using analysis of gene 
expression data and computer aided imaging. 
Experimental Biology and Medicine (Maywood, 
N.J.), 234(8), 860-879. 

Li, B. N., Chui, C. K., Chang, S., & Ong, S. H. (2011a). 
Integrating spatial fuzzy clustering with level set 
methods for automated medical image segmenta- 
tion. Computers in Biology and Medicine, 41(1), 
1-10. 

Li, C., Huang, R., Ding, Z., Gatenby, J. C., Metaxas, 
D. N., & Gore, J. C. (2011b). A level set method for 
image segmentation in the presence of intensity 
inhomogeneities with application to MRI. IEEE 
Transactions on Image Processing, 20(7), 2007-2016. 

Li, H., Zhu, Y., Burnside, E. S., Huang, E., Drukker, K., 
Hoadley, K. A., Fan, C., Conzen, S. D., Zuley, M., 
Net, J. M., Sutton, E., Whitman, G. J., Morris, E., 
Perou, C. M., Ji, Y., & Giger, M. L. (2016). 
Quantitative MRI radiomics in the prediction of 
molecular classifications of breast cancer subtypes 
in the TCGA/TCIA data set. NPJ Breast Cancer, 2. 

Liang, W., Liang, H., Ou, L., Chen, B., Chen, A., Li, C., 
Li, Y., Guan, W., Sang, L., Lu, J., Xu, Y., Chen, G., 
Guo, H., Guo, J., Chen, Z., Zhao, Y., Li, S., Zhang, 
N., Zhong, N., He, J., & China Medical Treatment 
Expert Group for COVID-19. (2020). Development and valida- 
tion of a clinical risk score to predict the occurrence 
of critical illness in hospitalized patients with 
COVID-19. JAMA Internal Medicine. 

Lichtenbelt, B., Crane, R., & Naqvi, S. (1998). 
Introduction to volume rendering. Prentice Hall: 
Upper Saddle River. 

Liu, Y. I., Kamaya, A., Desser, T. S., & Rubin, D. L. 
(2009). A controlled vocabulary to represent sono- 
graphic features of the thyroid and its application in 
a Bayesian network to predict thyroid nodule malig- 
nancy. Summit on Translat Bioinforma, 2009, 68-72. 

Liu, Y. I., Kamaya, A., Desser, T. S., & Rubin, D. L. 
(2011). A Bayesian network for differentiating 
benign from malignant thyroid nodules using sono- 
graphic and demographic features. AJR. American 
Journal of Roentgenology, 196(5), W598-W605. 

Liu, Z., Jin, C., Wu, C. C., Liang, T., Zhao, H., Wang, 
Y., Wang, Z., Li, F., Zhou, J., Cai, S., Zeng, L., & 
Yang, J. (2020). Association between initial chest 
CT or clinical features and clinical course in patients 
with coronavirus disease 2019 pneumonia. Korean 
Journal of Radiology, 21(6), 736-745. 

Lohmann, P., Kocher, M., Steger, J., & Galldiks, N. 
(2018). Radiomics derived from amino-acid PET 
and conventional MRI in patients with high-grade 
gliomas. The Quarterly Journal of Nuclear Medicine 
and Molecular Imaging, 62(3), 272-280. 

Lowe, D. (1999). Object recognition from local scale invari- 
ant features. Proceedings of the International Conference 
on Computer Vision, Greece, pp. 1150-1157. 

Lowe, H. J., Antipov, I., Hersh, W., & Smith, C. A. 
(1998). Towards knowledge-based retrieval of medi- 
cal images. The role of semantic indexing, image 
content representation and knowledge-based 
retrieval. Proc AMIA Symp, pp. 882-886. 

Luo, Z., Wang, N., Liu, P., Guo, Q., Ran, L., Wang, F., 
Tang, Y., & Li, Q. (2020). Association between chest 
CT features and clinical course of coronavirus dis- 
ease 2019. Respiratory Medicine, 168, 105989. 

Lusted, L. B. (1960). Logical analysis in roentgen diag- 
nosis. Radiology, 74, 178-193. 


MacDonald, D. (1993). Register, McConnell Brain
Imaging Center, Montreal Neurological 
Institute. 

MacDonald, D., Kabani, N., Avis, D., & Evans, A. C. 
(2000). Automated 3-D extraction of inner and 
outer surfaces of cerebral cortex from MRI. 
NeuroImage, 12(3), 340-356.

Margolis, D. J., Hoffman, J. M., Herfkens, R. J., Jeffrey, 
R. B., Quon, A., & Gambhir, S. S. (2007). Molecular 
imaging techniques in body imaging. Radiology, 
245(2), 333-356. 

Marquet, G., Dameron, O., Saikali, S., Mosser, J., &
Burgun, A. (2007). Grading glioma tumors using
OWL-DL and NCI thesaurus. American Medical
Informatics Association Annual Symposium
Proceedings, 508-512.

Marroquin, J. L., Vemuri, B. C., Botello, S., Calderon, 
F., & Fernandez-Bouzas, A. (2002). An accurate and 
efficient bayesian method for automatic segmenta- 
tion of brain MRI. IEEE Transactions on Medical 
Imaging, 21(8), 934-945. 

Martin, R. F., & Bowden, D. M. (2001). Primate Brain 
maps: Structure of the macaque Brain. New York: 
Elsevier Science. 

Martin, R. F., Mejino, J. L. V., Bowden, D. M., Brinkley, 
J. F., & Rosse, C. (2001). Foundational model of
neuroanatomy: implications for the Human Brain 
Project. Proc AMIA Annu Fall Symp. Washington, 
DC, pp. 438-442. 

Marwede, D., Schulz, T., & Kahn, T. (2008). Indexing 
thoracic CT reports using a preliminary version of a 
standardized radiological lexicon (RadLex). Journal 
of Digital Imaging, 21(4), 363-370. 

Massoud, T. F., & Gambhir, S. S. (2003). Molecular 
imaging in living subjects: Seeing fundamental bio- 
logical processes in a new light. Genes and 
Development, 17, 545-580. 

McLachlan, G. J., & Peel, D. (2000). Finite mixture mod- 
els. New York: Wiley. 

Mechouche, A., Golbreich, C., Morandi, X., & Gibaud, 
B. (2008). Ontology-based annotation of brain 
MRI images. AMIA Annu Symp Proc, pp. 460-464. 

Mehta, T. S., Raza, S., & Baum, J. K. (2000). Use of 
Doppler ultrasound in the evaluation of breast car- 
cinoma. Seminars in Ultrasound, CT, and MR, 21(4), 
297-307. 

Mei, X., Lee, H.-C., Diao, K., Huang, M., Lin, B., Liu, C.,
Xie, Z., Ma, Y., Robson, P. M., Chung, M., Bernheim, A.,
Mani, V., Calcagno, C., Li, K., Li, S., Shan, H., Lv, J.,
Zhao, T., Xia, J., Long, Q., Steinberger, S., Jacobi, A.,
Deyer, T., Luksza, M., Liu, F., Little, B. P., Fayad, Z. A.,
& Yang, Y. (2020). Artificial intelligence-enabled
rapid diagnosis of COVID-19 patients. medRxiv,
2020.04.12.20062661.

Milletari, F., Navab, N., & Ahmadi, S. A. (2016). V-Net: 
Fully convolutional neural networks for volumetric 
medical image segmentation. 2016 Fourth 
International Conference on 3D Vision (3DV). 

Min, J. J., & Gambhir, S. S. (2008). Molecular imaging 
of PET reporter gene expression. Handbook of 
Experimental Pharmacology, 185 Pt 2, 277-303. 




MIT Technology Review. (2013). Deep Learning. With 
massive amounts of computational power, machines 
can now recognize objects and translate speech in 
real time. Artificial intelligence is finally getting 
smart. Retrieved 10/5/2018, from https://www. 
technologyreview.com/s/513696/deep-learning/ 

Modayur, B., Prothero, J., Ojemann, G., Maravilla, K., 
& Brinkley, J. F. (1997). Visualization-based mapping
of language function in the brain. NeuroImage,
6, 245-258.

Motik, B., Grau, B. C., Horrocks, I., Parsia, B., Patel- 
Schneider, P., & Sattler, U. (2008). OWL 2: The next 
step for OWL. Journal of Web Semantics, 6(4), 
309-322. 

Motik, B., Shearer, R., & Horrocks, I. (2009). 
Hypertableau reasoning for description logics. 
Journal of Artificial Intelligence Research, 36, 
165-228. 

Muller, H., Michoux, N., Bandon, D., & Geissbuhler, A. 
(2004). A review of content-based image retrieval 
systems in medical applications-clinical benefits and 
future directions. International Journal of Medical 
Informatics, 73(1), 1-23. 

Muramatsu, C. (2018). Overview on subjective similar- 
ity of images for content-based medical image 
retrieval. Radiological Physics and Technology. 

Nagarkar, D. B., Mercan, E., Weaver, D. L., Brunye, 
T. T., Carney, P. A., Rendi, M. H., Beck, A. H., 
Frederick, P. D., Shapiro, L. G., & Elmore, J. G. 
(2016). Region of interest identification and diag- 
nostic agreement in breast pathology. Modern 
Pathology, 29(9), 1004-1011. 

Napel, S. A., Beaulieu, C. F., Rodriguez, C., Cui, J., Xu, 
J., Gupta, A., Korenblum, D., Greenspan, H., Ma, 
Y., & Rubin, D. L. (2010). Automated retrieval of 
CT images of liver lesions on the basis of image 
similarity: Method and preliminary results. 
Radiology, 256(1), 243-252. 

Napel, S., Hoang, C., Xu, J., Gevaert, O., Rubin, D. L., 
Plevritis, S., Xu, Y., Leung, A., & Quon, A. (2011). 
Computational and semantic annotation of CT and 
PET images and integration with genomic assays of 
tumors in non-small cell lung cancer (NSCLC) for 
decision support and discovery: method and preliminary
results. Ninety-seventh annual scientific
meeting of the RSNA, Chicago, IL.

National Cancer Institute. (2012). Annotation and
image markup on clear canvas. Retrieved
11/7/2012, from https://wiki.nci.nih.gov/display/AIM/Annotation+and+Image+Markup+-+AIM

National Library of Medicine. (1999). Medical subject 
headings — Annotated alphabetic list. Bethesda: 
U.S. Department of Health and Human Services, 
Public Health Service. 

Neumann, H., Kiesslich, R., Wallace, M. B., & Neurath, 
M. F. (2010). Confocal laser endomicroscopy: 
Technical advances and clinical applications. 
Gastroenterology, 139(2), 388-392. 392 e381-382. 

Ng, A. Y., Jordan, M., & Weiss, Y. (2001). On spectral
clustering: analysis and an algorithm. In: Advances in
Neural Information Processing Systems
(NIPS 13).

Nie, D., Zhang, H., Adeli, E., Liu, L., & Shen, D. (2016). 
3D deep learning for multi-modal imaging-guided 
survival time prediction of brain tumor patients. 
Medical Image Computing and Computer-Assisted 
Intervention, 9901, 212-220. 

Nielsen, B., Albregtsen, F., & Danielsen, H. E. (2008).
Statistical nuclear texture analysis in cancer
research: A review of methods and applications.
Critical Reviews in Oncogenesis, 14(2-3), 89-164.


Nishio, M., Sugiyama, O., Yakami, M., Ueno, S., Kubo, 
T., Kuroda, T., & Togashi, K. (2018). Computer- 
aided diagnosis of lung nodule classification 
between benign nodule, primary lung cancer, and 
metastatic lung cancer at different image size using 
deep convolutional neural network with transfer 
learning. PLoS One, 13(7), e0200721. 

Niu, S., de Sisternes, L., Chen, Q., Leng, T., & Rubin, 
D. L. (2016). Automated geographic atrophy seg- 
mentation for SD-OCT images using region-based 
C-V model via local similarity factor. Biomedical 
Optics Express, 7(2), 581-600. 

Nowak, E., Jurie, F., & Triggs, B. (2006). Sampling strat- 
egies for bag-of-features image classification. 
Computer Vision - ECCV 2006, Pt 4, Proceedings,
3954, 490-503. 

Ojala, T., Pietikainen, M., & Maenpaa, T. (2002). 
Multiresolution gray-scale and rotation invariant 
texture classification with local binary patterns. 
IEEE Transactions on Pattern Analysis and Machine 
Intelligence, 24(7), 971-987. 

Organization for Human Brain Mapping (2001). Annual 
Conference on Human Brain Mapping. Brighton, 
United Kingdom. 

Paddock, S. W. (1994). To boldly glow. Applications of 
laser scanning confocal microscopy in developmen- 
tal biology. BioEssays, 16(5), 357-365. 

Panwar, N., Huang, P., Lee, J., Keane, P. A., Chuan, 
T. S., Richhariya, A., Teoh, S., Lim, T. H., & 
Agrawal, R. (2016). Fundus photography in the 21st 
century-A review of recent technological advances 
and their implications for worldwide healthcare. 
Telemedicine Journal and E-Health, 22(3), 198-208. 

Pawlus, A., Sokolowska-Dabek, D., Szymanska, K., 
Inglot, M. S., & Zaleska-Dorobisz, U. (2015). 
Ultrasound Elastography-Review of techniques 
and its clinical applications in pediatrics-Part 1. 
Advances in Clinical and Experimental Medicine, 
24(3), 537-543. 

Pelizzari, C. A. (1998). Image processing in stereotactic 
planning: Volume visualization and image 
registration. Medical Dosimetry, 23(3), 137-145. 

Perkins, G., Renken, C., Martone, M. E., Young, S. J., 
Ellisman, M., & Frey, T. (1997). Electron tomogra- 
phy of neuronal mitochondria: Three-dimensional 
structure and organization of cristae and membrane
contacts. Journal of Structural Biology, 119(3), 
260-272. 




Permuth, J. B., Choi, J., Balagurunathan, Y., Kim, J.,
Chen, D. T., Chen, L., Orcutt, S., Doepker, M. P., 
Gage, K., Zhang, G., Latifi, K., Hoffe, S., Jiang, K., 
Coppola, D., Centeno, B. A., Magliocco, A., Li, Q., 
Trevino, J., Merchant, N., Gillies, R., Malafa, M., & 
Florida Pancreas Collaborative. (2016). Combining radiomic
features with a miRNA classifier may improve pre- 
diction of malignant pathology for pancreatic intra- 
ductal papillary mucinous neoplasms. Oncotarget, 
7(52), 85785-85797. 

Pham, D. L., Xu, C. Y., & Prince, J. L. (2000). Current 
methods in medical image segmentation. Annual 
Review of Biomedical Engineering, 2, 315. 

Pieper, S., Halle, M., & Kikinis, R. (2004). 3D SLICER. 
IEEE International Symposium on Biomedical 
Imaging ISBI, 2004, 632-635. 

Plevritis, S., Gevaert, O., Xu, J., Hoang, C., Leung, A., 
Xu, Y., Quon, A., Rubin, D. L., & Napel, S. (2011). 
Rapid Identification of Prognostic Imaging 
Biomarkers for Non-small Cell Lung Carcinoma 
(NSCLC) by integrating image features and gene 
expression and leveraging public gene expression 
databases. Ninety-seventh annual scientific meeting
of the RSNA, Chicago, IL. 

Pouratian, N., Sheth, S. A., Martin, N. A., & Toga, 
A. W. (2003). Shedding light on brain mapping: 
Advances in human optical imaging. Trends in 
Neurosciences, 26(5), 277-282. 

Prastawa, M., Gilmore, J., Lin, W. L., & Gerig, G. 
(2004). Automatic segmentation of neonatal brain 
MRI. Medical Image Computing and Computer-Assisted
Intervention - MICCAI 2004, Pt 1,
Proceedings, 3216, 10-17. 

Prothero, J. S., & Prothero, J. W. (1986). Three- 
dimensional reconstruction from serial sections 
IV. The reassembly problem. Computers and 
Biomedical Research, 19(4), 361-373.

Pujara, A. C., Kim, E., Axelrod, D., & Melsaether, 
A. N. (2018). PET/MRI in breast cancer. Journal of
Magnetic Resonance Imaging. 

Pysz, M. A., Gambhir, S. S., & Willmann, J. K. (2010). 
Molecular imaging: Current status and emerging 
strategies. Clinical Radiology, 65(7), 500-516. 

Qiu, G. (2002). Indexing chromatic and achromatic pat- 
terns for content-based colour image retrieval. 
Pattern Recognition, 35(8), 1675-1686. 

Rahmani, R., Goldman, S. A., Zhang, H., Cholleti, S. R., 
& Fritts, J. E. (2008). Localized content-based image 
retrieval. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 30(11), 1902-1912. 

Rajpurkar, P., Irvin, J., Zhu, K., Yang, B., Mehta, H.,
Duan, T., Ding, D., Bagul, A., Langlotz, C.,
Shpanskaya, K., Lungren, M. P., & Ng, A. Y. (2017).
CheXNet: Radiologist-level pneumonia detection
on chest X-rays with deep learning. CoRR,
abs/1711.05225.

Ray, P. (2011). Multimodality molecular imaging of dis- 
ease progression in living subjects. Journal of 
Biosciences, 36(3), 499-504. 

Ray, P., & Gambhir, S. S. (2007). Noninvasive imaging
of molecular events with bioluminescent reporter
genes in living subjects. Methods in Molecular
Biology, 411, 131-144.

Rector, A. L., Nowlan, W. A., & Glowinski, A. (1993). 
Goals for concept representation in the GALEN 
project. Proceedings of the 17th Annual Symposium 
on Computer Applications in Medical Care (SCAMC 
93). C. Safran. New York: McGraw Hill,
pp. 414-418.

Ribaric, S., Todorovski, L., Dimec, J., & Lunder, T. 
(2001). Presentation of dermatological images on 
the internet. Computer Methods and Programs in 
Biomedicine, 65(2), 111-121. 

Ritchie, C. J., Edwards, W. S., Cyr, D. R., & Kim, Y. 
(1996). Three-dimensional ultrasonic angiography 
using power-mode Doppler. Ultrasound in Medicine 
and Biology, 22(3), 277-286. 

Robinson, P. J. (1997). Radiology’s Achilles’ heel: Error 
and variation in the interpretation of the Rontgen 
image. The British Journal of Radiology, 70(839), 
1085-1098. 

Rohlfing, T., & Maurer, C. R., Jr. (2003). Nonrigid 
image registration in shared-memory multiproces- 
sor environments with application to brains, breasts, 
and bees. IEEE Transactions on Information
Technology in Biomedicine, 7(1), 16-25. 

Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net:
Convolutional networks for biomedical image seg- 
mentation. Springer International Publishing. 

Rosen, G. D., Williams, A. G., Capra, J. A., Connolly, 
M. T., Cruz, B., Lu, L., Airey, D. C., Kulkarni, K., 
& Williams, R. W. (2000). The Mouse Brain Library 
@ www.mbl.org. Int. Mouse Genome Conference 
14, p. 166. 

Ross, B., & Bluml, S. (2001). Magnetic resonance spec- 
troscopy of the human brain. Anatomical Record 
(New Anat.), 265(2), 54-84.

Rosse, C. (2000). Terminologia Anatomica; considered 
from the perspective of next-generation knowledge 
sources. Clinical Anatomy, 14, 120-133. 

Rosse, C., & Mejino, J. L. V. (2003). A reference ontology
for biomedical informatics: The foundational model of
anatomy. Journal of Biomedical Informatics, 36(6), 478-500.

Rosse, C., Mejino, J. L., Modayur, B. R., Jakobovits, 
R. M., Hinshaw, K. P., & Brinkley, J. F. (1998a).
Motivation and organizational principles for ana- 
tomical knowledge representation: The digital anat- 
omist symbolic knowledge base. Journal of the 
American Medical Informatics Association, 5(1), 
17-40. 

Rosse, C., Shapiro, L. G., & Brinkley, J. F. (1998b). The
Digital Anatomist foundational model: Principles
for defining and structuring its concept domain.
Proceedings, American Medical Informatics
Association Fall Symposium. Orlando, Florida,
pp. 820-824.


Rubin, D. L. (2008). Creating and curating a terminol- 
ogy for radiology: Ontology modeling and analysis. 
Journal of Digital Imaging, 21(4), 355-362. 

Rubin, D. L. (2011). Measuring and improving quality 
in radiology: Meeting the challenge with informat- 
ics. Radiographics, 31(6), 1511-1527. 




Rubin, D. L., & Napel, S. (2010). Imaging informatics: 
Toward capturing and processing semantic informa- 
tion in radiology images. Yearbook of Medical 
Informatics, 34-42. 

Rubin, D. L., & Snyder, A. (2011). ePAD: A cross-platform
semantic image annotation tool. Ninety-seventh
annual scientific meeting of the RSNA,
Chicago, IL.

Rubin, D. L., Bashir, Y., Grossman, D., Dev, P., & 
Musen, M. A. (2004). Linking ontologies with 
three-dimensional models of anatomy to predict the 
effects of penetrating injuries. Conference 
Proceedings: Annual International Conference of the 
IEEE Engineering in Medicine and Biology Society, 
5, 3128-3131. 

Rubin, D. L., Bashir, Y., Grossman, D., Dev, P., & 
Musen, M. A. (2005). Using an ontology of human 
anatomy to inform reasoning with geometric mod- 
els. Studies in Health Technology and Informatics, 
111, 429-435. 

Rubin, D. L., Dameron, O., Bashir, Y., Grossman, D., 
Dev, P., & Musen, M. A. (2006a). Using ontologies 
linked with geometric models to reason about pen- 
etrating injuries. Artificial Intelligence in Medicine, 
37(3), 167-176. 

Rubin, D. L., Grossman, D., Neal, M., Cook, D. L., 
Bassingthwaighte, J. B., & Musen, M. A. (2006b). 
Ontology-based representation of simulation models
of physiology. AMIA Annu Symp Proc,
pp. 664-668.

Rubin, D. L., Mongkolwat, P., Kleper, V., Supekar, K., 
& Channin, D. S. (2008a). Medical imaging on the 
semantic web: Annotation and image markup. 2008 
AAAI Spring Symposium Series, Semantic Scientific 
Knowledge Integration. Stanford University. 

Rubin, D. L., Rodriguez, C., Shah, P., & Beaulieu, C.
(2008b). iPad: Semantic annotation and markup of 
radiological images. American Medical Informatics 
Association Annual Symposium Proceedings, 626-630.

Rubin, D. L., Mongkolwat, P., & Channin, D. S. (2009a). 
A semantic image annotation model to enable inte- 
grative translational research. Summit on Translat 
Bioinforma, 2009, 106-110. 

Rubin, D. L., Talos, I. F., Halle, M., Musen, M. A., & 
Kikinis, R. (2009b). Computational neuroanatomy: 
Ontology-based representation of neural compo- 
nents and connectivity. BMC Bioinformatics, 
10(Suppl 2), S3. 

Rubin, D. L., Korenblum, D., Yeluri, V., Frederick, P., & 
Herfkens, R. J. (2010). Semantic annotation and 
image markup in a commercial PACS workstation. 
Scientific Paper, Ninety-sixth annual scientific meet- 
ing of the RSNA. Chicago, IL. 

Rubin, D. L., Flanders, A., Kim, W., Siddiqui, K. M., & 
Kahn, C. E., Jr. (2011). Ontology-assisted analysis of 
web queries to determine the knowledge radiologists 
seek. Journal of Digital Imaging, 24(1), 160-164. 

Ruiz, M. E. (2006). Combining image features, case
descriptions and UMLS concepts to improve
retrieval of medical images. American Medical
Informatics Association Annual Symposium
Proceedings, 674-678.

Sandor, S., & Leahy, R. (1997). Surface-based labeling 
of cortical anatomy using a deformable atlas. IEEE 
Transactions on Medical Imaging, 16(1), 41-54. 

Schaltenbrand, G., & Wahren, W. (1977). Atlas for
Stereotaxy of the human Brain. Stuttgart: Thieme. 

Scheckenbach, K., Colter, L., & Wagenmann, M. (2017). 
Radiomics in head and neck cancer: Extracting 
valuable information from data beyond recognition. 
ORL: Journal for Otorhinolaryngology and Its 
Related Specialties, 79(1-2), 65-71. 

Schimel, A. M., Fisher, Y. L., & Flynn, H. W., Jr. (2011). 
Optical coherence tomography in the diagnosis and 
management of diabetic macular edema: Time- 
domain versus spectral-domain. Ophthalmic 
Surgery, Lasers & Imaging, 42(4), S41-S55.

Schultz, E. B., Price, C., & Brown, P. J. B. (1997). 
Symbolic anatomic knowledge representation in the 
read codes version 3: Structure and application. 
Journal of the American Medical Informatics 
Association, 4, 38-48. 

Schwartz, L. H., Panicek, D. M., Berk, A. R., Li, Y., & 
Hricak, H. (2011). Improving communication of 
diagnostic radiology findings through structured 
reporting. Radiology, 260(1), 174-181. 

Seidenari, S., Pellacani, G., & Grana, C. (2003).
Computer description of colours in dermoscopic 
melanocytic lesion images reproducing clinical 
assessment. The British Journal of Dermatology, 
149(3), 523-529. 

Sensor Systems Inc. (2001). "MedEx." From
http://medx.sensor.com/products/medx/index.html

Shao, L., Zhang, H., & de Haan, G. (2008). An overview 
and performance evaluation of classification-based 
least squares trained filters. IEEE Transactions on
Image Processing: A publication of the IEEE Signal
Processing Society, 17(10), 1772-1782. 

Shapiro, L. G., & Stockman, G. C. (2001). Computer 
vision. Prentice Hall: Upper Saddle River. 

Shattuck, D. W., & Leahy, R. M. (2001). Automated 
graph-based analysis and correction of cortical vol- 
ume topology. IEEE Transactions on Medical 
Imaging, 20(11), 1167-1177. 

Shen, D., Wu, G., & Suk, H. I. (2017). Deep learning in 
medical image analysis. Annual Review of Biomedical 
Engineering, 19, 221-248. 

Shi, J. B., & Malik, J. (2000). Normalized cuts and image 
segmentation. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 22(8), 888-905. 

Shi, B., Grimm, L. J., Mazurowski, M. A., Baker, J. A., 
Marks, J. R., King, L. M., Maley, C. C., Hwang, 
E. S., & Lo, J. Y. (2018). Prediction of occult inva- 
sive disease in ductal carcinoma in situ using deep 
learning features. Journal of the American College of 
Radiology, 15(3 Pt B), 527-534. 

Shin, H. C., Roth, H. R., Gao, M., Lu, L., Xu, Z.,
Nogues, I., Yao, J., Mollura, D., & Summers, R. M.
(2016). Deep convolutional neural networks for
computer-aided detection: CNN architectures,
dataset characteristics and transfer learning. IEEE
Transactions on Medical Imaging, 35(5), 1285-1298.

Simpson, S., Kay, F. U., Abbara, S., Bhalla, S., Chung, 
J. H., Chung, M., Henry, T. S., Kanne, J. P., 
Kligerman, S., Ko, J. P., & Litt, H. (2020).
Radiological Society of North America Expert 
Consensus Statement on Reporting Chest CT 
Findings Related to COVID-19. Endorsed by the 
Society of Thoracic Radiology, the American 
College of Radiology, and RSNA. Journal of 
Thoracic Imaging. 

Singh, A., Massoud, T. F., Deroose, C., & Gambhir, S. S. 
(2008). Molecular imaging of reporter gene expres- 
sion in prostate cancer: An overview. Seminars in 
Nuclear Medicine, 38(1), 9-19. 

Sivic, J., & Zisserman, A. (2003). Video Google: A text 
retrieval approach to object matching in videos. 
Proceedings of the International Conference on 
Computer Vision, 2, 1470-1477.

Smeulders, A. W. M., Worring, M., Santini, S., Gupta, 
A., & Jain, R. (2000). Content-based image retrieval 
at the end of the early years. IEEE Transactions on 
Pattern Analysis and Machine Intelligence, 22(12), 
1349-1380. 

Smith, M. K., Welty, C., & McGuinness, D. (2004). 
OWL Web Ontology Language Guide. http://www. 
w3.org/TR/owl-guide/ 

Smith, M. Q., Staley, C. A., Kooby, D. A., Styblo, T., 
Wood, W. C., & Yang, L. (2009). Multiplexed fluo- 
rescence imaging of tumor biomarkers in gene 
expression and protein levels for personalized and 
predictive medicine. Current Molecular Medicine, 
9(8), 1017-1023. 

Sohrab, M. A., Smith, R. T., & Fawzi, A. A. (2011). 
Imaging characteristics of dry age-related macular 
degeneration. Seminars in Ophthalmology, 26(3), 
156-166. 

Solomon, M., Liu, Y., Berezin, M. Y., & Achilefu, S. 
(2011). Optical imaging in cancer research: Basic 
principles, tumor detection, and therapeutic monitoring.
Medical Principles and Practice, 20(5), 397-415.

Soto, G. E., Young, S. J., Martone, M. E., Deerinck,
T. J., Lamont, S. L., Carragher, B. O., Hama, K., &
Ellisman, M. H. (1994). Serial section electron
tomography: A method for three-dimensional reconstruction
of large structures. NeuroImage, 1, 230-243.

Spackman, K. A., Campbell, K. E., & Cote, R. A. 
(1997). SNOMED-RT: A reference terminology for 
health care. Proceedings, AMIA Annual Fall 
Symposium. D. R. Masys. Philadelphia, Hanley and 
Belfus, pp. 640-644.

Sperrin, M., Grant, S. W., & Peek, N. (2020). Prediction 
models for diagnosis and prognosis in Covid-19. 
BMJ, 369, m1464. 

Spitzer, V. M., & Whitlock, D. G. (1998). The visible 
human dataset: The anatomical platform for human 
simulation. The Anatomical Record, 253(2), 
49-57. 

Stensaas, S. S., & Millhouse, O. E. (2001). Atlases of the
Brain. From http://medstat.med.utah.edu/kw/brain_atlas/credits.htm


Subramaniam, B., Hennessey, J. G., Rubin, M. A., 
Beach, L. S., & Reiss, A. L. (1997). Software and 
methods for quantitative imaging in neuroscience: 
the Kennedy Krieger Institute Human Brain 
Project. In S. H. Koslow & M. F. Huerta (Eds.), 
Neuroinformatics: an overview of the Human Brain 
Project (pp. 335-360). Mahwah: Lawrence Erlbaum. 

Sundsten, J. W., Conley, D. M., Ratiu, P., Mulligan,
K. A., & Rosse, C. (2000). Digital Anatomist web-based
interactive atlases. From
http://www9.biostr.washington.edu/da.html

Swanson, L. W. (1999). Brain maps: Structure of the rat 
brain. New York: Elsevier Science. 

Talairach, J., & Tournoux, P. (1988). Co-planar stereo- 
taxic atlas of the human brain. New York: Thieme 
Medical Publishers. 

Talos, I. F., Rubin, D. L., Halle, M., Musen, M., &
Kikinis, R. (2008). A prototype symbolic model of 
canonical functional neuroanatomy of the motor 
system. Journal of Biomedical Informatics, 41(2), 
251-263. 

Toga, A. W. (2001). UCLA Laboratory for Neuro 
Imaging (LONI). From http://www.loni.ucla.edu/ 

Toga, A. W., Ambach, K. L., & Schluender, S. (1994). 
High-resolution anatomy from in situ human brain. 
NeuroImage, 1(4), 334-344.

Toga, A. W., Santori, E. M., Hazani, R., & Ambach, K. 
(1995). A 3-D digital map of rat brain. Brain 
Research Bulletin, 38(1), 77-85.

Toga, A. W., Frackowiak, R. S. J., & Mazziotta, J. C. 
(Eds.). (2001). Neuroimage: A journal of brain func- 
tion. Academic Press: New York. 

Tommasi, T., Caputo, B., Welter, P., Güld, M. O., & 
Deserno, T. M. (2010). Overview of the CLEF 2009 
medical image annotation track. Proceedings of the 
10th international conference on Cross-language eval- 
uation forum: multimedia experiments. Corfu, 
Greece, Springer-Verlag, pp. 85-93. 

Toomre, D., & Bewersdorf, J. (2010). A new wave of cel- 
lular imaging. Annual Review of Cell and 
Developmental Biology, 26, 285-314. 

Trebeschi, S., Griethuysen, J. J. M. V., Lambregts, D. M. 
J., Lahaye, M. J., Parmar, C., Bakers, F. C. H., 
Peters, N. H. G. M., Beets-Tan, R. G. H., & Aerts, 
H. J. W. L. (2017). Deep learning for fully-automated 
localization and segmentation of rectal Cancer on 
multiparametric MR. Scientific Reports, 7(1), 5301. 

Tsarkov, D., & Horrocks, I. (2006). FaCT++ description 
logic reasoner: System description. Automated 
Reasoning, Proceedings, 4130, 292-297. 

Valdora, F., Houssami, N., Rossi, F., Calabrese, M., & 
Tagliafico, A. S. (2018). Rapid review: Radiomics 
and breast cancer. Breast Cancer Research and 
Treatment, 169(2), 217-229. 

Van Essen, D. C., & Drury, H. A. (1997). Structural and 
functional analysis of human cerebral cortex using 
a surface-based atlas. The Journal of Neuroscience,
17(18), 7079-7102. 

Van Essen, D. C., Drury, H. A., Dickson, J., Harwell, J.,
Hanlon, D., & Anderson, C. H. (2001). An integrated
software suite for surface-based analysis of
cerebral cortex. Journal of the American Medical
Informatics Association, 8(5), 443-459.

Van Leemput, K., Maes, F., Vandermeulen, D., & 
Suetens, P. (1999). Automated model-based tissue 
classification of MR images of the brain. IEEE
Transactions on Medical Imaging, 18(10), 897-908.

Van Noorden, S. (2002). Advances in immunocytochem- 
istry. Folia Histochemica et Cytobiologica, 40(2), 
121-124. 

van Timmeren, J. E., Leijenaar, R. T. H., van Elmpt, W., 
Reymen, B., Oberije, C., Monshouwer, R., Bussink, 
J., Brink, C., Hansen, O., & Lambin, P. (2017). 
Survival prediction of non-small cell lung cancer 
patients using radiomics analyses of cone-beam CT 
images. Radiotherapy and Oncology, 123(3), 
363-369. 

Vapnik, V. N. (2000). The nature of statistical learning 
theory. New York: Springer. 

Varma, M., & Zisserman, A. (2003). Texture classification:
Are filter banks necessary? Proceedings of the
International Conference on Computer Vision and
Pattern Recognition, pp. 691-698.

Venkatesh, S. K., & Ehman, R. L. (2015). Magnetic 
resonance elastography of abdomen. Abdominal 
Imaging, 40(4), 745-759. 

Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., & 
Manzagol, P.-A. (2010). Stacked Denoising autoen- 
coders: Learning useful representations in a deep 
network with a local Denoising criterion. Journal of 
Machine Learning Research, 11(Dec), 3371-3408. 

Vreeman, D. J., & McDonald, C. J. (2005). Automated 
mapping of local radiology terms to LOINC. AMIA 
Annu Symp Proc, pp. 769-773. 

Vreeman, D. J., Abhyankar, S., Wang, K. C., Carr, C., 
Collins, B., Rubin, D. L., & Langlotz, C. P. (2018). 
The LOINC RSNA radiology playbook — A unified 
terminology for radiology procedures. Journal of the 
American Medical Informatics Association, 25(7), 
885-893. 

Wang, L., & Wong, A. (2020). COVID-Net: A tailored
deep convolutional neural network design for detec- 
tion of COVID-19 cases from chest X-ray images. 

Wang, J. Z., Wiederhold, G., Firschein, O., & Wei, S. X. 
(1997). Content-based image indexing and search- 
ing using Daubechies’ wavelets. International 
Journal on Digital Libraries, 1(4), 311-328. 

Wang, D., Khosla, A., Gargeya, R., Irshad, H., & Beck,
A. H. (2016). Deep learning for identifying metastatic
breast cancer. arXiv:1606.05718 [cs, q-bio].

Wang, K. C., Patel, J. B., Vyas, B., Toland, M., Collins, 
B., Vreeman, D. J., Abhyankar, S., Siegel, E. L., 
Rubin, D. L., & Langlotz, C. P. (2017). Use of radi- 
ology procedure codes in health care: The need for 
standardization and structure. Radiographics, 37(4), 
1099-1110. 

Weaver, O., & Leung, J. W. T. (2018). Biomarkers and 
imaging of breast cancer. AJR. American Journal of 
Roentgenology, 210(2), 271-278. 

Weiss, D. L., & Langlotz, C. P. (2008). Structured report- 
ing: Patient care enhancement or productivity 
nightmare? Radiology, 249(3), 739-747. 




Weissleder, R., & Mahmood, U. (2001). Molecular 
imaging. Radiology, 219, 316-333. 

Weissleder, R., Schwaiger, M. C., Gambhir, S. S., & 
Hricak, H. (2016). Imaging approaches to optimize 
molecular therapies. Science Translational Medicine, 
8(355), 355ps16.

Wessels, J. T., Yamauchi, K., Hoffman, R. M., & 
Wouters, F. S. (2010). Advances in cellular, subcel- 
lular, and nanoscale imaging in vitro and in vivo. 
Cytometry. Part A, 77(7), 667-676. 

Willmann, J. K., van Bruggen, N., Dinkelborg, L. M., & 
Gambhir, S. S. (2008). Molecular imaging in drug 
development. Nature Reviews. Drug Discovery, 7(7), 
591-607. 

Wilson, T. (1990). Confocal Microscopy. San Diego: 
Academic Press Ltd. 

Wong, B. A., Rosse, C., & Brinkley, J. F. (1999). Semi- 
automatic scene generation using the Digital 
Anatomist Foundational Model. Proceedings, 
American Medical Informatics Association Fall 
Symposium. Washington, D.C., pp. 637-641. 

Woods, R. P., Cherry, S. R., & Mazziotta, J. C. (1992). 
Rapid automated algorithm for aligning and reslic- 
ing PET images. Journal of Computer Assisted 
Tomography, 16, 620-633. 

Woods, R. P., Mazziotta, J. C., & Cherry, S. R. (1993). 
MRI-PET registration with automated algorithm. 
Journal of Computer Assisted Tomography, 17, 
536-546. 

World Health Organization. (2020). WHO
Director-General's opening remarks at the media briefing on
COVID-19 - 11 March 2020. Retrieved 5/15/2020, from
https://www.who.int/dg/speeches/detail/who-director-general-s-opening-remarks-at-the-media-briefing-on-covid-19---11-march-2020

World Wide Web Consortium. (W3C Recommendation
10 Feb 2004). OWL Web Ontology Language
Reference.

Wu, S., Zheng, J., Li, Y., Yu, H., Shi, S., Xie, W., Liu, H., 
Su, Y., Huang, J., & Lin, T. (2017). A Radiomics 
nomogram for the preoperative prediction of lymph 
node metastasis in bladder Cancer. Clinical Cancer 
Research, 23(22), 6904-6911. 

Wynants, L., Van Calster, B., Bonten, M. M. J., Collins, 
G. S., Debray, T. P. A., De Vos, M., Haller, M. C., 
Heinze, G., Moons, K. G. M., Riley, R. D., Schuit, 
E., Smits, L. J. M., Snell, K. I. E., Steyerberg, E. W., 
Wallisch, C., & van Smeden, M. (2020). Prediction 
models for diagnosis and prognosis of covid-19 
infection: Systematic review and critical appraisal. 
BMJ, 369, m1328. 

Yamashita, R., Nishio, M., Do, R. K. G., & Togashi, K. 
(2018). Convolutional neural networks: an overview 
and application in radiology. Insights Imaging, 9(4), 
611-629. 

Yang, L., Jin, R., Mummert, L., Sukthankar, R., Goode, 
A., Zheng, B., Hoi, S. C. H., & Satyanarayanan, M. 
(2010). A boosting framework for visuality-preserving 
distance metric learning and its application to medical
image retrieval. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 32(1), 30-44. 




Yang, Z., Shi, J., He, Z., Lu, Y., Xu, Q., Ye, C., Chen, S., 
Tang, B., Yin, K., Lu, Y., & Chen, X. (2020). 
Predictors for imaging progression on chest CT 
from coronavirus disease 2019 (COVID-19) patients. 
Aging (Albany NY), 12(7), 6037-6048. 

Yasaka, K., Akai, H., Kunimatsu, A., Kiryu, S., & Abe, 
O. (2018). Deep learning with convolutional neural 
network in radiology. Japanese Journal of Radiology, 
36(4), 257-272. 

Yoo, T. S. (2004). Insight into images: Principles and
practice for segmentation, registration, and image 
analysis. Wellesley: A K Peters. 

Yu, F., & Ip, H. H. (2008). Semantic content analysis 
and annotation of histological images. Computers in 
Biology and Medicine, 38(6), 635-649. 

Yuan, M., Yin, W., Tao, Z., Tan, W., & Hu, Y. (2020). 
Association of radiologic findings with mortality of 
patients infected with 2019 novel coronavirus in 
Wuhan, China. PLoS One, 15(3), e0230548. 

Zaleska-Dorobisz, U., Pawlus, A., Szymanska, K., 
Lasecki, M., & Ziajkiewicz, M. (2015). Ultrasound 
Elastography-Review of techniques and its clinical 
applications in pediatrics-Part 2. Advances in 
Clinical and Experimental Medicine, 24(4), 
725-730. 

Zalis, M. E., Barish, M. A., Choi, J. R., Dachman, 
A. H., Fenlon, H. M., Ferrucci, J. T., Glick, S. N., 
Laghi, A., Macari, M., McFarland, E. G., Morrin, 
M. M., Pickhardt, P. J., Soto, J., & Yee, J. (2005). CT 
colonography reporting and data system: A consen- 
sus proposal. Radiology, 236(1), 3-9. 

Zhang, Y. Y., Brady, M., & Smith, S. (2001). 
Segmentation of brain MR images through a hid- 
den Markov random field model and the 
expectation-maximization algorithm. IEEE 
Transactions on Medical Imaging, 20(1), 45-57. 


Zhang, H., Ingham, E. S., Gagnon, M. K., Mahakian, 
L. M., Liu, J., Foiret, J. L., Willmann, J. K., & 
Ferrara, K. W. (2017). In vitro characterization and 
in vivo ultrasound molecular imaging of nucleolin- 
targeted microbubbles. Biomaterials, 118, 
63-73. 

Zhenyu, H., Yanjie, Z., Tonghui, L., & Jianguo, Z. 
(2009). Combining text retrieval and content-based 
image retrieval for searching a large-scale medical 
image database in an integrated RIS/PACS environ- 
ment, SPIE. 

Zhou, S. K., Greenspan, H., & Shen, D. (2017a). Deep 
learning for medical image analysis. London/San 
Diego: Elsevier/Academic Press. 

Zhou, Y., He, L., Huang, Y., Chen, S., Wu, P., Ye, W., Liu, 
Z., & Liang, C. (2017b). CT-based radiomics signa- 
ture: A potential biomarker for preoperative predic- 
tion of early recurrence in hepatocellular carcinoma. 
Abdominal Radiology (NY), 42(6), 1695-1704. 

Zhou, M., Scott, J., Chaudhury, B., Hall, L., Goldgof, 
D., Yeom, K. W., Iv, M., Ou, Y., Kalpathy-Cramer, 
J., Napel, S., Gillies, R., Gevaert, O., & Gatenby, R. 
(2018). Radiomics in brain tumor: Image assess- 
ment, quantitative feature descriptors, and machine- 
learning approaches. AJNR. American Journal of 
Neuroradiology, 39(2), 208-216. 

Zijdenbos, A. P., Evans, A. C., Riahi, F., Sled, J., Chui, 
J., & Kollokian, V. (1996). Automatic quantification 
of multiple sclerosis lesion volume using stereotac- 
tic space. Proc. 4th Int. Conf. on Visualization in 
Biomedical Computing. Hamburg. pp. 439-448.

Zimmerman, S. L., Kim, W., & Boonn, W. W. (2011). 
Informatics in radiology: Automated struc- 
tured reporting of imaging findings using the 
AIM standard and XML. Radiographics, 31(3), 
881-887. 


Personal Health Informatics 
Robert M. Cronin, Holly Jimison, and Kevin B. Johnson 


Contents 


11.1 Introduction - 365 


11.2 Patient-Centered Care and Personal Health
Informatics - 365 

11.2.1 Using Biomedical Informatics to Impact 
Patient-Centered Medicine - 366 

11.2.2 Limitations of Patient-Centered Care - 368 


11.3 Historical Perspective of Personal Health 
Informatics - 368 

11.3.1 Paternalism and Professionalism of Medicine 
and Informatics - 368

11.3.2 The Rise of Patient-Centered Medicine and Personal 
Health Informatics - 369 


11.4 Important Concepts in Personal 
Health Informatics - 373 

11.4.1 Health Literacy and Numeracy - 374 

11.4.2 Digital Divide - 374 

11.4.3 Chronic Conditions - 374 

11.4.4 Conditions Associated with Aging - 375 

11.4.5 Behavior Management - 375 


11.5 The Impact of Personal Health Informatics 
on Biomedical Informatics - 375 

11.5.1 Data Science - 376 

11.5.2 Precision Medicine - 376 

11.5.3 Ethical, Legal and Social Issues - 377 

11.5.4 Communication - 378 

11.5.5 Mobile Health Care (mHealth) - 378 

11.5.6 Social Network Systems - 379 


© Springer Nature Switzerland AG 2021 
E. H. Shortliffe, J. J. Cimino (eds.), Biomedical Informatics, https://doi.org/10.1007/978-3-030-58721-5_11 


11.5.7 Application Example: EHR Portals - 380

11.5.8 Application Example: Personal Health Records - 380

11.5.9 Application Example: Sensors for Home Monitoring
and Tailored Health Interventions - 381


11.6 Future Opportunities and Challenges - 383

11.6.1 Reimbursement and Business Models - 383

11.6.2 Opportunities for Innovation - 383


References - 385




Learning Objectives

After reading this chapter, you should know
the answers to these questions:

1. What is the role of the patient or con- 
sumer in healthcare decisions? 

2. How does patient empowerment come 
into play in the various care delivery 
settings and phases of health care? 

3. What are some examples of sensors that 
can be used to assist in personal health 
management? 

4. How can you ensure patient privacy
and the security of patient-generated
data in the home and environment?

5. What are the various features of
personal health technologies (e.g.,
personal health records and mobile
applications)?

6. How do individuals obtain various 
types of health information? 


11.1 Introduction 

Complexity and collaboration characterize 
health care in the early twenty-first century. 
Complexity arises from our deeper and more 
sophisticated understanding of health and 
disease, including the addition of molecular/ 
genomic processes and social/behavioral 
determinants. Complexity also arises from the 
myriad of new treatments available for many 
diseases, and emerging data about the role of 
nutrition, exercise, sleep, and stress in preserv- 
ing health. Collaboration begins with the real- 
ization that successful attainment of optimal 
wellbeing and effective management of dis- 
ease processes necessitate active engagement 
of clinicians, laypersons, support systems, and 
society as a whole. Collaboration extends 
beyond the societally-focused opportunities 
into the healthcare system itself, where care is 
more fragmented, leading to greater needs for 
collaboration and communication. 

Now more than ever before, the healthcare 
system recognizes the role of the person who 
interacts with this system, who is increasingly 
interested in engaging or called upon to 
engage through various states of health and 
disease. This recognition has given rise to a
movement that underpins the birth of per- 
sonal health informatics, as described below. 


11.2 Patient-Centered Care 
and Personal Health 
Informatics 


Patient-centered care has become a core
component of medical care. From early work in
the 1960s and 1970s (Balint 1969; Waitzkin and
Stoeckle 1972), through the chronic disease
model (Bodenheimer et al. 2002c; Coleman et al.
2009) and the National Academy of Medicine
landmark report Crossing the Quality Chasm
(Institute of Medicine (US) Committee on
Quality of Health Care in America 2001), which
made “patient-centered” one of the six aims of
health care, to the development and
incorporation of the medical home (Kellerman
and Kirk 2007) and patient engagement (Dentzer
2013),¹ patient-centered care has taken center
stage in medicine.

1 Center for Advancing Health. A New Definition of Patient
Engagement. What is Engagement and Why is it Important?
2010. Available from:
http://www.cfah.org/file/CFAH_Engagement_Behavior_Framework_current.pdf

As a result of this visibility, healthcare
institutions, health planners, congressional
representatives, and hospital public relations
departments are among the many promoters of
patient-centered care, a concept rooted in
“deep respect for patients as unique living
beings, and the obligation to care for them on
their terms” (Epstein and Street Jr 2011). To
be patient-centered, one must accept people
seeking care as persons with a unique social
world, who should be listened to, informed,
respected, and involved in their own care, and
whose wishes are heard, if not acted upon,
during their healthcare journey.
Patient-centered care complements
evidence-based medicine by incorporating
patient preferences into decision making about
treatment options. Berwick described three
maxims of patient-centered care: (1) “The needs
of the patient come first.” (2) “Nothing about
me without me.” (3) “Every patient is the only
patient” (Berwick 2009).

Patient-centered care represents a shift in
the physician’s role from paternalistic and
authoritative to collaborative, leveraging the
perspectives of people and their support
system, whether that support system consists
of family, caregivers, or even technology, as
partners in making decisions and delivering
care. Patient-centered care requires the entire
healthcare team to be more mindful,
informative, and empathic, and requires
patients to actively participate in their care.
Patient-centered care encourages inclusiveness
and engagement for shared decision making among
the stakeholders to develop a comprehensive
care plan aligned with the whole person.

The maxims of patient-centered care form
the basis of the field of personal health
informatics, as originally proposed by Warner
Slack and Tom Ferguson in 1993 (Demiris 2016).
In particular, these maxims translate into the
following desiderata:

• People are able to access care that is
  coordinated and collaborative
• Care is focused on the whole person, not
  just on physical comfort
• Care considers people’s values, culture,
  and socioeconomic status
• People and their support system are active
  partners in care, not passive listeners
• People’s goals within the healthcare system
  align with the system’s mission, values, and
  quality metrics
• People and their caregivers participate in
  shared decision making with their providers
  and play a role in the decisions at the
  personal, population, and system level
• Sharing health information with people
  and caregivers enables informed decision
  making
• The presence of support systems in the care
  setting is encouraged and facilitated


As described in Sect. 13.5, these desiderata
provide a new lens through which the whole 
of biomedical informatics should be viewed. 


11.2.1 Using Biomedical Informatics to Impact
Patient-Centered Medicine


To conclude this section, we provide a few 
illustrative examples of how patient-centered 
care and personal health informatics can help 
shape the care delivered. 

1. Patient-centered ambulatory care. 

Routine ambulatory care, by its very 
nature, focuses on the whole person, and 
not just their diagnosis. Caring for the 
whole person requires the ability to utilize 
resources such as social workers, financial 
counselors, mental health providers, trans- 
portation, peer support programs, daily 
living assistance, and language and liter- 
acy education and resources. Making the 
provider aware of the needs of the person 
being cared for can be enabled through 
electronic health records (EHRs) and clin- 
ical decision support that work within the 
provider workflow, potentially utilizing 
data provided by the patient or the patient’s 
social and family network. In addition,
using tools like a patient portal, the
healthcare system can send patients alerts
and reminders for care such as influenza
vaccination, nutrition counseling, and
medication refills.

Access to care can be facilitated
through telemedicine and telehealth (see
Chap. 20), as well as through apps that
could enable daily living assistance and
peer support programs (see Chap. 19).
All of these applications need to consider 
language and literacy (health and technol- 
ogy), as well as challenges created by asyn- 
chronous healthcare-related discussions. 

Example: Jane uses her mobile app to
remind her when to refill her asthma
medications and to communicate with her
social worker and financial counselor, who
help her purchase those medications. She uses the
same app to talk with her support system 
and other people with asthma who can 
understand and support her journey 
through the stages of her disease. 
2. Patient-centered acute care and care
transitions.

Acute care settings are characterized
by sudden and rapid changes in patient
status, ad hoc appointments, and frequent
handoffs of care. To understand what is
happening to them and to facilitate
decision-making, patients often request
unrestricted and continuous access to their
social support network in this setting.
Patients and their support system should be
present during rounds, which are performed
at the bedside, and at shift changes. Family
interaction should take place in an
environment that is as comfortable as
possible, equipped with access to
information and to experts as necessary.
Information technology can help make
this scenario possible through inpatient
personal health records, where patients
and their families can view information in
real time to help them participate in rounds
and make informed decisions (Huerta et al.
2017; O’Leary et al. 2016; Prey et al. 2014).
Mobile devices can improve communication,
care plan management, and knowledge
transfer, even when family members are not
present. Other personal informatics tools,
such as wearable sensors or smart scales,
could be introduced during a hospital stay
to develop behavior changes that could
carry over into the home environment
(Steinberg et al. 2013).

Example: Through her mobile phone,
Beverly’s family was present remotely in
the hospital during rounds to ask the
health care team questions. Her family
also encouraged Beverly to learn to use the
hospital’s smart scale, which can monitor
her weight and help keep her heart failure
under control. Beverly now uses her smart
scale daily at home to send updates to her
family and healthcare providers, who can
encourage her and help her stay out of the
hospital.

3. Patient-centered care at home.

A significant amount of health care
can and should occur in the home setting.
This setting is where social, financial,
supporting, and behavioral factors can affect
a person’s health care and management.
Applications like patient portals or per- 
sonal health records can enable a person to 
review lab results and clinical notes, com- 
municate with their healthcare providers, 
schedule appointments, pay bills, and 
obtain educational information. Wearable 
technologies enable self-management and 
monitoring of critical information, such as 
weight, blood pressure, glucose levels, and 
medication adherence (Marcolino et al. 
2018). Finally, informatics applications
can deliver information that augments
people’s knowledge about their disease
status, enabling them to make informed
decisions about whether to escalate home
care or schedule a return visit (Asnani
et al. 2016).

Example: Richard knew a lot about
his heart failure, but with the right
personal health informatics tools, he
learned the signs and symptoms that appear
when his heart failure gets worse. He avoids
costly readmissions to the hospital by
taking his medications daily, thanks to
reminders from his wearable technologies,
and he now knows when to contact his
providers through his patient portal if his
heart failure worsens.


4. Personalized medicine.


At their core, the maxims of
patient-centered care require that any
management plan provided by healthcare
providers to patients be personalized (see
Chap. 28). Medications, procedures, and
supportive or curative plans should all be
tailored to the person or family receiving
them. In addition, our growing knowledge
about the impact of a person’s omics
(genomics, proteomics, metabolomics, etc.)
and environment can now be used to
personalize therapy (Collins 2004).

Example: Using sensors on James’ 
phone along with his genetic information, 
his providers can use targeted medications 
to treat his cancer and can help him meet 
his treatment goal of golfing again. 
11.2.2 Limitations of Patient-Centered Care


As with any change in the locus of control, 
the transition from authoritative to collabora- 
tive decision-making that is the hallmark of 
patient-centered care raises some concerns. 
For example, there are concerns that patient- 
centered medicine conflicts with evidence- 
based medicine (Weaver 2015). One of the 
current challenges in medicine is to bring 
these two worlds together, which could be 
accomplished using more sophisticated 
searching of the literature or mining EHR 
data to uncover evidence supporting
deviations in care (Gallego et al. 2015).
Another concern is that physicians are
stewards of social resources, yet some would
argue that physicians do not recognize the
social responsibility of patient-centered
medicine (Berwick 2009). A third concern is
the tension between patients needing and
wanting improved access to the healthcare
system and a healthcare system that is
already expensive to run, in part because of
fragmentation and attempts to improve access
(see Chap. 29) (Enthoven 2009).

Aside from clinician concerns, there are 
design constraints on the healthcare system 
that limit patient-centered medicine. First, the 
system must support a shift in the locus of 
control of care decisions towards patients and 
their caregivers. Health informatics tools
that deliver information through mHealth and
provide decision aids can help people make
informed decisions about their care. Second,
transparency of care options and their
associated costs and outcomes needs to extend
to all components of care, including research
and education.
While informatics researchers are working on 
these issues, further work is needed in this 
area, especially in providing transparent care 
coordination choices and data liquidity. 
Third, individualization and customization 
need to be design targets, creating health care 
and health informatics systems and tools that 
can adapt to the individual needs and circum- 
stances of people. Research in personal health 
informatics demonstrates the importance of 


personalization and individualization, but sig- 
nificant work remains. Finally, it is critical to 
train young healthcare professionals in the 
expectations of their profession related to 
patient-centered care. Education informatics 
can help bridge gaps between what young 
health professionals currently know and prac- 
tice and what a patient-centered care model 
could look like with an appropriate curricu- 
lum of educational modules, knowledge test- 
ing, and practice modeling. 


11.3 Historical Perspective 
of Personal Health Informatics 


In essence, what we think of as consumer/ 
patient engagement reflects a shift from the 
person as the silent recipient of ministrations 
from a wise, beneficent clinician, to an active 
collaborator whose values, preferences, and 
lifestyle not only alter predisposition to cer- 
tain illnesses but also shape the characteristics 
of desirable treatments. In this section, we
describe the ancient concept of paternalism,
the rise of patient-centered medicine, and how
personal health informatics has enabled people
who once had health care enacted upon them
to become engaged and active participants in
their health care.


11.3.1 Paternalism and Professionalism of Medicine
and Informatics


Paternalism is thought to go back as far as the 
history of medicine. Hippocrates was the 
father of medical paternalism as he wrote that 
physicians should conceal most things from 
the patient including the patient’s present and 
future condition. He believed that medical
knowledge should be kept secret from patients.
The Hippocratic oath, which is recited today by
medical students, is silent about communication
between doctor and patient relevant to the
patient’s treatment. Paternalistic medicine
continued through medieval times, when
patients were told to honor doctors, since
doctors received their authority from God, and
patients had to promise obedience.

Paternalistic medicine is about keeping 
information primarily in the hands of the 
physician and medical system and in certain 
cases giving misinformation to patients to 
keep accurate information from them. 
Healthcare decisions are made by the physi- 
cian and medical system, and patients are 
expected to abide by these decisions with no 
or minimal input. In the 16th and 17th centu- 
ries, some physicians started to acknowledge 
that patients might have a voice in their care. 
However, doctors of eminence, like Dr. 
Benjamin Rush, wrote that doctors could 
yield to patients in matters of little conse- 
quence, but maintain an inflexible authority 
over them in matters essential to life. 

Paternalism is still present today. Most
biomedical publications are inaccessible
without a costly subscription. Getting
complete medical records can be difficult,
even through conventional information
technology applications. Informed consent, in
many cases, is not explained well enough for
the person having the treatment to
understand all the risks, benefits, other
options, and details of what is being done.
The discharge process is perhaps the best
example of present-day paternalism: patients
have little say about their readiness for
discharge and often are made responsible for
complying with complicated discharge
instructions. Patients who believe they are
ready for discharge before their care team
agrees, or who are dissatisfied with their
care, are required to sign a form attesting
to leaving against medical advice. The final
source of paternalism comes from the
guidelines that healthcare providers use to
guide their decisions. These guidelines,
discharge forms, and other processes
typically do not involve patients in their
creation.

Although early information systems were
almost exclusively provider-centric, recent
advances in computer system availability have
prompted the development of less paternalistic
tools that may be used by people and families,
as described below.
11.3.2 The Rise of Patient-Centered
Medicine and Personal
Health Informatics


Enid Balint coined the term patient-centered
care in 1969 (Balint 1969). Pioneers of
patient-centered medicine include Barbara
Korsch, who explored the listening skills of
physicians in training (Korsch 1989); John
Ware, who discovered the components of
patient satisfaction (Ware Jr et al. 1983);
Debra Roter and Judith Hall, who described
the properties and dysfunction of
doctor-patient communication and how to
improve it (Roter and Hall 2006); Howard
Waitzkin and John Stoeckle, who demonstrated
how tapping into patients’ views and
knowledge of their symptoms and their causes
could lead to improved doctor-patient
relationships (Waitzkin and Stoeckle 1972);
Michael Barry, Jack Fowler, Al Mulley,
Joseph Henderson, and Jack Wennberg, who
developed shared decision making theory and
technology and showed their association with
improved outcomes (Barry et al. 1995); and
Judith Hibbard, who helped us understand
patients’ desires for knowledge (Hibbard
et al. 2007) and advanced our knowledge and
tools for patient engagement and activation
(Hibbard and Greene 2013; Hibbard et al.
2004). Other landmarks in this paradigm shift
included Engel’s proposal to “take into
account the patient” (Engel 1977), Cassell’s
transcriptions of clinical encounters, which
provided a basis for understanding the
doctor-patient relationship (Cassell 1985),
and Kleinman’s definitions of “disease” and
“illness”, the latter being the patient’s
subjective experience of feeling ill
(Kleinman 1988).

Personal Health Informatics can be traced
back to the early twentieth century, when the
U.S. Federal Children’s Bureau served as a
major source of health information for the
public. Mothers could write to this federal
agency, asking questions about normal child
development, nutrition, and disease
management. Written materials, such as letters
and pamphlets, served as the primary mechanism
for delivering information that supported lay
people in their handling of health challenges.
Patient education companies such as Krames 
would partner with organizations like the 
American Heart Association to provide gen- 
eral printed material on heart disease or with 
the American Cancer Society to provide infor- 
mation on cancer. 

Personal Health Informatics applications
followed a trend similar to that of
patient-centered medicine, with early
applications in the 1950s and 1960s. Collen
and colleagues at Kaiser Permanente created
one of the earliest patient data collection
applications: a health appraisal system that
prompted for patient data and returned a
systematic risk appraisal (Collen et al.
1964). Warner Slack and colleagues at the
University of Wisconsin used a mainframe
computer system as a health assessment tool.
Patients sat at a cathode ray tube (CRT)
terminal and responded to text questions,
receiving a printed summary of their health
appraisal at the end of the session (Slack
et al. 1966). Figure 11.1 shows an early
example of the mainframe-based tool
developed by Slack. At Massachusetts


Fig. 11.1 Early use of computing in consumer
health informatics, here taking a medical history directly
from a patient (Slack WV et al. (1966). NEJM with
permission)


General Hospital in the late 1950s,
computer-driven telephone systems were used
to conduct home-based follow-up with
post-surgery cardiac patients, calling them
daily to obtain pulse readings. This was
followed by an era of interactive video
systems that augmented delivery of
information, helped patients understand the
risks and benefits associated with treatment
options, and also helped them define their
values for possible future health outcomes.
The prime examples of this type of system
originated with the Foundation for Informed
Medical Decision Making. As early as 1973,
Wennberg and others discovered that the rates
of many expensive surgeries and other
treatments varied from location to location
throughout the U.S. (Wennberg and Gittelsohn
1973). This variation seemed to occur for
medical conditions where there were multiple
viable treatment options and where choices
depended more on physician and resource
availability than on need or patient
characteristics. Barry et al. developed
interactive video consumer decision aids for
these conditions (e.g., prostate cancer,
breast cancer, and back pain), discovering
that patient preferences and priorities for
possible health outcomes could vary
dramatically from person to person and could
be critical to defining an optimal decision
(Barry et al. 1995).

In the 1980s, clinicians and health
educators capitalized on the increasingly
common personal computers as vehicles for
health education. As shown in Fig. 11.2, the
Body Awareness Resource Network (BARN),
developed in the 1980s by Gustafson and
colleagues at the University of Wisconsin,
engaged adolescents in game-like interactions 
to help them learn about growth and develop- 
ment, develop healthy attitudes towards 
avoidance of risky behaviors, and rehearse 
strategies for negotiating the complex inter- 
personal world of adolescence (Bosworth 
et al. 1983). 

Another influential early personal health
informatics application was the
Comprehensive Health Enhancement Support
System (CHESS), developed by Gustafson
and colleagues in 1989 at the University of
Wisconsin (Gustafson et al. 2001). CHESS
provided women who had breast cancer with
information through curated articles and
directories of cancer services,
decision-making support through charts,
decision aids, and action plans, and
emotional support through online support
groups. Ferguson was also heavily influential
in the 1980s and 1990s in creating and
analyzing online social support groups for
patients (Ferguson 1996).

Fig. 11.2 BARN topic index and use by teens (Bosworth
et al. 1983). This picture shows teens interacting with a
game on an early graphical computer. The figure on the
left is the topic index as displayed on the screen. (With
permission from Gustafson, D, personal communication)

As the Internet became more available in 
homes, Internet support groups gathered 
momentum. One such example, Hopkins Teen
Central, developed by Johnson et al. (2001),
allowed otherwise isolated children with cys- 
tic fibrosis to meet virtually and to discuss 
health and developmental issues that impacted 
healthy decision making. The idea of the 
Internet support group became an active area 


of development, continuing through today 
(Eysenbach et al. 2004). During this time, 
computer games also increased in uptake, and 
while time-consuming and expensive to build, 
they were relatively easy to disseminate, and 
were associated with measurable changes in 
knowledge (Lieberman 1988) and, in some 
cases, symptom management (Patel et al. 
2006; Redd et al. 1987). 

Some major areas of growth in personal 
health informatics over the past 20 years 
include home telehealth, mobile health, per- 
sonal health records, and personal genomics. 
Home telehealth technologies have grown
rapidly over the past 20 years (see
Chap. 20). Some notable randomized
controlled trials include Informatics for
Diabetes Education and Telemedicine
(IDEATel), the Telemonitoring Study for
Chronic Obstructive Pulmonary Disease
(COPD), and the Tele-ERA study. Large
randomized controlled trials demonstrated
beneficial effects of these interventions.
Personal health records started in the late 
1990s with the Patient-Centered Access to 
Secure Systems Online (PCASSO) portal 
(Masys and Baker 1997). Tethered personal 
health records, commonly referred to as 
patient portals, have been implemented by 
hundreds of institutions, with increasing 
adoption being driven by governmental policy 
such as the Affordable Care Act and 
Meaningful Use in the US, and the Power of 
Information strategy in the UK. A growing
literature demonstrates increased uptake and
use of patient portals, as well as
improvements in patient satisfaction,
communication, and outcomes (Ammenwerth
et al. 2012; Goldzweig et al. 2012). With the
advent of mobile technologies, such as
smartphones and tablets, mobile health
(mHealth) has exploded over the past 5 years
(see Chap. 19). While significant work has
been done in the U.S. utilizing this
technology, a significant push of mHealth has
occurred in low- and middle-income countries
because of the ubiquitous nature and low cost
of mobile phones as compared to other forms
of technology. With the increased ability to
record and review daily activities through
mobile technologies, a movement called the
Quantified Self has evolved (Appelboom et al.
2014). The Quantified Self movement, grounded
in the theory of patient engagement, is a
fast-growing practice of self-monitoring
driven by technological advances and
breakthroughs in miniaturization of wearable
and environmental sensors. Personal genomics
first became available in the early 2000s
through direct-to-consumer genetic testing.
Companies like 23andMe, Ancestry.com, and
Pathway Genomics provide the ability to test
one’s own genetic composition at home to
discover ancestry, pharmacogenomic
information, and genetic risk for diseases
like breast cancer. Concerns raised by
regulatory bodies such as the Food and Drug
Administration about false information have
prevented these companies from providing all
information about genetic risk to their
customers (see Chap. 28).

Patient-centered medicine and Personal Health Informatics have become,
and will continue to be, an increasingly important driving force in
medicine. In 1998 the Institute of Medicine (IOM) established a major
program on Quality of Health Care in America and named
patient-centeredness as one of the six aims in its landmark report on
improving the quality of health care, Crossing the Quality Chasm
(Institute of Medicine (US) Committee on Quality of Health Care in
America 2001).
Many studies have demonstrated improvement in care using
patient-centered medicine, including classic medical outcomes (Epstein
and Street Jr 2011), improved shared decision making (Golomb et al.
2007), and reduced unnecessary surgical operations.² Personal Health
Informatics has also demonstrated important improvements in medical care
(Gibbons et al. 2009). A Medline search for “patient-centered care”
returns tens of thousands of articles, only 59 of them from 1950 to
1992. A similar search for “consumer health informatics” OR “personal
health informatics” returns hundreds of articles, only 4 of them from
1950 to 1992. In recognition of the growth of scientific studies in this
domain, in 2008 the MeSH term “Consumer Health Information” was
introduced, defined as “information intended for potential users of
medical and health care services” (Demiris 2016). As people increase
their engagement in their health, and technologies improve their ability
to do so, personal health informatics will become a bigger and more
important part of biomedical informatics and the sub-disciplines within
it.


2 O’Connor A, Stacey D, Rovner D, Holmes-Rovner M, Tetroe J,
Llewellyn-Thomas H, Entwistle V, Tait V, Rostom A, Fiset V, Barry M.
Institute for Healthcare Improvement: Patient decision aids for
balancing the benefits and harms of health care options: A systematic
review and meta-analysis 2018 [cited 2018 June 22]. Available from:
http://www.ihi.org/resources/Pages/Publications/PatientDecisionAidsforBalancingBenefitsHarmsofHealthCareOptions.aspx
Fig. 11.3 Analytic framework showing the interplay between social, cultural, and behavioral features and the opportunities for personal health informatics on outcomes


11.4 Important Concepts 
in Personal Health Informatics 


In this section we review how social and cultural, economic and
financial, educational, language and literacy, environmental, and
behavioral factors influence and mediate health outcomes for patients
and consumers of health care. Figure 11.3 shows a framework for thinking
about how Personal Health Technology can influence outcomes from a
variety of stakeholders’ points of view within a context of social and
cultural factors (Jimison et al. 2008; Keselman et al. 2008).
Interactive consumer health technology applications have had an
increasingly important role in health care. Work based on the IOM’s
Crossing the Quality Chasm report (Institute of Medicine (US) Committee
on Quality of Health Care in America 2001) focused on supporting
self-management by encouraging providers to use education and other
interventions to systematically increase patients’ skills, confidence,
and empowerment in managing their health problems (Holman and Lorig
2004). Two specific initiatives include patient-centered care and
informatics. As described in Sect. 13.2, patient-centered
care aims to inform and involve patients and 
their families in decision making and self-management,
coordinate and integrate care,
provide physical comfort and emotional sup- 
port, understand patients’ concepts of illness 
and their cultural beliefs, and understand and 
apply principles of disease prevention and 
behavioral change appropriate for diverse 
populations. Informatics aims to communi- 
cate, manage knowledge, and support the use 
of information technology for decision mak- 
ing (Jimison et al. 2008). Examples of these 
informatics tools include home monitoring 
systems with interactive disease-management 
or self-management technology, educational 
or decision-aid software that is interactively 
customized to the patient’s needs, online 
patient support groups, tailored interactive 
health reminder systems where interactions 
are linked with electronic health records, and 
patient-physician electronic messaging. These types of tools may be
implemented on a variety of platforms using Web/Internet technology,
touch screen kiosks, mobile phones, or combinations of these. The
locations where individuals access the information also vary widely,
ranging from clinics and hospitals to the home, the workplace, or any
mobile location. However,
many factors relating to how the technology is 
deployed to the consumer or health system 
can influence access, usability, and effective- 
ness. Many studies have demonstrated health 
outcome disparities related to race and ethnicity, income, and
education. With the increasing availability and use of personal health
technologies, we have an opportunity to reduce these disparities with
appropriate targeted design choices. The following sections identify
some of the design challenges.


11.4.1 Health Literacy and Numeracy


Literacy skills play an important role in navi- 
gating the healthcare system, in learning 
about health and medical concerns, and in 
using personal health technologies. Health lit- 
eracy is defined as “the degree to which indi- 
viduals can obtain, process, and understand 
the basic health information and services 
they need to make appropriate health 
decisions” (Berkman et al. 2011). Several skills 
are required for an individual to appropriately 
integrate healthcare information and function 
effectively in the healthcare environment. One 
must be able to understand written material 
(print literacy), understand graphs and 
numerical quantitative information (numer- 
acy), and be able to both speak and listen 
effectively (oral literacy). Low health literacy 
is a significant problem in the United States. 
In 2003, approximately 80 million adults in 
the United States (36 percent) had limited 
health literacy. Certain population subgroups 
have higher rates of limited health literacy. 
For instance, rates are higher among older 
adults, minorities, individuals who have not 
completed high school, adults who spoke a 
language other than English before starting 
school, and people living in poverty. 
Highlighting the health impact of low health 
literacy, a 2004 systematic evidence review 
found a relationship between low health liter- 
acy and poor health outcomes (Berkman et al. 
2011). Specifically, lower health literacy (mea- 
sured by reading skills) was associated with 
lower health-related knowledge and compre- 
hension, higher hospitalization rates, poor 
global health measures, and certain chronic 
diseases. Personal health informatics must address important design
considerations to overcome these literacy issues, such as adaptable
interfaces and intuitive icons and graphics; however, designers need to
give these considerations priority.


11.4.2 Digital Divide 


Concerns about a “digital divide” between the 
“haves” and “have-nots” have long existed, 
mainly focused on economic access to the 
technology. And certainly, access to informa- 
tion technology is now seen as an important 
component to quality health care. Thus, dis- 
parity in access leads to disparity in health 
outcomes. However, there are many potential 
causes of a digital divide in addition to 
income. Studies have shown links to educa- 
tion, ethnicity, gender, urban/rural geography, 
age, and culture (Carr 2007; Ernest III et al. 
2004; Kontos et al. 2014; Mossberger et al. 
2006; Neter and Brainin 2012; Wensheng 
2002). The terms “digital native” and “digital 
immigrant” are often used to characterize 
those generations of people who were born 
into a digital world, versus those who have 
experienced the migration from non-digital to 
digital information. Mobile phone and smartphone access has recently
changed the digital divide trend somewhat, with over 95% of the global
population having access to a mobile phone (Fehske et al. 2011) and a
rapidly growing number in all sectors having access to smartphones.
Interestingly, in the U.S. the rate
of smartphone adoption among Blacks and 
Hispanics outpaces that among Whites. With 
more personal health informatics systems 
using mobile phone interfaces, we may better 
address the digital divide issue in the future. 
The most challenging issues at this point relate 
to available mobile and smartphone band- 
width and education. 


11.4.3 Chronic Conditions 


Approximately 120 million Americans have 
one or more chronic illnesses, accounting for 
70 to 80 percent of healthcare costs. Twenty- 
five percent of Medicare recipients have four 
or more chronic conditions, accounting for 
two thirds of Medicare expenditures
(Hoffman et al. 1996; Wagner 2001). Most 
patients with chronic conditions such as 
hypertension, diabetes, hyperlipidemia, con- 
gestive heart failure, asthma, and depression 
are not treated adequately, and the burden of 
chronic illness is magnified by the fact that 
chronic conditions often occur as comorbidi- 
ties (Bodenheimer et al. 2002c; Wagner et al. 
2001). One key element of systems-oriented 
chronic care models is support of patient self- 
management in the home environment 
(Bodenheimer et al. 2002b). Such 
self-management support can reduce hospi- 
talizations, emergency department use, and 
overall managed care costs (Bodenheimer 
et al. 2002a; Coleman and Newton 2005; 
Lorig et al. 2001; Renders et al. 2001; Whitlock 
et al. 2000). 


11.4.4 Conditions Associated 
with Aging 


A great many elderly persons receiving care 
have functional limitations, such as reduced 
sensory, cognitive, or motor capabilities and 
may require disease management for multiple 
chronic conditions. Although personal health 
informatics has the potential to empower 
patients to become more active in the care 
process, the elderly may be disadvantaged 
unless the designers of both software and 
hardware technology consider their needs 
explicitly (Demiris 2016). Usability and accessibility issues are
important quality criteria for Web-based and mobile interventions (see
Chap. 5) but are often neglected by designers and evaluators (Eysenbach
et al. 2002).


11.4.5 Behavior Management 


Many patients have believed in the concept of 
health prevention through wellness activities 
(active lifestyle development, stress reduction, 
weight control) long before healthcare profes- 
sionals endorsed this mode of self-care. With 
the advent of preventive medicine and data 
supporting the role that wellness activities can 
play in maintaining health, many of these
tenets have become a part of the armamen- 
tarium for disease management and are an 
area of discovery supported by the Agency for 
Healthcare Research and Quality (AHRQ) 
and other Federal institutes. 

A complete discussion of foundational 
models of behavior change is beyond the 
scope of this chapter, but key works are listed 
in Table 11.1. Researchers and educators capitalize on these theories to
reduce risky behaviors (e.g., cigarette smoking, unprotected sexual
intercourse, and unhealthy eating) and to promote desirable health
behaviors, a process referred to as behavior change.

Beginning in 2007, the Robert Wood 
Johnson Foundation, in the Project 
HealthDesign Initiative (Brennan et al. 2007), 
catalyzed the development of personal health 
applications, with the belief that a properly 
developed common platform would be essen- 
tial to the spread of intelligent, interoperable 
and theoretically-based behavior change 
tools. This initiative demonstrated many tools 
that could help consumers with behavior 
change. These demonstrations leveraged the 
widespread adoption of “smart” phone tech- 
nology across geographic and socioeconomic 
divides. This widespread adoption, coupled 
with easy to use software development envi- 
ronments, enables the development of per- 
sonal health applications that operate as 
stand-alone or integrated tools available to 
most consumers. 


11.5 The Impact of Personal Health 
Informatics on Biomedical 
Informatics 


It is almost axiomatic in biomedical informat- 
ics that the introduction of a new discipline or 
information user category into the healthcare 
system induces change. Such is the case with 
the inclusion of person-centered care into bio- 
medical informatics. The unique aspects of 
what distinguishes health system-generated, 
provider-generated, and person-generated 
data from each other permeate all aspects of 
biomedical informatics. These unique characteristics impact both the
development of methods to acquire and store data, as well as
applications of these methods to impact care and discovery. To provide a
frame of reference about the impact of person-centered care on the
field, we will describe the magnitude of change using selected examples
covered elsewhere in the book.


Table 11.1 Models of health behavior change

Name and source: Self-efficacy [Bandura 1977]
Summary: An individual’s impression of one’s own knowledge and skill to
perform any task, based on prior success, physical ability, and outside
sources of persuasion. Predicts the amount of effort a person will
expend to change behavior. It is a key component of other theories, such
as the Theory of Planned Behavior.

Name and source: Social cognitive theory [Bandura 1989]
Summary: Behavior change is determined by personal, environmental and
behavioral elements, which are interdependent.

Name and source: Theory of planned behavior [Ajzen 1985]
Summary: A link between attitudes and behavior. It asserts that
behaviors viewed positively and supported by others (subjective norm)
are more likely to have higher levels of motivation and more likely to
be performed.

Name and source: Transtheoretical/stages of change model [Prochaska 2005]
Summary: This model asserts that behavioral change is a 5-step process,
between which a person may oscillate before achieving complete change.

Name and source: Patient engagement/patient activation [Dentzer 2013;
Hibbard 2004]
Summary: Patient engagement describes the actions one can take to
achieve maximum benefits from the healthcare services available. Patient
activation is a person’s knowledge, confidence, and skills used to
manage their health. Improved engagement and activation have been
associated with improved healthcare outcomes.

Early games targeted at behavior change (described above) attempted to remove the stigma (attitude) associated
with engaging in healthy behaviors (such as taking medications to combat a chronic illness)


11.5.1 Data Science 


The field of data science has been impacted 
greatly by personal health informatics meth- 
ods and advances. This discipline, which is 
focused on the use of scientific processes, 
algorithms and systems to extract knowledge 
and insights from both structured and 
unstructured data, addresses a number of 
concerns inherent in so-called “big data” 
(Provost and Fawcett 2013). These data come in large amounts (volume),
often at high speeds of arrival (velocity), in varying formats
(variety), with varying timing (variability), unclear accuracy/validity
(veracity), and significant risks to patient privacy (vulnerability).
Data collected by patients who may be
less familiar with healthcare terminology add 
a layer of complexity to the variety, veracity, 
and vulnerability challenges inherent in these 
data. Data collected using commonly avail- 
able sensor technologies add to the velocity 
and volume challenges, but also present new 
opportunities. These and other new challenges imposed by the addition of
personal data into the healthcare/discovery system are new opportunities
for personal health informatics research.³


3 Office of the National Coordinator for Health Information Technology.
Conceptualizing a Data Infrastructure for the Capture, Use, and Sharing
of Patient-Generated Health Data in Care Delivery and Research through
2024, 2018. Available from:
https://www.healthit.gov/sites/default/files/onc_pghd_final_white_paper.pdf


11.5.2 Precision Medicine


Perhaps no area has received greater attention in the last 10 years than
precision medicine. The goal of precision medicine, as noted in the
mission statement of the NIH’s All of
Us initiative, is “to enable a new era of medicine through research,
technology and policies that empower patients, researchers and providers
to work together toward development of individualized treatments” (All
of Us Research Program Investigators 2019)
(https://obamawhitehouse.archives.gov/precision-medicine) (see Chap.
28). Patient
empowerment in this new era of medicine will 
take many forms, and will question how we 
help patients with concepts such as contribut- 
ing personal data to empower discovery, under- 
standing a radically new way to subcategorize 
diseases according to patient-specific attributes, 
learning a new “trust” model to understand 
why two patients with the same disease may 
have very different treatment plans, and even 
recognizing an expanding role for consum- 
ers in establishing and critiquing health policy 
(Adams and Petersen 2016; Juengst et al. 2012). 


11.5.3 Ethical, Legal and Social 
Issues 


As our understanding of the implications of adopting a person-centered
care philosophy matures, so do the ways in which this philosophy should
govern the treatment of the person and her data and information. This
philosophy introduces new ethical, legal and social issues that deserve
thought to maximize the effective and safe use of these data and
applications (see Chap. 12). For example, concerns raised by access to
decision-making applications became the impetus for thoughtful
discussion about the timing of FDA regulation of mHealth apps (Lye et
al. 2018).

One of the most exciting, though potentially
alarming, consequences of our extensive
use of the Web for shopping, communicating, 
and learning is that each of us leaves behind a 
profile of who we are, what we like or dislike, 
what we know or don’t know, and what we 
want or already have. When combined with 
data mining and natural language processing 
techniques, it is possible to create highly tar- 
geted and predictive personal knowledge. 
Search engines regularly exploit this opportunity
to create a profile of each searcher and to
improve the relevance of retrieved results 
(Frisse 1996). We can expect the use of these 
data to impact how medical care is personal- 
ized. Data created by consumers, coupled 
with ubiquitous computing, might provide 
just-in-time nutritional consults, over the 
counter medication advice, or advice that 
might prevent illnesses, such as convenient 
locations to receive a flu vaccine or when to 
begin medications for seasonal allergies (Guo 
et al. 2016; Pellegrini et al. 2018; Swendeman 
et al. 2015). 

This technology, which is likely to improve 
the user experience and functionality available 
with consumer-facing technologies, also has 
significant downside risks to patient privacy. 
In particular, data from email, internet 
searches, support group chats, genome risk 
prediction sites, and mobile or cloud-based 
apps all may be directly identifiable, or may be 
combined with data from other sources to be 
identifiable. Together, they may be used to cre- 
ate a profile of an individual, his or her health 
status, the health status of related individuals, 
or other profiles, all of which may be useful 
for targeted advertising, insurance risk, 
employability, or other purposes. Biomedical 
informatics research in the person-centered 
care era focuses on such topics as the bound- 
aries of HIPAA protection, genomic privacy, 
and re-identification risk. 
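
The re-identification concern described above can be made concrete with a small sketch. The example below is an illustration only, not a method described in this chapter: it uses the general idea of k-anonymity, counting how many records share the same combination of quasi-identifiers. The field names (`zip3`, `birth_year`, `sex`) and the records are hypothetical, made-up values.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    quasi-identifier values (the k in k-anonymity); 0 if there are no records."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values()) if groups else 0


# Hypothetical, made-up records; no direct identifiers such as names are present.
records = [
    {"zip3": "372", "birth_year": 1950, "sex": "F", "condition": "diabetes"},
    {"zip3": "372", "birth_year": 1950, "sex": "F", "condition": "asthma"},
    {"zip3": "372", "birth_year": 1982, "sex": "M", "condition": "depression"},
]

# A result of 1 means at least one person is unique on these fields and could
# be singled out if the same fields appear in another data source.
k = smallest_group_size(records, ["zip3", "birth_year", "sex"])
print(k)
```

In this sketch, a smaller k indicates a higher risk that a record can be linked to an individual when combined with outside data; the choice of which fields count as quasi-identifiers is itself an assumption that depends on the data sources available to an adversary.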

A digital divide can lead to serious ethical 
issues. If technological interventions are only 
available or usable to a segment of the popula- 
tion, this imbalance can threaten the impact 
of these technologies on improving healthcare 
for all and increase healthcare disparities. 
Certain interventions, such as mHealth and 
the Internet of Things technologies, require a 
certain digital literacy. Installing and main- 
taining these devices with the tremendous 
amount of data they generate can be daunting 
to populations with lower literacy and numer- 
acy as well as those with low proficiency levels 
for problem solving in technology-rich envi- 
ronments. Also, mobile phones made by dif- 
ferent companies can have different 
capabilities. At the present time, Android 
phones are widely used by people with lower 
socioeconomic status. If Apple iOS apps are shown to be more effective
than Android apps, limited access to them may widen disparities in the
effect of these interventions for many people. Finally, many mHealth
apps are available only in English, which further widens the digital
divide.

While the direction that personal health
informatics will take in the future is, at best,
educated speculation, it is clear that as long as
patient-provider partnerships are endorsed, 
technology will be a third partner in ensuring 
that activated people manage their health and 
disease effectively. 


11.5.4 Communication 


Inherent in all aspects of information management
is recognition of the audience
to whom information is being
communicated. Perhaps no field has a larger
gap between what is understood by its pro- 
fessionals and its consumers than health 
care. Numerous studies have demonstrated 
issues that persist with the age-old challenge 
of how to educate people about diseases. 
These challenges are magnified as the scope 
of consumer engagement broadens to include 
many of the topics listed above, and as the
use of data and evidence for care enters everyday
discourse with patients (McCormack
et al. 2013). Furthermore, the separation of 
information communication to and from 
medical professionals creates opportunities 
for misunderstanding at best, and inappropriate
actions being taken by patients at
worst (Isaacs and Creinin 2003; Morgan
2013). This is an area ripe for research and 
evaluation by biomedical informatics profes- 
sionals, much of which is already underway 
in areas such as methods to circumvent liter- 
acy and numeracy challenges. It is clear that 
there is a correlation between health literacy 
and quality of life (Zheng et al. 2018), but 
also clear that more research needs to be 
done to understand how to communicate in 
the face of this reality (Newnham et al. 2017; 
Fisher et al. 2016). 


11.5.5 Mobile Health Care 
(mHealth) 


Perhaps the most significant change in the 
landscape of personal health informatics has 
been the adoption of “smartphone” technol- 
ogy into society. Smartphones are mobile 
phones that perform many functions found 
on present-day computers. They typically 
contain a touch screen interface, camera, 
Internet access, short-range wireless intercon- 
nection technology, and an operating system 
capable of executing downloaded applica- 
tions. Smartphone ownership has grown 
worldwide, with an estimated 81% ownership 
by adults in the United States.⁴ When combined with a new generation of
wireless or connected peripheral technologies (imaging tools, wearable
sensors, monitoring systems, etc.), downloadable applications (apps)
have revolutionized information collection and use by people, and have
defined a new field called mobile health care (mHealth) (see Chap. 19)
(Cameron et al. 2017).

One of the main byproducts of the 
mHealth era has been a radical improvement 
in consumer empowerment, coupled with 
information sharing that enables individual 
groups to make “informed” decisions about 
their care with or without the assistance of a 
healthcare professional. With these new capa- 
bilities come enormous opportunities for bio- 
medical informatics to influence the entire 
healthcare system. Terms such as “quantified 
self” (Dudley et al. 2015) and “Internet of 
Things” (Dimitrov 2016) begin to characterize 
the potential of mHealth. 

It is through the use of mHealth applica- 
tions that the notion of personal health infor- 
matics has grown beyond the individual’s 
clinical needs to population-level care needs. 


4 Pew Research Center. Smartphone ownership is growing rapidly around
the world, but not always equally. Available from:
https://www.pewresearch.org/global/2019/02/05/smartphone-ownership-is-growing-rapidly-around-the-world-but-not-always-equally/.
Accessed November 9, 2019.
Indeed, innovations such as Apple’s ResearchKit and the explosion of
wearable fitness trackers connected to social networks are designed for
smartphones and mHealth technologies. These technologies also are being
positioned to improve the structure of healthcare delivery, through
innovations such
as appointment self-scheduling, direct-to- 
consumer e-consults, and peripheral devices 
that make home diagnoses commonplace 
(Topol 2015; Kawano et al. 2012). 

Like any foundational change in biomedi- 
cal informatics, the advent of mHealth creates 
new paradigms for concepts such as usability 
and usefulness. Unique characteristics of peo- 
ple (literacy, numeracy, language differences) 
must be kept in mind and considered, along 
with the capabilities of people and their liv- 
ing, working, and social environments. 
Research in user-centered design, usability 
assessment, failure modes and effects analysis, 
and other techniques to assure safe and effec- 
tive use are increasingly critical to advances in 
mHealth (see Chap. 5) (Overdijkink et al.
2018; Matthew-Maich et al. 2016). 

Another significant challenge for mHealth 
is integration into clinical workflows. If healthcare providers are not
prescribing mHealth apps or using their data, patients may be less
likely to use them because they cannot engage their provider in shared
decision making. In other personal health informatics
tools, such as patient portals, adoption and 
promotion of patient portal usage by provid- 
ers leads to increased usage by patients 
(Cronin et al. 2015). To realize the potential of mHealth in the future,
it will be critical to improve the usefulness of the vast amount of data
generated by mHealth, aid providers and patients in choosing and using
mHealth apps, and determine the appropriate touch points of these
interventions between providers and patients.


11.5.6 Social Network Systems 


Social network systems, epitomized by 
Facebook (www.facebook.com), are online
virtual communities where participants 
describe themselves with member-entered 
attributes, establish or break connections to
other members, communicate, and share 
information. This simple strategy creates a 
virtually unlimited method to connect similar 
people to one another, and has been shown to 
be an effective tool to connect people with 
specific health needs (Moorhead et al. 2013). 
The for-profit online health-related social net- 
working community Patients Like Me has 
demonstrated that individuals with a severe 
chronic disease—amyotrophic lateral sclero- 
sis—are highly willing, even without compen- 
sation, to contribute data and observations to 
a patient community (Frost and Massagli 
2008) to accelerate learning about their dis- 
ease. The site has no ties to the conventional 
healthcare system and short-circuits the tradi- 
tional research enterprise, rewarding partici- 
pants, not just researchers, with knowledge. 
The patient outcomes of diverse therapies are 
collected using crowd sourcing, where patients 
contribute their information to a common 
database that can be queried to obtain sum- 
maries of an aggregated experience of their 
peers. 
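
The crowdsourcing idea described above, in which patients contribute to a common database that can be queried for summaries of their peers' aggregated experience, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the record fields (`therapy`, `outcome_score`) and the values are hypothetical and do not reflect the actual data model of Patients Like Me or any other site.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_therapy(reports):
    """Group patient-contributed reports by therapy and return, for each
    therapy, the number of reports and the mean self-reported outcome score."""
    scores = defaultdict(list)
    for report in reports:
        scores[report["therapy"]].append(report["outcome_score"])
    return {
        therapy: {"n": len(values), "mean_score": mean(values)}
        for therapy, values in scores.items()
    }


# Hypothetical, made-up reports contributed by community members.
reports = [
    {"member": "m1", "therapy": "therapy_a", "outcome_score": 2},
    {"member": "m2", "therapy": "therapy_a", "outcome_score": 4},
    {"member": "m3", "therapy": "therapy_b", "outcome_score": 5},
]

summary = summarize_by_therapy(reports)
for therapy, stats in summary.items():
    print(therapy, stats)
```

A summary like this reflects self-reported, self-selected data, so it should be read as a description of the contributing community's experience rather than as evidence of treatment effect.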

Social networking Web sites share most or 
all of the features of electronic support 
groups, and even some data commonly provided through a portal (by
creating an affiliation with a group that externalizes public or private
information). Social networking
platforms combined with personal health 
records provide a means for social network 
members to share and aggregate data obtained 
from the traditional health system, and to do 
so in a private manner (Eysenbach 2008; 
Weitzman et al. 2011a). 

One of the features of health-related 
online social networks is the rapid dissemina- 
tion of information across a network; how- 
ever, there is great variability in the quality of 
discourse on health-related social networking 
sites. Conversations may be moderated, in cer- 
tain cases by a health coach (Jimison et al. 
2007). Conversations also may be unmoder- 
ated and commercial influences may enter the 
discourse without transparency. There are 
also concerns around privacy. Compared with 
the restrictive institutional consents and com- 
pacts with patients that limit use of data and 
specimens under federal regulations applicable
to much federally sponsored research,
online social networks are generally governed 
by no more than a terms of use statement, 
often subject to change without notice in 
30 days. These privacy policies may be diffi- 
cult to find and not written in language acces- 
sible by a population with a broad range of 
health literacy (Weitzman et al. 2011b). 
Industry standards governing safety and pri- 
vacy of online health-related social network- 
ing are yet to emerge. 


11.5.7 Application Example: EHR 
Portals 


As the electronic health record gains accep- 
tance, its relevance to individual people also 
grows. Many hospitals and clinics have begun 
providing direct patient access to the clinical 
record. These portals are defined as person-facing systems tethered to
electronic health records that give people views of clinical or claims
data in institutional electronic health record systems or payer systems
(Tang et al.
2006; Kim and Johnson 2002). Portals pro- 
vide motivated people with a way to electron- 
ically access sections of their records to recall 
salient instructions or obtain results of tests. 
Some of the first such personal health portals 
were Columbia’s PatCIS system (Cimino 
et al. 2002) and Beth Israel Deaconess’s 
PatientSite, developed in 1999 (Weingart 
et al. 2006). Two of the most widely deployed 
portals are Epic’s MyChart (Serrato et al. 
2007) and MyHealtheVet (Nazi and Woods 
2008). Many of these portals provide capa- 
bilities besides simply viewing EHR informa- 
tion, such as secure physician-patient 
messaging, appointment scheduling, provid- 
ing educational information, and viewing and 
managing medical bills. One of the more 
recent additions to this set of capabilities is 
the OpenNotes effort started by 
MyHealtheVet, which exposes every progress 
note to patients, instead of exposing only dis- 
charge summaries. As new data types are pro- 
vided to the healthcare system by people in 
support of their health, we can expect the 
data types exposed through EHR portals to
change, along with new EHR portal
application capabilities.


11.5.8 Application Example:
Personal Health Records 


According to the Markle Foundation, a per- 
sonal health record (PHR) is “an electronic 
application through which individuals can 
access, manage and share their health infor- 
mation, and that of others for whom they are 
authorized, in a private, secure, and confiden-
tial environment” (Connecting for Health
Personal Health Working Group 2003). Like
the EHR portal, the PHR has become the 
foundation for developing tools to store data 
and to facilitate its reuse in ways people find 
engaging. 

The idea of a personal repository for med- 
ical information is far from new. Families with 
infants have used an immunization record 
book for decades. The immunization blue 
book is a quintessential, efficient system with 
portable information that supports entry by 
multiple providers and storage by the patient. 
Clayton Christensen, who invented the con- 
cept of “disruptive innovation,” summarizes 
the widely held promise of this technology in 
his book, The Innovator's Prescription: A
Disruptive Solution for Health Care
(Christensen et al. 2009): “We cannot over-
state how important PHRs are to the efficient
functioning of a low-cost, high quality health- 
care system.” PHRs enable users to acquire 
copies of their data from every site of care. In 
some ways, this model advances information 
flow far more than models requiring inter- 
institutional data sharing agreements. Data 
from two competing healthcare networks may 
reside in the same PHR without cumbersome 
agreements between those two networks. The 
patient asserts her claim to the data for each 
network independently. This model of data 
aggregation may promote data liquidity far 
more than competing approaches, such as 
health information exchanges, which require 
centralized management of data sharing 
agreements between networks and institutions 
(Adler-Milstein et al. 2008). 


In part fueled by the knowledge gained 
during the Robert Wood Johnson 
Foundation’s Project HealthDesign, various 
companies in the early 2000s developed com-
mercially available personal health records.
While some of these remain viable, many 
from that era were discontinued, largely due 
to the complex nature of establishing interop- 
erability with external data sources, as well as 
unsustainable financial models.⁵ Recently,
however, as data liquidity and standards pro- 
moting interoperability have been mandated 
through Federal legislation, there has been a 
resurgence of activity to create PHRs from 
both small start-up companies and large 
EHR vendors. The future is still uncertain, 
but this suite of applications continues to be a
likely foundation for more sophisticated ser- 
vices and applications used by people in sup- 
port of their health or illness management 
(Staccini et al. 2018). 


11.5.9 Application Example: 
Sensors for Home 
Monitoring and Tailored 
Health Interventions 


With rapidly advancing technologies, sensors 
that measure and monitor are everywhere. We 
see an increasing population interested in 
monitoring their activity levels with wrist 
devices that now measure movement (con- 
verted into steps or calories burned), heart 
rate, and electrodermal activity (for stress 
level). There are wireless weight scales that are 
useful for weight management of everyone, 
and for fluid management of heart failure 
patients. Even smartphones have a myriad of 
embedded sensors useful for managing one’s 
health (e.g., GPS for location context, motion 
for activity level, light and noise level for sleep
management, and voice statistics for mood
management).

5 Google Official Blog. An update on Google Health and Google PowerMeter. Available at: https://googleblog.blogspot.com/2011/06/update-on-google-health-and-google.html. Accessed November 9, 2019.

In health care there is an increasing need 
to manage chronic conditions more effectively 
by empowering patients and family caregivers 
with more active roles in self management. 
Sensors in the home and environment (includ- 
ing wearables) provide important input to 
algorithms that infer patient state and deliver 
tailored feedback and motivational messaging. 
Figure 11.4 shows patient-generated sensor
data being aggregated in a local device (typi-
cally a smartphone) and transferred with
strict security protocols to a secure server. The
inference algorithms then generate, in real
time, messaging and summary content for the
patient, as well as a health coach, clinician,
and remote or local family caregiver (Pavel
et al. 2015).
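
The flow described above (sensor readings aggregated, then passed to an inference step that produces tailored messaging) can be sketched in a few lines of Python. This is a minimal illustration only: the `Reading` type, the `infer_weight_alert` function, and the default threshold of a 2 kg gain within 3 days are hypothetical names and values chosen for the example. They are not clinical guidance and are not taken from Pavel et al. (2015).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reading:
    """One patient-generated measurement (hypothetical structure)."""
    day: int          # day index since monitoring began
    weight_kg: float  # value reported by a wireless scale


def infer_weight_alert(readings: List[Reading],
                       window_days: int = 3,
                       gain_kg: float = 2.0) -> Optional[str]:
    """Return a message if weight rose by at least gain_kg within window_days.

    The default thresholds are illustrative assumptions, not clinical guidance.
    """
    ordered = sorted(readings, key=lambda r: r.day)
    if not ordered:
        return None
    latest = ordered[-1]
    # Compare the latest reading with every earlier reading inside the window.
    for earlier in ordered[:-1]:
        elapsed = latest.day - earlier.day
        gain = latest.weight_kg - earlier.weight_kg
        if elapsed <= window_days and gain >= gain_kg:
            return (f"Weight is up {gain:.1f} kg in {elapsed} days; "
                    "please contact your care team.")
    return None
```

In a deployed system the same message would typically also be routed to the health coach, clinician, or family caregiver, each with content suited to that role.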

For disease management interventions 
typical sensors include wireless weight scales 
for fluid management, wireless blood pressure 
cuffs for cardiac disease, blood glucose meters 
for patients with diabetes, and peak flow 
meters for those with asthma. Additionally, 
many disease management protocols include 
weight management, physical exercise and 
medication management. Figure 11.5a-c
show examples of sensor technology used in
home health settings. Motion sensors, as
shown in Fig. 11.5a, can be used to deter-
mine real-time patient location for inferring
context as well as for measuring walking speed 
statistics (a useful cognitive indicator) (Hagler 
et al. 2010). Sleep quality can now be mea- 
sured with varying accuracy with techniques 
ranging from accelerometers in wrist fitness
trackers to pressure sensitive bed strips placed 
under the mattress that detect heart rate, heart 
rate variability (for stress recovery at night), 
respiration, as well as total sleep time and 
sleep efficiency. Another more complex 
approach to sensing in the home involves 
imaging, as shown with the interactive video 
exercise in Fig. 11.5c. In this case, data
from the Kinect camera is used to detect 
movement compared to goal state and pro- 
vide just-in-time feedback to the user (Jimison 
et al. 2015; Obdrzalek et al. 2012; Ofli et al.
2016). Inferences on strength, flexibility, bal-
ance and endurance can be monitored over
time and provided to both the user and clini-
cian. Interactive voice messaging systems
(e.g., Amazon Echo or Google Home) have an
important role in home health interventions,
both as communication devices and as
sensors of voice affect. Finally, interactions
with computers, tablets and smartphones pro-
vide valuable information on cognitive perfor-
mance, both directly through adaptive
cognitive computer games (Hagler et al. 2014;
Jimison et al. 2010) and through indirect
measurements over time of motor speed,
search time and cognitive load (Hagler et al.
2011, 2014).

Fig. 11.4 This diagram shows how sensor data from the home or environment, generated by a patient who may have a chronic condition or an individual with an interest in improving health, can be used as input to a coaching platform to provide tailored motivation and feedback. (Diagram labels: Individual, Sensors, Continuous Monitoring, Inference Algorithms, Health Coaching Platform, Caregiver)

Fig. 11.5 This series of images shows examples of sensors and technology used to gain information about an individual’s state and provide tailored just-in-time feedback. a shows a motion sensor near the door, sensors on a smart cane, a presence lamp, wireless blood pressure cuff, and an Amazon Echo. b shows an individual interacting with a nurse care manager using a remote controlled Double Robot. c shows in-home chair exercises with real-time feedback using data from the Kinect camera. (With permission from the Consortium on Technology for Proactive Care at Northeastern University. Photo courtesy of Dr. Holly Jimison)
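
Two of the sensor-derived measures mentioned above, walking speed and sleep efficiency, reduce to simple arithmetic once the raw events are available. The sketch below assumes two motion sensors a known distance apart and a bed sensor that reports total sleep time and time in bed. The function names and input formats are hypothetical, and the calculation is a simplified illustration rather than the method used in the cited studies.

```python
from statistics import median
from typing import List, Tuple


def walking_speeds(transits: List[Tuple[float, float]],
                   sensor_spacing_m: float) -> List[float]:
    """Speed in m/s for each (enter_time_s, exit_time_s) pass between two sensors."""
    speeds = []
    for enter_s, exit_s in transits:
        duration = exit_s - enter_s
        if duration > 0:  # skip out-of-order or duplicate sensor firings
            speeds.append(sensor_spacing_m / duration)
    return speeds


def sleep_efficiency(total_sleep_min: float, time_in_bed_min: float) -> float:
    """Fraction of time in bed spent asleep (0.0 to 1.0)."""
    if time_in_bed_min <= 0:
        raise ValueError("time in bed must be positive")
    return total_sleep_min / time_in_bed_min


# Example: summarize a day's passes with the median, which resists outliers.
day_speeds = walking_speeds([(0.0, 2.0), (10.0, 14.0)], sensor_spacing_m=2.0)
typical_speed = median(day_speeds)
```

Tracking a robust summary such as the daily median over weeks, rather than reacting to single readings, is one way to follow the slow trends that make walking speed useful as an indicator.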

The streaming data from a variety of sen- 
sors in the home and environment can be 
overwhelming from a clinical perspective. 
Figure 11.6 shows a sample phone interface
for coordinating information from a variety
of sensors (Williamson 2015). This example
shows a summary main screen for a patient or
caregiver with feedback on calendar to-do’s,
adherence to goals, medication taking (an
issue in taking medications is noted by a red
“X”), level of socialization, cognitive function,
and sleep quality (a soft warning is noted by
an orange “!”). In this case, clicking on an
icon opens a screen with further detail. The
main goal of sensor-based systems for health
is to facilitate health management and
adherence to an individual’s health goals
using known principles of health behavior
change. This type of technology enables a
scalable and potentially cost-effective
approach to providing continuity of care. It
draws on an often untapped resource: the
participation of both patients and family
caregivers as part of the care team.

Fig. 11.6 This example shows a main screen with summary information for a patient or caregiver. (Reprinted with permission of author, S. Williamson)
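
A summary screen of this kind has to collapse each stream of sensor data into a small set of display states. The following sketch shows one simple way to do so, assuming each domain has already been reduced to an adherence fraction between 0 and 1. The `Status` values, the `summarize` function, and both cut-offs are hypothetical and are not taken from the interface shown in the figure.

```python
from enum import Enum
from typing import Dict


class Status(Enum):
    OK = "ok"
    WARNING = "!"  # soft warning (orange in the example screen)
    ISSUE = "X"    # issue needing attention (red in the example screen)


def summarize(adherence: Dict[str, float],
              warn_below: float = 0.8,
              issue_below: float = 0.5) -> Dict[str, Status]:
    """Map each domain's adherence fraction (0.0 to 1.0) to a display status.

    Both cut-offs are illustrative assumptions.
    """
    summary = {}
    for domain, fraction in adherence.items():
        if fraction < issue_below:
            summary[domain] = Status.ISSUE
        elif fraction < warn_below:
            summary[domain] = Status.WARNING
        else:
            summary[domain] = Status.OK
    return summary
```

Keeping the mapping in one small function makes the cut-offs easy to tailor per person, which matters when the same screen serves patients with very different goals.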


11.6 Future Opportunities 
and Challenges 


As is exemplified by the previous section, the 
opportunities for personal health informatics 
to improve health outcomes are plentiful. 
Worldwide, access to mobile phones
is becoming nearly ubiquitous, and the afford- 
ability of health sensors and devices for 
continuous monitoring and just-in-time inter- 
vention is also improving rapidly. However, 
we also see upcoming challenges in the areas 
of payment models and equity. 


11.6.1 Reimbursement
and Business Models


Many countries have global budgets for health 
care, usually managed at a regional level, 
where it is possible to allocate funds for cost- 
effective health interventions that personal 
health technologies may enable. However, 
medical care reimbursement in the United 
States is moving slowly towards value-based 
care. Healthcare systems in the U.S. require 
incentives and fairly short-term business 
model demonstrations to modify their work- 
flow and hiring practices for a new model of 
value-based care. This model of care would 
bring healthcare consumers and family mem-
bers in as integral members of the care team,
facilitated by personal health technologies. 


11.6.2 Opportunities 
for Innovation 


As health care moves from being clinic-centric 
and hospital-centric to person-centric and 
more proactive, there are many opportunities 
for new advances in personal health informat- 
ics to facilitate this change and improve health 
outcomes. As mentioned earlier in this chap- 
ter, advances in the assessment of person state 
through new always-on sensors and improved 
computational modeling will allow more tai- 
lored and timely messaging and interventions. 

Virtual Reality and Augmented Reality 
are important innovations that can transform 
the way that individuals, especially older 
adults, are cared for. Artificial Intelligence
innovations could lead to more tailored
messages for a person’s health and wellness
and could help overcome barriers, such as
forgetting to take medications, by targeting
cues to improve care. Finally, fusing the
information from sensors could allow for
improved assessment of people and their
health.

Many of the innovations, however, will 
need to be social and protocol-based. For 
example, new workflow and hiring practices 
will be needed to compensate for the data and 
information these innovations will create. An 
increased emphasis on proactive person- 
centered care to improve outcomes and reduce 
costs will necessitate better use of community 
health workers and health behavior change 
coaches that interface with both the patient 
and the clinical team. One of the most excit-
ing, though potentially alarming, consequences
of our extensive use of the Web for shopping, 
communicating, and learning is that each of 
us leaves behind a profile of who we are, what 
we like or dislike, what we know or don’t know, 
and what we want or already have. When com- 
bined with data mining and natural language 
processing techniques, it is possible to create 
highly targeted and predictive personal knowl- 
edge. Data created by consumers, coupled 
with ubiquitous computing, might provide 
just-in-time nutritional consults, over the 
counter medication advice, or advice that 
might prevent illnesses, such as convenient 
locations to receive a flu vaccine or when to 
begin medications for seasonal allergies. We 
can expect the use of these massive data sets 
(also called “big data”) to impact how medical 
care is personalized. While the direction that 
consumer health informatics will take in the 
future is, at best, educated speculation, it is
clear that as long as patient-provider partner- 
ships are endorsed, technology will be a third 
partner in ensuring that activated consumers 
manage their health and disease effectively. 


Suggested Readings

Berwick, D. M. (2009). What ‘patient-centered’ 
should mean: confessions of an extremist. 
Health Affairs, 28(4), w555-w565. This paper
is written by Dr. Donald Berwick, an influ-
ential proponent of patient-centered health
care and former administrator of the Centers 
for Medicare and Medicaid Services (CMS). 
The paper describes a number of maxims of 
patient-centered care. 


Office of the Assistant Secretary for Planning and 
Evaluation, U.S. Department of Health & 
Human Services. Conceptualizing a data 
infrastructure for the capture and use of 
patient-generated health data. Available at: 
https://aspe.hhs.gov/conceptualizing-data- 
infrastructure-capture-and-use-patient-gener- 
ated-health-data. Accessed 1 Nov 2019. This 
paper describes a project sponsored by the 
Office of the National Coordinator for Health 
Information Technology (ONC) regarding a 
data infrastructure for patients to share their 
data with caregivers, providers, researchers, 
and others according to their preferences. 

Prey, J. E., Woollen, J., Wilcox, L., Sackeim, 
A. D., Hripcsak, G., Bakken, S., et al. (2014). 
Patient engagement in the inpatient setting: A 
systematic review. Journal of the American 
Medical Informatics Association, 21(4), 742- 
750. This paper reviews literature involving 
patient engagement in the hospital setting. 
The authors identify challenges such as incon-
sistent use of terminology and gaps in
knowledge regarding impact on health out- 
comes and cost-effectiveness. 

Topol, E. J. (2015). The patient will see you now: 
The future of medicine is in your hands. 
New York: Basic Books. This popular book 
describes the author’s vision for medicine 
based on patient-centered health care, mobile 
health, and consumer health informatics. 


Questions for Discussion

1. What is the role of the health system in 
monitoring the quality of discourse on 
online social networks? 

2. What is the optimal model for personal 
health records? Should personal health 
records display advertisements? 

3. Which populations of consumers would be 
most likely to use personal health records? 

4. Which consumer technologies do you 
think will be most influential in 
consumer-focused health informatics? 

5. What is the right balance between 
privacy of personal health information 
and ready access to it? For example, for 
an unconscious patient in the 
emergency department? 
Ethics in Biomedical and Health Informatics: Users, Standards, and Outcomes


Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• Why is ethics important to informatics?
• What are the leading ethical issues that arise in health care informatics?
• What are examples of appropriate and inappropriate uses and users for health-related software?
• Why does the establishment of standards touch on ethical issues?
• Why does system evaluation involve ethical issues?
• What challenges does informatics pose for patient and provider confidentiality?
• How can the tension between the obligation to protect confidentiality and that to share data be minimized?
• How might computational health care alter the traditional provider-patient relationship?
• What ethical issues arise at the intersection of informatics and managed care?
• What are the leading ethical and legal issues in the debate over governmental regulation of health care computing tools?


12.1 Ethical Issues in Biomedical
and Health Informatics


» More and more the tendency is towards the 
use of mechanical aids to diagnosis; never- 
theless, the five senses of the doctor do still, 
and must always, play the preponderating 
part in the examination of the sick patient. 
Careful observation can never be replaced 
by the tests of the laboratory. The good phy- 
sician now or in the future will never be a 
diagnostic robot. — Scottish surgeon Sir 
William Arbuthnot-Lane (Lane 1936) 


Human values should govern research and 
practice in the health professions. Health 
informatics, like other health professions, 
encompasses issues of appropriate and inap- 
propriate behavior, of honorable and disrepu- 
table actions and intentions, of right and 
wrong. Students and practitioners of the 
health sciences, including informatics, share 
an important obligation to explore the moral
underpinnings, ethical challenges and social 
issues related to their research and practice. 

Ethical questions in medicine, nursing, 
human subjects research, psychology, social 
work, and affiliated fields continue to evolve 
and increase in number; nevertheless, the key 
issues are generally well known. Major ques- 
tions in general bioethics have long been 
addressed in numerous professional, schol- 
arly, and educational contexts. Ethical issues 
in health informatics are, for the most part, 
less familiar, even though certain of them 
have received attention for decades (Szolovits 
and Pauker 1979; Miller et al. 1985a; de 
Dombal 1987). Indeed, informatics now con- 
stitutes a source of some of the most impor- 
tant and interesting ethical debates in all the 
health professions. It has even been suggested 
that biomedical informatics raises so many 
such issues it could itself be used as the basis 
for a bioethics curriculum (Goodman 2017). 

People often assume that the confidential- 
ity of electronically stored patient informa- 
tion is the most important ethical issue in 
informatics. Although confidentiality and pri- 
vacy are indeed of vital interest and signifi- 
cant concern, the field is rich with other ethical 
issues, including the appropriate selection and 
use of informatics tools in clinical settings; the 
determination of who should use such tools; 
the role of system evaluation; the obligations 
of system developers, maintainers, and ven- 
dors; the appropriate standards for interact- 
ing with industry; and the use of computers to 
track clinical outcomes to guide future prac- 
tice. In addition, informatics engenders many 
important legal and regulatory questions. 

To consider ethical issues in health care 
informatics is to explore a significant intersec- 
tion among several professions—health care 
informatics per se, health care delivery and 
administration, applied computing and sys- 
tems engineering, and ethics—each of which 
constitutes a vast field of inquiry. Fortunately, 
growing interest in bioethics and computation- 
related ethics has produced a starting point 
for such exploration. An initial ensemble of 
guiding principles, or ethical criteria, has 
emerged to orient decision making in health 
care informatics. These criteria are of practi-
cal utility to health informatics, and often
have broader implications for all of biomedi- 
cal informatics. 


12.2 Health-Informatics 
Applications: Appropriate 
Use, Users, and Contexts 


Application of computer-based technologies 
in the health professions can build on previ- 
ous experience in adopting other devices, 
tools, and methods. Before clinicians perform 
most health-related interventions (e.g., diag- 
nostic testing, prescription of medications, 
surgical and other therapeutic procedures), 
they generally evaluate appropriate evidence, 
standards, available technologies, presupposi- 
tions, and values. Indeed, the very evolution 
of the health professions entails the evolution 
of evidence, of standards, of available tech- 
nologies, of presuppositions, and of values. 
To answer the clinical question, “What 
should be done in this case?” one must pay 
attention to a number of subsidiary questions, 
such as: 
1. What is the problem? 
2. What resources are available and what am 
I competent to do? 
3. What will maintain or improve this 
patient’s care? 
4. What will otherwise produce the most 
desirable results (e.g., in public health)? 
5. How strong are my beliefs in the accuracy 
of my answers to questions 1 through 4?


Similar considerations determine the appro-
priate use of informatics tools.


12.2.1 The Standard View
of Appropriate Use


Excitement and enthusiasm often accompany 
initial use of new tools in clinical settings. 
Negative emotions are also common (Sittig 
et al. 2005). Based on the uncertainties that 
surround any new technology, scientific evi- 
dence counsels caution and prudence. As in 
other clinical areas, evidence and reason
determine the appropriate level of caution.
For instance, there is considerable evidence 
that electronic laboratory information sys- 
tems improve access to clinical data when 
compared with manual, paper-based test- 
result distribution methods. To the extent that 
such systems improve care at an acceptable 
cost in time and money, there is an obligation 
to use computers to store and retrieve clinical 
laboratory results. There is a small but grow- 
ing body of evidence that existing clinical 
expert systems can improve patient care in a 
small number of practice environments at an 
acceptable cost in time and money (Kuperman 
and Gibson 2003). Nevertheless, such systems 
cannot yet uniformly improve care in typical, 
more general practice settings, at least not 
without careful attention to the full range of 
managerial as well as technical issues affecting 
the particular care delivery setting in which 
they are used (Kaplan and Harris-Salamone 
2009; Holroyd-Leduc et al. 2011; Shih et al. 
2011). 

Clinical expert systems (see Chap. 24)
attempt to provide decision support for diag- 
nosis, therapy, and/or prognosis in a more 
detailed and sophisticated manner than do 
simple reminder systems (Duda and Shortliffe 
1983). A necessary adjunct of expert sys-
tems, the creation and maintenance of their
related knowledge bases, still involves
leading-edge research and development. 
Humans for the most part remain superior to 
electronic systems in understanding patients 
and their problems, in efficiently interacting 
with patients to ascertain pertinent past his- 
tory and current symptoms across the spec- 
trum of clinical practice, in the interpretation 
and representation of data, and in clinical 
synthesis. In the future, however, humans
might not hold the upper hand in these tasks,
and claims of their superiority must continu- 
ally be tested empirically (Blois 1980). 

What has been called the “standard view” 
of computer-assisted clinical diagnosis (Miller 
1990a; cf. Friedman 2009) holds in part that 
human cognitive processes, being more suited 
to the complex task of diagnosis than machine 
intelligence, should not be overridden or 
trumped by computers. The standard view 
states that when adequate (and even exem-
plary) decision-support tools are developed,
they should be viewed and used as supple- 
mentary and subservient to human clinical 
judgment: They support decisions by human 
beings; they do not make decisions. Progress 
should be measured in terms of whether clini- 
cians using a CDS tool perform better on spe- 
cific tasks than the same clinicians without the 
tool (Miller 1990a; cf. Friedman 2009). These 
tools should assume subservient roles because 
the clinician caring for the patient knows and 
understands the patient’s situation and can 
make compassionate judgments better than 
computer programs. Furthermore, clinicians, 
and not machine algorithms, are the entities 
which the state licenses, and specialty boards 
accredit, to practice medicine, surgery, nurs- 
ing, pharmacy, and other health-related 
activities. 
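
The measure of progress named above, whether clinicians using a tool perform better than the same clinicians without it, is a paired comparison. The sketch below computes per-clinician differences and their mean from hypothetical task scores. The function names and the score format are assumptions made for illustration, and a real evaluation would also need an appropriate study design and statistical analysis.

```python
from typing import Dict, Tuple


def paired_differences(scores: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Per-clinician change in task score: with the tool minus without it.

    Each value in `scores` is (score_without_tool, score_with_tool).
    """
    return {name: with_tool - without_tool
            for name, (without_tool, with_tool) in scores.items()}


def mean_improvement(scores: Dict[str, Tuple[float, float]]) -> float:
    """Average paired difference across clinicians."""
    diffs = paired_differences(scores)
    if not diffs:
        raise ValueError("no clinicians to compare")
    return sum(diffs.values()) / len(diffs)
```

Pairing each clinician with themselves keeps the comparison focused on the tool's contribution to human performance, which is the quantity the standard view cares about, rather than on the tool's stand-alone accuracy.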

Corollaries of the standard view are that 
(1) practitioners have an obligation to use any 
computer-based tool responsibly, through 
adequate user training and by developing an 
understanding of the system’s abilities and 
limitations; and (2) practitioners must not 
abrogate their clinical judgment reflexively 
when using computer-based decision aids. 

The skills required for diagnosis are in 
many respects different from those required 
for the acquisition, storage, and retrieval of 
laboratory data. There is no contradiction in 
urging extensive use of efficient, non-burden- 
some laboratory information systems, and, 
for the time being, cautious deployment of 
expert diagnostic decision-support tools (i.e., 
not permitting their use in settings in which 
knowledgeable clinicians cannot immediately 
override faulty advice). Nevertheless, U.S. 
policy under the HITECH Act of 2009 (as dis-
cussed in Chap. 29) led to widespread
adoption of less-than-ideal electronic health
record systems. In many settings, those sys- 
tems engendered less efficient, burdensome, 
error-prone care delivery and physician burn- 
out (cf. Halamka and Tripathi 2017). 

Moreover, the standard view addresses a
key aspect of the question, “How and when 
should computers be used in clinical prac- 
tice?” by capturing important moral intuitions 
about error avoidance and evolving standards. 
Error avoidance and the benefits that follow 
from it shape the obligations of practitioners.
In computer-software use, as in all other areas 
of clinical practice, good intentions alone are 
insufficient to insulate recklessness from cul- 
pability. Thus, the standard view may be seen 
as a tool for both error avoidance and ethi- 
cally optimized action. 

Ethical software use, then, should be eval- 
uated against a broad background of evidence 
for actions that produce favorable outcomes. 
Because informatics is a science in ongoing 
ferment, system improvements and evidence 
of such improvements are constantly emerg- 
ing. Clinicians have an obligation to be famil- 
iar with this evidence after attaining minimal 
acceptable levels of familiarity with informat- 
ics in general and with the clinical systems 
they use in particular (Fig. 12.1).


12.2.2 Appropriate Users 
and Educational Standards 


Efficient and effective use of health care infor- 
matics systems requires prior system evalua- 
tions demonstrating utility, education and 
training of new users, monitoring of experi- 
ence, and appropriate, timely updating. 
Indeed, such requirements resemble those for 
other tools used in health care and in other 
domains. Inadequate preparation in the use 
of tools is an invitation to catastrophe. When 
the stakes are high and the domain large and 
complex—as is the case in the health profes- 
sions—education and training take on moral 
significance. 

Who should use a health care-related com- 
puter application? Consider expert decision- 
support systems as an example. An early 
paper on ethical issues in informatics noted 
that potential users of such systems include 
physicians, nurses, physicians’ assistants, 
paramedical personnel, students of the health 
sciences, patients, and insurance and govern- 
ment evaluators (Miller et al. 1985a). Are 
members of all these groups appropriate 
users? One cannot answer the question until 
one precisely specifies the intended use for the 
system (i.e., the particular clinical questions 
the system will address). The appropriate level 
of training must be correlated with the ques-
tion at hand. At one end of an appropriate-
use spectrum, we can posit that medical and
nursing students should employ decision-
support systems for educational purposes;
this assertion is relatively free of controversy
once it has been verified that such tools con-
vey accurately a sufficient quantity and qual-
ity of educational content. But it is less clear
that patients, administrators, or insurance
company gatekeepers, for example, should use
expert decision-support systems for assistance
in making diagnoses, in selecting therapies, or
in evaluating the appropriateness of health
professionals’ actions or determining their
reimbursement. To the extent that some sys-
tems present general medical advice in gener-
ally understandable but sufficiently nuanced
formats, such as once was the case with Dr.
Benjamin Spock’s 1950s era print-based child-
care primer, one might condone system use by
laypersons. There are additional legal con-
cerns related to negligence and product liabil-
ity, however, when health-related products are
sold directly to patients rather than to licensed
practitioners, and when such products give
patient-specific counsel rather than general
clinical advice (Miller et al. 1985a).

Fig. 12.1 The U.S. Department of Veterans Affairs in the 1970s developed the highly regarded “Veterans Health Information Systems and Technology Architecture” (VistA), once the largest electronic health record system in the United States. This fictitious screen shot demonstrates some of the system’s functions and utilities. (Credit: Courtesy of U.S. Department of Veterans Affairs, Veterans Health Administration Office of Informatics and Analytics)

Suitable use of a software program that 
helps a user to suggest diagnoses, to select 
therapies, or to render prognoses must be 
plotted against an array of goals and best 
practices for achieving those goals, including 
consideration of the characteristics and 
requirements of individual patients. For 
example, the multiple, interconnected inferen- 
tial strategies required for arriving at an accu- 
rate diagnosis depend on knowledge of facts; 
experience with procedures; and familiarity
with human behavior, motivation, and values.
Diagnosis is a process rather than an event
(Miller 1990a), so even well-validated diag-
nostic systems must be used appropriately in
the overall context of patient care.

To use a diagnostic decision-support sys-
tem (Chap. 24), a clinician must be able to
recognize when the computer program has 
erred, and, when it is accurate, what the out- 
put means and how it should be interpreted. 
This ability requires knowledge of both the 
diagnostic sciences and the software applica- 
tions, and the strengths and limitations of 
each. After assigning a diagnostic label, the 
clinician must communicate the diagnosis, 
prognosis, and implications to a patient, and 
must do so in ways both appropriate to the 
patient’s educational background and condu- 
cive to future treatment goals. It is not 
enough to be able to tell patients that they 
have cancer, human immunodeficiency virus 
(HIV), diabetes, or heart disease and then 
simply hand over a prescription. The care 
provider must also offer context when avail- 
able, comfort when needed, and hope as 
appropriate. For instance, the reason many 
organizations have required counseling both 
before and after HIV and genetic testing is 
not to vex busy health professionals but to 
ensure that comprehensive, high-quality care, 
rather than mere diagnostic labeling, is deliv- 
ered. 

This discussion points to the following set 
of ethical principles for appropriate use of 
decision-support systems: 

1. A computer program should be used in 
clinical practice only after appropriate 
evaluation of its efficacy and the docu- 
mentation that it performs its intended 
task at an acceptable cost in time and 
money. 

2. Users of most clinical systems should be 
health professionals who are qualified to 
address the question at hand on the basis 
of their licensure, clinical training, and 
experience. Software systems should be 
used to augment or supplement, rather 
than to replace or supplant, such individu- 
als’ decision making. 
3. All uses of informatics tools, especially
in patient care, should be preceded by adequate
training and instruction, which
should include review of applicable prod- 
uct evaluations. 


Such principles and claims should be viewed 
as analogous to other standards or rules in 
clinical medicine and nursing. 


12.2.3 Obligations and Standards 
for System Developers 
and Maintainers 


Users of clinical programs must rely on the 
work of other people who are often far 
removed from the context of use. As with all 
complex technologies, users depend on the 
developers and maintainers of a system and 
must trust evaluators who have validated a 
system for clinical use. Health care software 
applications are among the most complex 
tools in the technological armamentarium. 
Although this complexity imposes certain 
obligations on end users, it also commits a 
system’s developers, designers, and maintain- 
ers to adhere to reasonable standards and, 
indeed, to acknowledge their moral responsi- 
bility for doing so. 


12.2.3.1 Ethics, Standards, 
and Scientific Progress 

The very idea of a standard of care embodies 
a number of complex assumptions linking 
ethics, evidence, outcomes, and professional 
training. To say that nurses or physicians must 
adhere to a standard is to say, in part, that 
they ought not stray from procedures previ- 
ously shown or generally believed to work bet- 
ter than alternatives. The difficulty lies in how 
to determine if a procedure or device “works 
better” than another. Such determinations in 
the health sciences constitute progress, and 
provide evidence that we now know more. 
Criteria for weighing such evidence, albeit 
short of proof in most cases, are applied. For 
example, evidence from well-designed ran- 
domized controlled trials merits greater trust 
than evidence derived from uncontrolled retrospective
studies (see > Chap. 13). Typically,
verification by independent investigators must 
occur before placing the most recent study 
results into common practice. 

People who develop, maintain, and sell 
health care computing systems and their com- 
ponents have obligations that parallel those of 
system users. These obligations include hold- 
ing patient care as the foremost value. The 
duty to limit or prevent harm to patients 
applies to system developers as well as to 
practitioners. Although this principle is easy 
to suggest and, generally, to defend, it invites 
subtle, and sometimes overt, resistance from 
people for whom profit or fame are primary 
motivators. (This is of course also true for 
other medical devices, processes and indus- 
tries.) To be sure, quests for fame and fortune 
often produce good outcomes and improved 
care, at least eventually. Even so, some 
approaches fail to take into account the role 
of intention as a moral criterion (cf. Goodman 
et al. 2010). 

In medicine, nursing, and psychology, a 
number of models of the professional-patient 
relationship place trust and advocacy at the 
apex of a hierarchy of values. Such a stance 
cannot be maintained if goals and intentions 
other than patient well-being are (generally) 
assigned primacy. The same principles apply 
to those who produce and attend to health 
care information systems. Because these sys- 
tems are health care systems—and are not 
devices for accounting, entertainment, real 
estate, and so on—and because system under- 
performance can cause pain, disability, illness, 
and death, it is essential that the threads of 
trust run throughout the fabric of clinical sys- 
tem design and maintenance. 

System purchasers, users, and patients 
must rely upon developers and maintainers to 
recognize the potentially grave consequences 
of errors or carelessness, trust them to care 
about the uses to which the systems will be 
put, and rely upon them to value the reduced 
suffering of other people at least as much as 
they value their own personal gain. This reli- 
ance emphatically does not entail that system 
designers and maintainers are blameworthy or 
unethical if they hope and strive to profit from
their diligence, creativity, and effort. Rather, it 
implies that no amount of financial benefit for 
a designer or builder can counterbalance bad 
outcomes or ill consequences that result from 
recklessness, avarice, or inattention to the 
needs of clinicians and their patients. 
Purchasers and users should require demon- 
strations that systems are worthy of such trust 
and reliance before placing patients at risk, 
and that safeguards (human and mechanical) 
are in place to detect, alert, and rectify situa- 
tions in which systems underperform. 

Quality standards should stimulate scien- 
tific progress and innovation while safeguard- 
ing against system error and abuse. These goals 
might seem incompatible, but they are not. Let 
us postulate a standard that requires timely 
updating and testing of knowledge bases that 
are used by decision-support systems. To the 
extent that database accuracy is needed to 
maximize the accuracy of inferential engines, 
it is trivially clear how such a standard will 
help to prevent or reduce decision-support 
mistakes. Furthermore, the standard should be 
seen to foster progress and innovation in the 
same way that any insistence on best possible 
accuracy helps to protect scientists and clini- 
cians from pursuing false leads, or wasting 
time in testing poorly wrought hypotheses. It 
will not do for database maintainers to insist 
that they are busy doing the more productive 
or scientifically stimulating work of improving 
knowledge representation, say, or database 
design. Although such tasks are important, 
they do not supplant the tasks of updating and 
testing tools in their current configuration or 
structure. Put differently, scientific and techni- 
cal standards are perfectly able to stimulate 
progress while taking a cautious or even con- 
servative stance toward permissible risk in 
patient care. 

This approach has been described as pro- 
gressive caution: “Medical informatics is, hap- 
pily, here to stay, but users and society have 
extensive responsibilities to ensure that we use 
our tools appropriately. This might cause us 
to move more deliberately or slowly than 
some would like” (Goodman 1998). 

A more recent concern, with both ethical 
and legal implications, is the responsibility of 
software developers to design and implement 
software programs that cannot easily be
hacked by malicious code writers. This con- 
cern goes beyond privacy and confidentiality 
issues (discussed below), and includes the pos- 
sibility that medical devices with embedded 
software might be nefariously “repro- 
grammed” in a manner that might cause harm 
to patients (see, for example, Pugh et al. 2018 
and Sackner-Bernstein 2017). A more detailed 
discussion of this topic appears in
> Sect. 12.5 below.


12.2.3.2 System Evaluation 
as an Ethical Imperative 

Any move toward “best practices” in biomedi- 
cal informatics is shallow and feckless if it 
does not include a way to measure whether a 
system performs as intended. This and related 
measurements provide the ground for quality 
control and, as such, are the obligations of 
system developers, maintainers, users, admin- 
istrators, and perhaps other players (see 
> Chap. 13). 


» Medical computing is not merely about 
medicine or computing. It is about the 
introduction of new tools into environments 
with established social norms and practices. 
The effects of computing systems in health 
care are subject to analysis not only of accu- 
racy and performance but of acceptance by 
users, of consequences for social and profes- 
sional interaction, and of the context of use. 
We suggest that system evaluation can illu- 
minate social and ethical issues in medical 
computing, and in so doing improve patient 
care. That being the case, there is an ethical 
imperative for such evaluation (Anderson 
and Aydin 1998). 


To give a flavor of how a comprehensive 
evaluation program can ethically optimize 
implementation and use of an informatics 
system, consider these ten criteria for system 
scrutiny (Anderson and Aydin 1994): 

1. Does the system work as designed? 

2. Is it used as anticipated? 

3. Does it produce the desired results? 

4. Does it work better than the procedures it 
replaced? 

5. Is it cost effective? 
6. How well have individuals been trained to
use it? 

7. What are the anticipated long-term effects 
on how organizational units interact? 

8. What are the long-term effects on the 
delivery of medical care? 

9. Will the system have an impact on control 
in the organization? 

10. To what extent do effects depend on practice
setting?


Another way to make this important point is 
by emphasizing that people use computer sys- 
tems. Even the finest system might be misused, 
misunderstood, or mistakenly allowed to alter 
or erode previously productive human rela- 
tionships. Evaluation of health information 
systems in their contexts of use should be taken 
as a moral imperative. Such evaluations require
consideration of a broader conceptualization 
of “what works best” and must look toward 
improving the overall health care delivery sys- 
tem rather than only that system’s technologi- 
cally based components. These higher goals 
entail the creation of a corresponding mecha- 
nism for ensuring institutional oversight and 
responsibility (Miller and Gardner 1997a, b). 


12.3 Privacy, Confidentiality, 
and Data Sharing 


Some of the greatest challenges of the 
Information Age arise from placing computer 
applications in health care settings while 
upholding traditional principles and values. 
One challenge involves balancing two com- 
peting values: (1) free access to information, 
and (2) protection of patients’ privacy and 
confidentiality. 

Only computers can efficiently manage the 
now-vast amount of information generated 
during clinical encounters and other health 
care transactions (see > Chap. 2); at least in 
principle, such information should be easily 
available to health professionals and others 
involved in the administration of the care- 
delivery system, so that they can provide effec- 
tive, efficient care for patients. Yet, making 
this information readily available creates 
greater opportunities for inappropriate access.
Such access may be available to curious health 
care workers who do not need the information 
to fulfill job-related responsibilities, and, even 
more worrisome, to other people who might 
use the information to harm patients physi- 
cally, emotionally, or financially. Clinical sys- 
tem administrators must balance the goals of 
protecting confidentiality by restricting use of 
computer systems and improving care by 
assuring the integrity and availability of data. 
These objectives are not incompatible, but 
there are trade-offs that cannot be avoided. 


12.3.1 Foundations of Health
Privacy and Confidentiality


Privacy and confidentiality are necessary for 
people to evolve and mature as individuals, to 
form relationships, and to serve as function- 
ing members of society. Imagine what would 
happen if the local newspaper or gossip blog 
produced a daily report detailing everyone’s 
actions, meetings, and conversations. It is not 
that most people have terrible secrets to hide 
but rather that the concepts of solitude, inti- 
macy, and the desire to be left alone make no 
sense without the expectation that at least 
some of our actions and utterances will be 
kept private or held in confidence among a 
limited set of persons. 

The “average” sentiment about the appro- 
priate sphere of private vs. public may vary 
considerably from culture to culture, and even 
from generation to generation within any par- 
ticular culture; and it may differ widely among 
persons within a culture or generation, and 
evolve for any particular person over a life- 
time. Even the “born digital” generation, for 
which social media are a fixture of everyday 
life, has, and ought to have, its boundaries
(Palfrey and Gasser 2010). 

The terms privacy and confidentiality are 
not synonymous. As commonly used, “pri- 
vacy” generally applies to people, including 
their desire not to suffer eavesdropping, 
whereas “confidentiality” is best applied to 
information. One way to think of the differ- 
ence is as follows. If someone follows you and 
spies on you entering an AIDS clinic, your
privacy is violated; if someone sneaks into the 
clinic without observing you in person and 
looks at your health care record, your record’s 
confidentiality is breached. In discussions of 
the electronic health record, the term privacy 
may also refer to individuals’ desire to restrict 
the disclosure of personal data (National 
Research Council 1997). 

There are several important reasons to pro- 
tect privacy and confidentiality. One is that pri- 
vacy and confidentiality are widely regarded as 
rights of all people, and such protections help 
to accord them respect. On this account, peo- 
ple do not need to provide a justification for 
limiting access to their identifiable health data; 
privacy and confidentiality are entitlements 
that a person does not need to earn, to argue 
for, or to defend. Another reason is more prac- 
tical: protecting privacy and confidentiality 
benefits both individuals and society. Patients 
who know that their identifiable health care 
information will not be shared inappropriately 
are more comfortable disclosing that informa- 
tion to clinicians. This trust is vital for the suc- 
cessful physician-patient, nurse-patient, or 
psychologist-patient relationship, and it helps 
practitioners to do their jobs. This insight is as 
old as the Hippocratic corpus. 

Privacy and confidentiality protections 
also benefit public health. People who fear 
disclosure of personal information are less 
likely to seek out professional assistance, 
increasing the risks that contagion will be 
spread and maladies will go untreated. In 
addition, people still suffer discrimination, 
bias, and stigma when certain health data do 
fall into the wrong hands. Financial harm 
may occur if insurers are given unlimited 
access to family members’ records, or access 
to patient data, because some insurers might 
be tempted to increase the price of insurance 
for individuals at higher risk of illness or dis- 
criminate in other ways if such price differen- 
tiation were forbidden by law. This is, in the 
United States, among the reasons the Patient 
Protection and Affordable Care Act of 2010 
(U.S. Public Law 111-148), in prohibiting 
insurers from discrimination based on “pre- 
existing conditions,” was so important — and 
why subsequent efforts on ideological grounds 
to overturn the act are dangerously erosive. 
The ancient idea that physicians should
hold health care information in confidence is 
therefore applicable whether the data are writ- 
ten on paper or processed in silicon. The obli- 
gations to protect privacy and to keep 
confidences fall to system designers and main- 
tainers, to administrators, and, ultimately, to 
the physicians, nurses, and others who elicit 
the information in the first place. The upshot 
for all of them is this: protection of privacy 
and confidentiality is not an option, a favor, 
or a helping hand offered to patients with 
embarrassing health problems; it is a duty, 
regardless of the malady or the medium in 
which information about it is stored. 

Some sound clinical practice and public 
health traditions run counter to the idea of 
absolute confidentiality. When a patient is hos- 
pitalized, it is expected that all appropriate 
(and no inappropriate) employees or affiliates 
of the institution—primary physicians, con- 
sultants, nurses, therapists, and technicians— 
will have access to the patient’s medical 
records, when it is in the interest of the patient’s 
care to do so. In most communities of the 
United States, the contacts of patients who 
have active tuberculosis or certain sexually 
transmitted diseases are routinely identified 
and contacted by public health officials so that 
the contacts may receive proper medical atten- 
tion. Such disclosures serve the public interest 
and are and should be legal because they 
decrease the likelihood that more widespread 
harm to other individuals might occur through 
transmission of an infection unknowingly. 

A separate but important public health 
consideration (discussed in more detail below) 
involves the ability of health care researchers 
to anonymously pool data (i.e., pool by
removing individual persons’ identifying 
information) from patient cases that meet 
specified conditions to determine the natural 
history of the disease and the effects of vari- 
ous treatments. Examples of benefits from 
such pooled data analyses range from the 
ongoing results generated by regional collab- 
orative chemotherapy trials to the discovery, 
more than four decades ago, of the 
appropriateness of shorter lengths of stay for 
patients with myocardial infarction (McNeer 
et al. 1975). More recently, the need for robust 
syndromic surveillance has been asserted as
necessary for adequate bioterrorism prepared- 
ness, for earlier detection of naturally occur- 
ring disease outbreaks, and, most dramatically, 
in the Coronavirus pandemic of 2020 (Ienca 
and Vayena 2020; see also > Chap. 18). 


12.3.2 Electronic Clinical 
and Research Data 


Access to electronic patient records holds 
extraordinary promise for clinicians and for 
other people who need timely, accurate patient 
data (see > Chap. 14). Institutions that do not 
yet deploy electronic health record systems 
have fallen behind; this may become blame- 
worthy. Failure to use such systems may also 
disqualify institutions for reimbursements 
from public and private insurance, making it 
effectively an organizational death sentence. 
Conversely, systems that make it easy for clini- 
cians to access data also make it easier for 
people in general to access the data, and electronic
systems generally magnify the number of
persons whose information becomes available 
when a system security breach occurs. Some 
would consider failure to prevent inappropri- 
ate access as at least as blameworthy as failure 
to provide adequate and appropriate access. 
Nonetheless, there is no contradiction 
between the obligation to maintain a certain 
standard of care (in this case, regarding mini- 
mal levels of computer use) and ensuring that 
such a technical standard does not imperil the 
rights of patients. Threats to confidentiality 
and privacy are fairly well known. They 
include economic abuses, or discrimination by 
third-party payers, employers, and others who 
take advantage of the ever-burgeoning market 
in health data; insider abuse, or record snoop- 
ing by hospital or clinic workers who are not 
directly involved in a patient’s care but exam- 
ine a record out of curiosity, for instance; 
identity theft for insurance or other forms of 
financial fraud; and malevolent hackers, or 
people who, via networks or other means, 
copy, delete, or alter confidential informa- 
tion — or threaten to do so, a component of 
“ransomware” (see, e.g., Sittig and Singh 
2016; and Slayton 2018). Moreover, widespread
dissemination of information throughout
the health care system often occurs
without explicit patient consent. Health care 
providers, third-party payers, managers of 
pharmaceutical benefits programs, equipment 
suppliers, and oversight organizations collect 
large amounts of patient-identifiable health 
information for use in managing care, con- 
ducting quality and utilization reviews, pro- 
cessing claims, combating fraud, and 
analyzing markets for health products and 
services (National Research Council 1997). 

The proper approach to such challenges is 
one that will ensure both that appropriate cli- 
nicians and other people have rapid, easy 
access to patient records and that others do 
not have access. Is that a contradictory bur- 
den? No. Is it easy to achieve both? No. There 
are many ways to restrict inappropriate access 
to electronic records, but all come with a cost. 
Sometimes the cost is explicit, as when it 
comes in the form of additional security soft- 
ware and hardware; sometimes it is implicit, 
as when procedures are required that increase 
the time commitment by system users. 

A well-established standard way to view 
the landscape of protective measures is to 
divide it into technological methods and insti- 
tutional or policy approaches (Alpert 1998): 


12.3.2.1 Technological Methods 

Computer systems per se can optimize some 
aspects of security. Typical systems verify 
that users are who they claim to be (“authen- 
ticating”) with passwords, tokens or biomet- 
rics. Other controls limit access to people 
with a professional “need to know.” Creating 
audit trails, or logs, to record who viewed 
confidential records enables authorized facil- 
ity administrators, automated security audit- 
ing programs, and patients to later review 
who accessed what. Encryption can protect 
data in transit and at rest (in storage). These 
technical means are complemented by pro- 
tecting the elements of the electronic infra- 
structure with physical barriers when 
operations allow it. Auditing works best 
when appropriately severe punishments are 
widely known to be policy, and when policy
breaches are uniformly punished in a semi- 
public manner. 
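
As a rough illustration of how two of these controls fit together, the following Python sketch pairs a need-to-know check with an audit trail. It is a minimal sketch under stated assumptions, not a description of any particular product: the class RecordGateway, the care_team mapping, and the user and patient identifiers are all hypothetical, and a production system would add authentication, encryption, and tamper-resistant log storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditEntry:
    """One line of the audit trail: who asked for which record, and the outcome."""
    timestamp: datetime
    user_id: str
    patient_id: str
    granted: bool


@dataclass
class RecordGateway:
    """Hypothetical gatekeeper: checks need-to-know and logs every attempt."""
    # Assumed input: maps each user to the patients in that user's care.
    care_team: dict[str, set[str]]
    audit_log: list[AuditEntry] = field(default_factory=list)

    def request_access(self, user_id: str, patient_id: str) -> bool:
        granted = patient_id in self.care_team.get(user_id, set())
        # Denied attempts are logged too, so that snooping is visible on review.
        self.audit_log.append(
            AuditEntry(datetime.now(timezone.utc), user_id, patient_id, granted)
        )
        return granted

    def accesses_for(self, patient_id: str) -> list[AuditEntry]:
        """Return every logged attempt to view this patient's record, oldest first."""
        return [entry for entry in self.audit_log if entry.patient_id == patient_id]


gateway = RecordGateway(care_team={"nurse_a": {"p1"}, "clerk_b": set()})
gateway.request_access("nurse_a", "p1")  # on the care team: granted
gateway.request_access("clerk_b", "p1")  # no need to know: denied, but logged
```

Logging denied requests as well as granted ones is a design choice; it lets administrators, auditing programs, or patients later review attempted as well as completed access.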

Technological efforts to improve health 
system security have emerged as a kind of 
sub-specialty in health informatics, with sys- 
tem developers, computer scientists and oth- 
ers working to improve confidentiality 
protections. This often entails both better fire- 
walls against intrusion and software to pre- 
vent re-identification of stored data with the 
individuals to whom the data apply (see, for 
example, Malin and Goodman 2018). 


12.3.2.2 Policy Approaches 


In its landmark report, the National Research 
Council (1997) recommended that hospitals 
and other health care organizations create 
security and confidentiality committees and 
establish education and training programs. 
These recommendations parallel an approach 
that had worked well elsewhere in hospitals 
for matters ranging from infection control to 
bioethics. The U.S. Health Insurance 
Portability and Accountability Act (HIPAA) 
requires the appointment of privacy and secu- 
rity officials, special policies, and the training 
of health care workforce members who have 
access to health information systems. The 
European Union’s General Data Protection 
Regulation (GDPR) requires new account- 
ability and governance measures, standards 
for access by people to data and information 
about them, and rules for use of that data and 
information. 

Such measures are all the more important 
when health data are accessible through net- 
works. The rapid growth of integrated delivery 
networks (IDNs) (see > Chap. 16) and Health 
Information Exchanges, for example, illus- 
trate the need not to view health data as a well 
into which one drops a bucket but rather as an 
irrigation system that makes its contents 
available over a broad—sometimes an 
extremely broad—area. It is not yet clear 
whether privacy and confidentiality protec- 
tions that are appropriate in hospitals will be 
fully effective in a ubiquitously networked 
environment, but it is a start. System develop- 
ers, users, and administrators are obliged to 
identify appropriate measures in light of the 
particular risks associated with a given implementation.
There is no excuse for failing to
make this a top priority throughout the data 
storage and sharing environment. 


12.3.2.3 Electronic Data and Human 
Subjects Research 

The use of patient information for clinical 
research and for quality assessment raises 
interesting ethical challenges. The presump- 
tion of a right to confidentiality seems to 
include the idea that patient records are inex- 
tricably linked to patient names or to other 
identifying data. In an optimal environment, 
then, patients can monitor who is looking at 
their records. But if all unique identifiers have 
been stripped from the records, is there any 
sense in talking about confidentiality? 

The benefits to public health loom large in 
considering record-based research (> Chap. 
18). A valuable benefit of the electronic health 
record is the ability to access vast numbers of 
patient records to estimate the incidence and 
prevalence of various maladies, to track the 
efficacy of clinical interventions, and to plan 
efficient resource allocation (see > Chap. 18). 
Such research and planning would, however, 
impose onerous or intractable burdens if
informed, or valid, consent had to be obtained
from every patient whose record was repre- 
sented in the sample. Using confidentiality to 
impede or forbid such research fails to benefit 
patients at the same time it sacrifices benefi- 
cial scientific investigations. 

A more practical course is to establish 
safeguards that better balance the ethical obli- 
gations to privacy and confidentiality against 
the social goals of public health and systemic 
efficiency. This balancing can be pursued via a 
number of paths. The first is to establish 
mechanisms to anonymize the information in 
individual records or to decouple the data 
contained in the records from any unique 
patient identifier. This task is not always 
straightforward; it can be remarkably difficult 
to anonymize data such that, when coupled 
with other data sets, the individuals are not at 
risk of re-identification. A relatively rare 
disease diagnosis coupled with demographic 
data such as age and gender, or geographic 
data such as a postal code, may act as a surrogate
unique identifier; that is, detailed information
can in combination serve as a data
fingerprint that picks out an individual patient 
even though the patient’s name, Social 
Security number, or other (official) unique 
identifiers have been removed from the record. 
Challenges and opportunities related to de- 
identifying and re-identifying data are among 
the most interesting, difficult and important 
in all health computing (Atreya et al. 2013; 
Benitez and Malin 2010; Malin and Sweeney 
2004; Malin et al. 2011; Sweeney 1997; 
Tamersoy et al. 2012). 
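
The notion of a data fingerprint can be made concrete with a simple uniqueness check, in the spirit of what is often called k-anonymity. The Python sketch below counts how often each combination of quasi-identifiers occurs and flags records whose combination appears fewer than k times. The field names, the choice of quasi-identifiers, the threshold, and the three toy records are assumptions made purely for illustration; passing such a check does not by itself establish that a data set is safe to release.

```python
from collections import Counter

# Hypothetical quasi-identifiers; a real project would choose these per data set.
QUASI_IDENTIFIERS = ("diagnosis", "age", "gender", "postal_code")


def reidentification_risks(records, k=2):
    """Return the records whose quasi-identifier combination occurs fewer than k times."""
    def fingerprint(record):
        return tuple(record[name] for name in QUASI_IDENTIFIERS)

    counts = Counter(fingerprint(record) for record in records)
    return [record for record in records if counts[fingerprint(record)] < k]


# Invented toy data: names and official identifiers are already removed.
records = [
    {"diagnosis": "asthma", "age": 34, "gender": "F", "postal_code": "33136"},
    {"diagnosis": "asthma", "age": 34, "gender": "F", "postal_code": "33136"},
    {"diagnosis": "Fabry disease", "age": 52, "gender": "M", "postal_code": "33136"},
]

# The rare diagnosis is unique in combination with the demographic fields.
at_risk = reidentification_risks(records)
```

In this toy example only the record with the rare diagnosis is flagged, even though it carries no name or official identifier. Flagged records might then be generalized (for example, by reporting an age range) or suppressed before release.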

Such challenges point to a second means 
of balancing ethical goals in the context of 
database research: the use of institutional 
panels, such as medical record committees 
or institutional review boards. Submission of 
database research to appropriate institutional 
scrutiny is one way to make the best use of 
more or less anonymous electronic patient 
data. Competent panel members should be 
educated in the research potential of elec- 
tronic health records, as well as in ethical 
issues in epidemiology and public health. 
Scrutiny by such committees can also give 
appropriate weight to competing ethical con- 
cerns in the context of internal research for 
quality control, outcomes monitoring, and 
so on (Goodman 1998; Miller and Gardner 
1997a, b). 


12.3.2.4 Challenges in Bioinformatics

Safeguards are increasingly likely to be challenged
as genetic information makes its way
into the health care record (see > Chaps. 11 
and 28). The risks of bias, discrimination, and 
social stigma increase dramatically as genetic 
data become available to clinicians and inves- 
tigators. Indeed, genetic information “goes 
beyond the ordinary varieties of medical 
information in its predictive value” (Macklin 
1992). Genetic data also may be valuable to 
people predicting outcomes, allocating 
resources, and the like. In addition, genetic 
data are rarely associated with only a single 
person; they may provide information about 
relatives, including relatives who do not want 
to know about their genetic risk factors or 
potential maladies, as well as relatives who 
would love dearly to know more about their 
kin’s genome. There is still much work to be 
done in sorting out and addressing the ethical
issues related to electronic storage, sharing, 
and retrieval of genetic data (Goodman 1996, 
2016a). 

Bioinformatics or computational biology 
provides an exciting ensemble of new tools to 
increase our knowledge of genetics, genetic 
diseases, and public health. Use of these tools 
is accompanied by responsibilities to attend to 
the ethical issues raised by new methods, 
applications, and consequences (Goodman 
and Cava 2008). Identifying and analyzing 
these issues are among the key tasks of those 
who work at the intersection of ethics and 
health information technology. The future of 
genetics and genomics is utterly computa- 
tional, with data storage and analysis posing 
some of the greatest financial and scientific challenges.
For instance:
• How, to what extent, and by whom should
genomic databases be used for clinical or
public health decision support?

• Are special rules needed to govern the
study of information in digital genetic
repositories (or are current human subjects
research protection rules adequate)?

• Does data mining software present new
challenges when applied to human genetic
information?

• What policies are required to guide and
inform the communication of patient-specific
and incidental findings?

• Are special protections and precautions
needed to address and transmit findings
about population subgroups?


It might be that the tools and uses of compu- 
tational biology will eventually offer ethical 
challenges—and opportunities—as impor- 
tant, interesting and compelling as any tech- 
nology in the history of the health sciences. 
Significantly, this underscores the importance 
of arguments to the effect that attention to 
ethics must accompany attention to science. 
Victories of health science research and devel- 
opment will be undermined by any failures to 
address corresponding ethical challenges. We 
must strive to identify, analyze, and resolve or 
mitigate important ethical issues. 


12.4 Social Challenges and Ethical 
Obligations 


The expansion of evidence-based medicine 
and, in the United States, of managed care 
(now sometimes called accountable care since 
the passage of health reform legislation in 
2010; see Chap. 29) places a high premium
on the tools of health informatics. The need 
for data on clinical outcomes is driven by a 
number of important social and scientific fac- 
tors. Perhaps the most important among these 
factors is the increasing unwillingness of gov- 
ernments and insurers to pay for interventions 
and therapies that do not work or that do not 
work well enough to justify their cost. 

Health informatics helps clinicians, admin- 
istrators, third-party payers, governments, 
researchers, and other parties to collect, store, 
retrieve, analyze, and scrutinize vast amounts 
of data—though the task of documenting this 
is itself a matter of research on what has come 
to be called “meaningful use.” The functions 
of health informatics might be undertaken 
not for the sake of any individual patient but 
rather for cost analysis and review, quality 
assessment, scientific research, and so forth. 
These functions are critical, and if computers 
can improve their quality or accuracy, then so 
much the better. 

Challenges arise when intelligent applica- 
tions are mistaken for decision-making sur- 
rogates or when institutional or public policy 
recommends or favors computer output over 
human cognition. This may be seen as a 
question or issue arising under the rubric of 
“appropriate uses and users.” That is, by 
whom, when, and under what constraints 
may we elicit and invoke computational 
analysis in shaping or applying public policy? 
The question whether an individual physi- 
cian or multispecialty group, say, should be 
hired or retained or reimbursed or rewarded 
is information-intensive. The question that 
follows, however, is the key one: How should 
the decision-making skills of human and 
machine be used, and balanced (cf. Glaser 
2010)?


12.4.1 Vendor Interactions


Motivated if not inspired by both technologi- 
cal necessity and financial opportunity, hum- 
ble private practices and sprawling medical 
centers have—or should have—begun the 
transition from a paper patient record to an 
electronic one. The need to make such a tran- 
sition is not in dispute: paper (and handwrit- 
ing) are hard to store, find, read and analyze. 
Electronic Health Records (EHR) are not, or 
should not be. While there are important 
debates about the speed of the transition and 
regarding software quality, usability and abil- 
ity to protect patient safety, it is widely agreed 
that the recording and storage of health infor- 
mation must be electronic. 

Public policy has attempted to overcome 
some of the reluctance to make the change 
because of financial concerns. Notably, the 
U.S. Health Information Technology for 
Economic and Clinical Health (HITECH) 
Act, a part of the American Recovery and 
Reinvestment Act of 2009 (Blumenthal 2010), 
authorized some $27 billion in incentives for 
EHR adoption. These incentives helped 
address but did not eliminate financial con- 
cerns in that they offset only some of the cost 
of converting to an e-system. Still, while a 
number of companies had previously found 
opportunity in developing hospital and other 
clinical information systems, HITECH accelerated
the pace (see Chap. 29).

The firms that make and sell EHRs are not 
regulated in the same way as those that manu- 
facture pharmaceutical products or medical 
devices (see Sect. 12.5.3). In an increasingly
competitive environment, this has led to con- 
troversy about the nature of vendor interac- 
tions with the institutions that buy their 
products. An EHR system for a mid-sized 
hospital can cost upwards of $100 million 
over time, including consulting services, hard- 
ware and training. It follows that it is reason- 
able to ask what values should guide such 
vendor interactions with clients, and whether 
they should be similar to or different from values
that govern other free-market dealings.

While many or most contracts between
vendors and hospitals are confidential, it has 
been reported that some HIT vendors require 
contract language that indemnifies system 
developers for personal injury claims or mal- 
practice, even if the vendor is at fault; some 
vendors require system purchasers to agree 
not to disclose system errors except to the ven- 
dor (Koppel and Kreda 2009). Such provi- 
sions elicit concern to the extent they place or 
appear to place corporate interests ahead of 
patient safety and welfare. In this case, a 
working group chartered by AMIA, the society
for informatics professionals (see Chap. 1),
issued a report that provided guidance on
a number of vendor interaction issues 
(Goodman et al. 2010). Importantly, the 
working group comprised industry represen- 
tatives as well as scientists and other academ- 
ics. The group’s recommendations included 
these: 
• Contracts should not contain language
  that prevents system users, including
  clinicians and others, from using their best
  judgment about what actions are necessary
  to protect patient safety. This includes
  freedom to disclose system errors or flaws,
  whether introduced or caused by the vendor,
  the client, or a third party. Disclosures
  made in good faith should not constitute
  violations of HIT contracts. This
  recommendation neither entails nor requires
  the disclosure of trade secrets or of
  intellectual property.
• Because vendors and their customers share
  responsibility for patient safety, contract
  provisions should not attempt to circumvent
  fault and should recognize that both
  vendors and purchasers share responsibility
  for successful implementation. For
  example, vendors should not be absolved
  from harm resulting from system defects,
  poor design or usability, or hard-to-detect
  errors. Similarly, purchasers should not be
  absolved from harm resulting from
  inadequate training and education, inadequate
  resourcing, customization, or inappropriate
  use.

While some of the debates that led to those
conclusions were about political economy 
(regulation vs. free enterprise) as much as eth- 
ics (right vs. wrong), the opportunity for rap- 
prochement in the service of a patient-centered 
approach may be seen as an affirmation of the 
utility of an applied ethics process in the evo- 
lution of health information technology. 


12.4.2 Computational Prognosis 


Consider the utility of prognostic scoring sys- 
tems that use physiologic and mortality data 
to compare new critical-care patients with 
thousands of previous patients (Knaus et al. 
1991). Such systems allow hospitals to track 
the performance of their critical-care units by, 
say, comparing the previous year’s outcomes 
to this year’s or by comparing one hospital to 
another. If, for instance, patients with a par- 
ticular profile tend to survive longer than their 
predecessors, then it might be inferred that 
critical care has improved. Such scoring sys- 
tems can be useful for internal research and 
for quality management (Fig. 12.2).
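
The year-to-year comparison described above is often summarized as a ratio of observed to expected deaths. The following is a minimal Python sketch of that idea, assuming each patient record carries a predicted probability of hospital death from some prognostic model together with the observed outcome. The function name, the record layout, and the sample numbers are hypothetical; they are not taken from APACHE or from any vendor system.

```python
from typing import Iterable, Tuple


def observed_to_expected_ratio(patients: Iterable[Tuple[float, bool]]) -> float:
    """Return observed deaths divided by expected deaths.

    Each patient is (predicted_risk, died). predicted_risk is assumed to be
    a probability in [0, 1] produced by some prognostic scoring model.
    """
    observed = 0
    expected = 0.0
    for predicted_risk, died in patients:
        if not 0.0 <= predicted_risk <= 1.0:
            raise ValueError("predicted risk must be a probability")
        observed += int(died)       # count actual deaths
        expected += predicted_risk  # sum of predicted risks = expected deaths
    if expected == 0.0:
        raise ValueError("expected deaths is zero; ratio is undefined")
    return observed / expected


# Hypothetical cohorts with identical risk profiles in two successive years.
last_year = [(0.5, True), (0.5, True), (0.25, False), (0.75, True)]
this_year = [(0.5, False), (0.5, True), (0.25, False), (0.75, True)]

ratio_last = observed_to_expected_ratio(last_year)
ratio_this = observed_to_expected_ratio(this_year)
print(ratio_last, ratio_this)
```

A ratio that falls from one period to the next is consistent with improved care for comparable patients. It is a statement about a group, and by itself says nothing about what should be done for any individual patient.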


Now suppose that most previous patients 
with a particular physiologic profile have died 
in critical-care units; this information might 
be used to identify ways to improve care of 
such patients—or it might be used in support 
of arguments to contain costs by denying care 
to subsequent patients fitting the profile (since 
they are likely to die anyway). 

An argument in support of such an appli- 
cation might be that decisions to withdraw or 
withhold care are often and customarily made 
on the basis of subjective and fragmented evi- 
dence; so it is preferable to make such deci- 
sions on the basis of objective data of the sort 
that otherwise underlie sound clinical practice. 
Such outcomes data are precisely what fuels 
the engines of managed care, wherein health 
professionals and institutions compete on 
the basis of cost and outcomes. Why should 
society, or a managed-care organization, or 
an insurance company pay for critical care 
when seemingly objective evidence exists that 
such care will not be efficacious? Contrarily, 
consider the effect on future scientific insights 
of denying care to such patients. Scientific 
progress is often made by noticing that certain
patients do better under certain circumstances,
and investigation of such phenomena leads to
better treatments. If all patients meeting
certain criteria were denied therapy on the
basis of a predictive tool, it would become
a self-fulfilling prophecy for a much longer
time that all such patients would not do well
(Miller 1997).

Fig. 12.2 “Severity adjusted daily data” in fictitious
APACHE® Outcomes screen shot. Using prognostic
scoring systems, clinicians in critical-care units can
monitor events and interventions and administrators
can manage staffing based on patient acuity. Clinicians
can also use such systems to predict mortality, raising a
number of ethical issues. This image shows 10 CCU
patients. For the second one in the leftmost column, for
instance, the “acute physiology score” is 128; the risk of
hospital mortality is 96% and the risk of ICU mortality
is 92%. (Credit: Courtesy of Cerner Corporation, with
permission)

Now consider use of a decision-support 
system to evaluate, review, or challenge deci- 
sions by human clinicians; indeed, imagine an 
insurance company using a diagnostic expert 
system to determine whether a physician 
should be reimbursed for a particular proce- 
dure. If the expert system has a track record 
for accuracy and reliability, and if the system 
“disagrees” with the human’s diagnosis or 
treatment plan, then the insurance company 
can contend that reimbursement for the pro- 
cedure would be a mistake. Why pay for a pro- 
cedure that is not indicated, at least according 
to a computational analysis? 

In the two examples just offered (a prog- 
nostic scoring system is used to justify termi- 
nation of treatment to conserve resources, 
and a diagnostic expert system is used to deny 
a physician reimbursement for procedures 
deemed inappropriate), there seems to be jus- 
tification for adhering to the computer out- 
put. There are, however, three reasons why it is 
problematic to rely exclusively on clinical 
computer programs to guide policy or prac- 
tice in these ways: 

1. As we argued earlier with the standard 
view of computational diagnosis (and, by 
easy extension, prognosis), human cogni- 
tion is, at least for a while longer, still supe- 
rior to machine intelligence. Moreover, the 
act of rendering a diagnosis or prognosis is 
not merely a statistical or computational 
operation performed on uninterpreted 
data. Rather, identifying a malady and 
predicting its course requires understand- 
ing a complex ensemble of causal rela- 
tions, interactions among a large number 
of variables, and having a store of salient 
background knowledge—considerations 
that have thus far failed to be grasped, 
assessed, and effectively blended into decisions
made by computer programs.

2. Decisions about whether to treat a given
patient are often value laden and must be 
made relative to treatment goals. In other 
words, it might be that a treatment will 
improve the quality of life but not extend 
life, or vice versa (Youngner 1988). Whether 
such treatment is appropriate cannot be 
determined scientifically or statistically 
(Brody 1989). The decisions ultimately 
depend on human preferences—those of 
the provider or, even more importantly, the 
patient. 

3. Applying computational operations on 
aggregate data to individual patients runs 
the risk of including individuals in groups 
they resemble but to which they do not 
actually belong. Of course, human clini- 
cians run this risk all the time—the chal- 
lenge of inferring correctly that an 
individual is a member of a set, group, or 
class is one of the oldest problems in logic 
and in the philosophy of science. The point 
is that computers have not yet solved this
problem, and allowing policy to be
guided by simple or unanalyzed correla- 
tions constitutes a conceptual error. 


The idea is not that diagnostic or prognostic 
computers are always wrong—we know that 
they are not—but rather there are numerous 
instances in which we do not know whether 
they are right. It is one thing to allow aggre- 
gate data to guide policy; doing so is just using 
scientific evidence to maximize good out- 
comes. But it is altogether different to require 
that a policy disallow individual clinical judg- 
ment and expertise. 

Informatics can contribute in many ways 
to health care reform. Indeed, computer- 
based tools can help to illuminate ways to 
reduce costs, to optimize clinical outcomes, 
and to improve care. Scientific research, qual- 
ity assessment, and the like are, for the most 
part, no longer possible without computers. 
But it does not follow that the insights from 
such research apply in all instances to the 
myriad variety of actual clinical cases at which 
competent human clinicians excel. 

The Coronavirus crisis of 2020 provided 
an opportunity to assess and review the use of
prognostic scoring systems to inform or guide
triage and rationing. Controversy inherent in 
resource allocation under conditions of scar- 
city was magnified when decisions about ven- 
tilator allocation, for instance, were made 
based on a prognostic score rendered by utili- 
ties resident in electronic health records 
(Truog et al. 2020). Although a strong case 
can be made that such use was permissible 
during a crisis and in the absence of anything 
better, an equally strong case must be made 
that this opportunity to assess and review the 
use of prognostic scoring systems not be 
squandered. The in situ use of this informatics 
tool should be scrutinized and studied. 


12.4.3 Effects of Informatics 
on Traditional Relationships 


Patients are often frightened and vulnerable. 
Treating illness, easing fear, and respecting 
vulnerability are among the core obligations 
of physicians, nurses, and other clinicians. 
Health informatics has the potential to com- 
plement these traditional duties and the rela- 
tionships that they entail. We have pointed 
out that medical decisions are shaped by non- 
scientific considerations. This point is impor- 
tant when we assess the effects of informatics 
on human relationships. Thus: 


» The practice of medicine or nursing is not 
exclusively and clearly scientific, statistical, 
or procedural, and hence is not, so far, com- 
putationally tractable. This is not to make a 
hoary appeal to the “art and science” of 
medicine; it is to say that the science is in 
many contexts inadequate or inapplicable: 
Many clinical decisions are not exclusively 
medical—they have social, personal, ethi- 
cal, psychological, financial, familial, legal, 
and other components; even art might play 
a role. (Miller and Goodman 1998) 


12.4.3.1 Professional-Patient 
Relationships 

If computers, databases, and networks can
improve physician-patient or nurse-patient
relationships, perhaps by improving communication,
then we shall have achieved a happy
result. If reliance on computers impedes the
abilities of health professionals to establish 
trust and to communicate compassionately, 
however, or further contributes to the dehu- 
manization of patients (Shortliffe 1993, 1994), 
then we may have paid too dearly for our use 
of these machines. 

Suppose that a physician uses a decision- 
support system to test a diagnostic hypothesis 
or to generate differential diagnoses, and sup- 
pose further that a decision to order a particu- 
lar test or treatment is based on that system’s 
output. A physician who is not able to articu- 
late the proper role of computational support 
in his decision to treat or test will risk alienat- 
ing those patients who, for one reason or 
another, will be disappointed, angered, or 
confused by the use of computers in their 
care. To be sure, the physician might just with- 
hold this information from patients, but such 
deception carries its own threats to trust in the 
relationship. 

Patients are not completely ignorant about 
the processes that constitute human decision 
making. What they do understand, however, 
may be subverted when their doctors and 
nurses use machines to assist delicate cogni- 
tive functions. We must ask whether patients 
should be told the accuracy rate of decision 
support automata— when they have yet to be 
given comparable data for humans. Would 
such knowledge improve the informed- 
consent process, or would it “constitute 
another befuddling ratio that inspires doubt 
more than it informs rationality?” (Miller and 
Goodman 1998). 

To raise such questions is consistent with 
promoting the responsible use of computers 
in clinical practice. The question whether 
computer use will alienate patients is an 
empirical one; it is a question for which, 
despite many initial studies, we lack conclu- 
sive data to answer. (For example, we cannot 
yet state definitively whether all categories of 
patients will respond well to all specific types 
of e-mail messages from their doctors. 
Nevertheless, as a moral principle discussed 
above, one should not convey a new diagnosis 
of a malignancy via email.) To address the 
question now anticipates potential future 
problems. We must ensure that the exciting
potential of health informatics is not subverted
by our forgetting that the practice of
medicine, nursing, and allied professions is 
deeply human and fundamentally intimate 
and personal. 


12.4.3.2 Consumer Health 
Informatics

The growth of the World Wide Web and the
commensurate evolution of clinical and 
health resources on the Internet also raise 
issues for professional-patient relationships. 
Consumer health informatics—technologies 
focused on patients as the primary users— 
makes vast amounts of information available 
to patients (see Chap. 11). There is also,
however, misinformation—even outright 
falsehoods and quackery—posted on some 
sites. If physicians and nurses have not estab- 
lished relationships based on trust, the erosive 
potential of apparently authoritative Internet 
resources can be great. Physicians once accus- 
tomed to newspaper-inspired patient requests 
for drugs and treatments now face ever- 
increasing demands that are informed by Web 
browsing. Consequently, the following issues 
have gained in ethical importance for more 
than a decade: 
• Peer review: How and by whom is the
  quality of a Web site to be evaluated? Who
  is responsible for the accuracy of information
  communicated to patients?
• Online consultations: There is no standard
  of care yet for online medical consultations.
  What risks do physicians and nurses
  run by giving advice to patients whom they
  have not met or examined in person? This
  question is especially important in the context
  of telemedicine or remote-presence
  health care, the use of video teleconferencing,
  image transmission, and other technologies
  that allow clinicians to evaluate
  and treat patients in other than face-to-face
  situations (see Chap. 20). Use of
  telehealth tools became ubiquitous during
  the Coronavirus pandemic of 2020. This
  too presents an opportunity to evaluate
  widespread adoption in context.
• Support groups: Internet support groups
  can provide succor and advice to the sick,
  but there is a chance that someone who
  might benefit from seeing a physician will
  not do so because of anecdotes and information
  otherwise attained. How should
  this problem be addressed?


That a resource is touted as worthwhile does 
not make it so. We lack evidence to illuminate 
the utility of consumer health informatics and 
its effects on professional-patient relation- 
ships. Such resources cannot be ignored given 
their ubiquity, and they often are useful for 
improving health. But we insist that here—as 
with decision support, appropriate use and 
users, evaluation, and privacy and confidenti- 
ality—there is an ethical imperative to pro- 
ceed with caution. Informatics, like other 
health technologies, will thrive if our enthusi- 
asm is open to greater evidence and is wed to 
deep reflection on human values. 


12.4.3.3 Personal Health Records 


At the same time as institutions have moved 
to computer-based health records systems, the 
tools available to individuals to keep their 
own health records have been making a simi- 
lar transition. Electronic personal health 
record (PHR) systems, whether designed for 
use on a decoupled storage device or accessi- 
ble over the Web, are now available from a 
rapidly expanding set of organizations (see
Chap. 11) (Figs. 12.3 and 12.4). Indeed,
increasingly many patients access aspects of 
their health and medical records through 
“portals” established by EHR vendors. 

PHRs provide a storage base for data once 
kept on paper (or in the patient’s head) and 
repeatedly extracted with each institutional 
encounter for inclusion in that entity’s records 
system, typically: 
• Allergies, current medications
• Current health status and major health
  issues (if any)
• Major past health episodes and the condition
  of oneself and (sometimes) relatives
• Vaccinations, surgeries and other
  treatments


All these data can be kept on something simple
(and un-networked) like a flash drive. It is
becoming more common to store the data on
a Web site, where PHR data can also be linked 
to other health information relevant to the 
person. The PHR data can also be linked via 
a portal to a health care provider institution’s 
records, to allow updating in both directions, 
or be free of any such tie. A flash drive can be 
forgotten or lost, whereas a Web site can be 
centrally updated and uniformly available via 
any properly authenticated device on the 
Internet. 
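
As one illustration, the PHR data categories listed earlier could be modeled as a simple structured record that can be written to a file on a portable device or sent to a Web-based service. This is a sketch only: the class name, the field names, and the choice of JSON as a storage format are assumptions made for the example and do not follow any PHR product or interchange standard.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class PersonalHealthRecord:
    """Hypothetical container for the data categories named in the text."""
    allergies: List[str] = field(default_factory=list)
    current_medications: List[str] = field(default_factory=list)
    current_health_issues: List[str] = field(default_factory=list)
    past_health_episodes: List[str] = field(default_factory=list)
    family_history: List[str] = field(default_factory=list)
    vaccinations: List[str] = field(default_factory=list)
    surgeries_and_treatments: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Plain-text serialization, suitable for a file or a network request.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "PersonalHealthRecord":
        # Rebuild the record from the serialized form.
        return cls(**json.loads(text))


# Illustrative entries only.
record = PersonalHealthRecord(
    allergies=["penicillin"],
    vaccinations=["tetanus (2019)"],
)
restored = PersonalHealthRecord.from_json(record.to_json())
print(restored == record)
```

A real system would also need authentication, encryption, and a record of who changed what, none of which is shown here.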


Fig. 12.3 Project HealthDesign (Brennan et al.
2010) was a landmark program sponsored by the Robert 
Wood Johnson Foundation’s Pioneer Portfolio and 
intended to foster development of personal health 
records. Here is a barcode scanner that recognizes medi- 
cation labels. Designed by researchers at the University 
of Colorado at Denver, the “Colorado Care Tablet” 
allows elderly users to track prescriptions with such 
scanners and portable touch-screen tablets. (Credit: 
Courtesy of Project HealthDesign; Creative Commons 
Attribution 3.0 Unported License) 


Fig. 12.4 A portable blood
glucose communicator is part of the 
personal health record system 
developed by the T.R.U.E. Research 
Foundation of Washington, DC. The 
diabetes management application 
analyzes, summarizes, displays and 
makes individualized 
recommendations on nutritional 
data, physical activity data, 
prescribed medications, continuous 
blood glucose data, and self-reported 
emotional state. (Credit: Courtesy of 
Project HealthDesign; Creative 
Commons Attribution 3.0 Unported 
License) 


Traditional insurers and health care pro- 
viders are duty-bound by privacy laws and 
regulations to protect the information under 
their control. PHRs have a somewhat shakier 
set of protections given their relatively short 
history. The legal obligations of institutions 
that provide PHRs, but do not fully manage 
the content of those records or their use, as
well as the obligations (if any) of the individu- 
als who “manage” their own health records, 
remain to be resolved (Cushman et al. 2010). 

PHRs are now commonly linked to so- 
called “personal health applications” (PHAs) 
which provide ways of moving beyond simple 
static storage of one’s medical history. Most 
provide some sort of primitive decision sup- 
port, if only in linking to additional informa- 
tion about a particular disease or condition. 
Others include more ambitious decision- 
support functionality. All the concerns about 
the accuracy of Web-based information recur 
in this context, with concerns about the reli- 
ability of decision support added to that. 
Compounding concerns about accuracy are 
the inherent limitations of the “owner- 
operator”: If it can be difficult for trained 
health care providers to evaluate the quality 
of advice rendered by a decision support sys- 
tem, the challenges for patients will be com- 
mensurately greater. 

Traditional health care institutions may 
see the PHR as a device for patient empower- 
ment because it adds a way for persons to keep 
track of their own data; but PHRs can also be
used as a way of preserving “loyalty” to a
particular institution in the health care system. It
has been proposed that PHRs be subject to 
standards allowing “interoperability”—in this 
case, easy movement from one type of PHR to 
another—to prevent leveraging it as an imped- 
iment to patients’ movements when they wish 
to change providers or other preferences 
change. Whether such standards will evolve 
enough to make it easy to move from one 
PHR to another remains to be seen, given eco- 
nomic incentives to impede patient movement 
(just in case a patient is financially desirable 
because of insurance status or personal 
wealth). 

Whether PHRs will reach the majority of 
patients is uncertain. For persons who must 
chronically manage complex treatment regi- 
mens for themselves or for dependents, PHRs 
and their associated applications may be com- 
pelling. Persons who deal with less complex or 
transient conditions may prefer to leave 
records management to their providers. In the 
context of health care, PHRs have the poten- 
tial to replicate the “digital divide,” exacerbat- 
ing rather than reducing health disparities. 
Persons with higher levels of income and edu- 
cation may differentially benefit from PHRs 
by more readily making fuller use of them. In 
the absence of robust policy protections, some 
minors may be reluctant to use PHRs as long 
as parents or guardians retain access. 


12.5 Legal and Regulatory Matters

The use of clinical computing systems in
health care raises a number of interesting and
important legal and regulatory questions.


12.5.1 Difference Between Law
and Ethics


Ethical and legal issues often overlap. Ethical 
considerations apply in attempts to determine 
what is good or meritorious and which behav- 
iors are desirable or correct in accordance 
with higher principles. Legal principles are 
generally derived from ethical ones but deal
with the practical regulation of morality or
behaviors and activities. Many legal principles 
deal with the inadequacies and imperfections 
in human nature and the less-than-ideal 
behaviors of individuals or groups. Ethics 
offers conceptual tools to evaluate and guide 
moral decision making. Laws directly tell us 
how to behave (or not to behave) under vari- 
ous specific circumstances and prescribe rem- 
edies or punishments for individuals who do 
not comply with the law. Historical precedent, 
matters of definition, issues related to detect- 
ability and enforceability, and evolution of 
new circumstances affect legal practices more 
than they influence ethical requirements. 


12.5.2 Legal Issues in Biomedical 
Informatics 


Prominent legal issues related to the use of 
software applications in clinical practice and 
in biomedical research include liability under 
tort law; potential use of computer applica- 
tions as expert witnesses in the courtroom; 
legislation governing privacy and confidenti- 
ality; and copyrights, patents, and intellectual 
property issues. 


12.5.2.1 Liability Under Tort Law 

In the United States and in many other 
nations, principles of tort law govern situa- 
tions in which harm or injuries result from the 
manufacture and sale of goods and services 
(Miller et al. 1985a). Because there are few, if 
any, U.S. legal precedents directly involving 
harm or injury to patients resulting from use 
of clinical software applications (as opposed 
to a small number of well-documented 
instances where software associated with 
medical devices has caused harm), the follow- 
ing discussion is hypothetical. The principles 
involved are, however, well established with 
voluminous legal precedents outside the realm 
of clinical software. 

A key legal distinction is the difference 
between products and services. Products are 
physical objects, such as stethoscopes, that go 
through the processes of design, manufacture, 
distribution, sale, and subsequent use by
purchasers. Services are intangible activities
provided to consumers at a price by (presumably)
qualified individuals. 

The practice of clinical medicine has been 
deemed a service through well-established 
legal precedents. On the other hand, clinical 
software applications can be viewed as either 
goods (“products”) (software programs 
designed, tested, debugged, placed on DVDs 
or other media, and distributed physically to 
purchasers) or services (applications that 
present data or provide advice to practitioners 
engaged in a service such as delivering health 
care). There are few legal precedents to deter- 
mine unequivocally how software will be 
viewed by the courts, and it is possible that 
clinical software programs will be treated as 
goods under some circumstances and as ser- 
vices under others. It might be the case that 
software purchased and running in a private
office to handle patient records or billing
would be deemed a product, but the same 
software mounted on shared, centralized 
computers and accessed over the Internet 
(and billed on a monthly basis) would be 
offering a service. 

Three ideas from tort law potentially apply
to the clinical use of software systems:
(1) harm by intention, in which a person
uses a product or service to injure another;
(2) the negligence theory; and (3) strict
product liability (Miller et al. 1985a).
Providers of goods and services are
expected to uphold the standards of the com- 
munity in producing goods and delivering 
services. When individuals suffer harm due to 
substandard goods or services, they may sue 
the service providers or goods manufacturers 
to recover damages. Malpractice litigation in 
health care is based on negligence theory. 

Because the law views delivery of health 
care as a service (provided by clinicians), it is 
clear that negligence theory will provide the 
minimum legal standard for clinicians who 
use software during the delivery of care. 
Patients who are harmed by clinical practices 
based on imperfect software applications may 
sue the health care providers for negligence or 
malpractice, just as patients may sue attend- 
ing physicians who rely on the imperfect 
advice of a human consultant (Miller et al.
1985a). Similarly, a patient might sue a
practitioner who has not used a decision-support
system when it can be shown that use of the 
decision-support system is part of the current 
standard of care, and that use of the program 
might have prevented the clinical error that 
occurred (Miller 1989). It is not clear whether 
the patients in such circumstances could also 
successfully sue the software manufacturers, 
as it is the responsibility of the licensed prac- 
titioner, and not of the software vendor, to 
uphold the standard of care in the community 
through exercising sound clinical judgment. 
Based on a successful malpractice suit against 
a clinician who used a clinical software sys- 
tem, it might be possible for the practitioner 
to sue the manufacturer or vendor for negli- 
gence in manufacturing a defective clinical 
software product, but cases of this sort have 
not yet been filed. If there were such suits, it 
might be difficult for a court to discriminate 
between instances of improper use of a blame- 
less system and proper use of a less-than- 
perfect system. 

In contrast to negligence, strict product 
liability applies only to harm caused by defec- 
tive products and is not applicable to services. 
The primary purpose of strict product liabil- 
ity is to compensate the injured parties rather 
than to deter or punish negligent individuals 
(Miller et al. 1985a). For strict product liabil- 
ity to apply, three conditions must be met: 

1. The product must be purchased and used 
by an individual. 

2. The purchaser must suffer physical harm 
as a result of a design or manufacturing 
defect in the product. 

3. The product must be shown in court to be 
“unreasonably dangerous” in a manner 
that is the demonstrable cause of the pur- 
chaser’s injury. 


Note that negligence theory allows for adverse 
outcomes. Even when care is delivered in a 
competent, caring, and compassionate man- 
ner, some patients with some illnesses will not 
do well. Negligence theory protects providers 
from being held responsible for all individuals 
who suffer bad outcomes. As long as the qual- 
ity of care has met the prevailing standards, a 
practitioner should not be found liable in a
malpractice case (Miller et al. 1985a). Strict
product liability, on the other hand, is not as 
forgiving or understanding. 

No matter how good or exemplary a man- 
ufacturer’s designs and manufacturing pro- 
cesses might be, if even one in ten million 
products is defective, and that one product 
defect is the cause of a purchaser’s injury, then 
the purchaser may collect damages (Miller 
et al. 1985a). The plaintiff needs to show only 
that the product was unreasonably dangerous 
and that its defect led to harm. In that sense, 
the standard of care for strict product liability 
is 100-percent perfection. To some extent, 
appropriate product labeling (e.g., “Do not 
use this metal ladder near electrical wiring”) 
may protect manufacturers in certain strict 
product liability suits in that clear, visible 
labeling may educate the purchaser to avoid 
“unreasonably dangerous” circumstances. 
Appropriate labeling standards may similarly 
benefit users and manufacturers of clinical 
expert systems (Geissbuhler and Miller 1997). 

Health care software programs sold to cli- 
nicians who use them as decision-support 
tools in their practices are likely to be treated 
under negligence theory as services. When 
advice-giving clinical programs are sold 
directly to patients, however, and there is less 
opportunity for intervention by a licensed 
practitioner, it is more likely that the courts 
will treat them as products, using strict prod- 
uct liability, because the purchaser of the pro- 
gram is more likely to be the individual who is 
injured if the product is defective. (As per- 
sonal health records become more common, 
this legal theory may well be tested.) 

A growing number of software “bugs” in 
medical devices have been reported to cause 
injury to patients (Majchrowski 2010; Levis 
2014). The U.S. Food and Drug Administration 
(FDA) has traditionally viewed software 
embedded within medical devices, such as car- 
diac pacemakers and implantable insulin 
pumps, as part of the physical device, and so 
regulates such software as part of the device 
(FDA 2011). The courts are likely to view such 
software using principles of strict product lia- 
bility (Miller and Miller 2007). Most recently, 
the FDA has contemplated wider regulatory 
scope (see Sect. 12.5.3).

Corresponding to potential strict product
liability for faulty software embedded in med- 
ical devices is potential negligence liability if 
such software can easily be “hacked” 
(Robertson 2011). Malicious code writers 
might mimic external software-based “radio” 
controllers for pacemakers and insulin pumps 
and reprogram them to cause harm to patients. 
While such “hackers” should face criminal 
prosecution if they cause harm by intention, 
the device manufacturers have a responsibility 
to make it difficult to change the software 
code embedded in devices without proper 
authorization. 


12.5.2.2 Privacy and Confidentiality 


The ethical basis for privacy and confidentiality
in health care is discussed in Sect. 12.3.1.
For a long time, the legal state of affairs for 
privacy and confidentiality of electronic 
health records was chaotic (as it remains for 
written records, to some extent). This state of 
affairs in the U.S. had not significantly 
changed in the three decades since it was 
described in a classic New England Journal of 
Medicine article (Curran et al. 1969). 
However, a key U.S. law, the Health 
Insurance Portability and Accountability Act 
(HIPAA), has prompted significant change. 
HIPAA’s privacy standards became effective 
in 2003 for most health care entities, and its 
security standards followed 2 years later; a 
breach-notification rule was expanded in 2010 
and HITECH provisions were incorporated in 
2013. A major impetus for the law was that 
the process of “administrative simplification” 
via electronic recordkeeping, prized for its 
potential to increase efficiency and reduce 
costs, would also pose threats to patient pri- 
vacy and confidentiality. Coming against a 
backdrop of a variety of noteworthy cases in 
which patient data were improperly—and 
often embarrassingly—disclosed, the law was 
also seen as a badly needed tool to restore 
confidence in the ability of health profession- 
als to protect confidentiality. While the law 
has been accompanied by debate both on the 
adequacy of its measures and the question 
whether compliance was unnecessarily bur- 
densome, it nevertheless established the first 
nationwide health privacy protections. At its
core, HIPAA embodies the idea that individuals
should have access to their own health
data, and more control over uses and disclo- 
sures of that health data by others. Among its 
provisions, the law requires that patients be 
informed about their privacy rights, including 
a right of access; that uses and disclosures of 
“protected health information” generally be 
limited to exchanges of the “minimum neces- 
sary”; that uses and disclosures for other than 
treatment, payment and health care opera- 
tions be subject to patient authorization; and 
that all employees in “covered entities” (insti- 
tutions that HIPAA legally affects) be edu- 
cated about privacy and information security. 
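
The "minimum necessary" and authorization provisions lend themselves to a small illustration. The following Python sketch is a toy under stated assumptions, not a compliance tool: the purpose names, the field lists in `MINIMUM_NECESSARY`, and the `disclose` function are all hypothetical, and the actual rule has exceptions (for example, for disclosures to treating clinicians) that the sketch ignores.

```python
# Purposes that, per the text above, do not need separate patient authorization.
TPO_PURPOSES = {"treatment", "payment", "operations"}

# Hypothetical field lists: which parts of a record each purpose may see.
# A real covered entity would define these in its own policies.
MINIMUM_NECESSARY = {
    "treatment": {"name", "diagnoses", "medications", "lab_results"},
    "payment": {"name", "insurance_id", "billing_codes"},
    "operations": {"diagnoses", "billing_codes"},
    "research": {"diagnoses", "lab_results"},
}


class DisclosureDenied(Exception):
    """Raised when this sketch does not permit a requested disclosure."""


def disclose(record, purpose, patient_authorized=False):
    """Return only the fields of `record` permitted for `purpose`."""
    if purpose not in MINIMUM_NECESSARY:
        raise DisclosureDenied(f"unknown purpose: {purpose}")
    # Uses outside treatment, payment, and operations need authorization.
    if purpose not in TPO_PURPOSES and not patient_authorized:
        raise DisclosureDenied(f"{purpose} requires patient authorization")
    allowed = MINIMUM_NECESSARY[purpose]
    return {field: value for field, value in record.items() if field in allowed}
```

In this sketch a billing clerk calling `disclose(record, "payment")` would receive the name, insurance identifier, and billing codes but not laboratory results, while a research request without `patient_authorized=True` would be refused.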

As noted above, the HITECH Act provided
substantial encouragement for
Electronic Health Record (EHR) development,
particularly through billions of dollars
in federal subsidies for
“meaningful use” of EHRs. However,
HITECH also contained many changes to 
HIPAA privacy and security requirements, 
strengthening the regulations that affect the 
collection, use and disclosure of health infor- 
mation not only by covered entities, but also 
the “business associates” (contractors) of 
those covered entities, and other types of 
organizations engaged in health information 
exchange. 

The Office for Civil Rights in the
U.S. Department of Health and Human 
Services remains the entity primarily charged 
with HIPAA enforcement, but there is now a 
role for states’ attorneys general as well as 
other agencies such as the Federal Trade 
Commission. HITECH increases penalty 
levels under HIPAA and includes a mandate 
for investigations and periodic audits, shifting 
the enforcement balance away from voluntary 
compliance and remediation plans. 

HITECH’s changes to HIPAA, those from 
other federal laws such as the Genetic 
Information Nondiscrimination Act of 2008 
(GINA) and the Patient Safety and Quality 
Improvement Act of 2005, and the new atten- 
tion to information privacy and security in 
most states’ laws, constitute significant changes
to the legal-regulatory landscape for health 
information. 


12.5.2.3 Copyright, Patents, 
and Intellectual Property 

Intellectual property protection afforded to 
developers of software programs, biomedical 
knowledge bases, and World Wide Web pages 
remains an underdeveloped area of law. 
Although there are long traditions of copy- 
right and patent protections for non-electronic 
media, their applicability to computer-based 
resources is not clear. Copyright law protects 
intellectual property from being copied verba- 
tim, and patents protect specific methods of 
implementing or instantiating ideas. The 
number of lawsuits in which one company 
claimed that another copied the functionality 
of its copyrighted program (i.e., its “look and 
feel”) has grown, however, and it is clear that 
copyright law does not protect the “look and 
feel” of a program beyond certain limits. 
Consider, for example, the unsuccessful suit in 
the 1980s by Apple Computer, Inc., against 
Microsoft Corporation over the “look and feel” of
Microsoft Windows as compared with the 
Apple Macintosh interface (which itself 
resembled the earlier Xerox Alto interface). 

It is not straightforward to obtain copy- 
right protection for a list that is a compilation 
of existing names, data, facts, or objects (e.g., 
the telephone directory of a city), unless you 
can argue that the result of compiling the 
compendium creates a unique object (e.g., a 
new organizational scheme for the informa- 
tion) (Tysyer 1997). Even when the compila- 
tion is unique and copyrightable, the 
individual components, such as facts in a 
database, might not be copyrightable. That 
they are not copyrightable has implications 
for the ability of creators of biomedical data- 
bases to protect database content as intellec- 
tual property. How many individual, 
unprotected facts can someone copy from a 
copyright-protected database before legal 
protections prevent additional copying? 

A related concern is the intellectual- 
property rights of the developers of materials 
made available through the World Wide Web. 
Usually, information made accessible to the 
public that does not contain copyright anno- 
tations is considered to be in the public 
domain. It is tempting to build from the work
of other people in placing material on the
Web, but copyright protections must be 
respected. Similarly, if you develop poten- 
tially copyrightable material, the act of plac- 
ing it on the Web, in the public domain, would 
allow other people to treat your material as 
not protected by copyright. Resolution of this 
and related questions may await workable 
commercial models for electronic publication 
on the World Wide Web, whereby authors 
could be compensated fairly when other peo- 
ple use or access their materials. Electronic 
commerce might eventually provide copyright 
protection (and perhaps revenue) similar to 
age-old models that now apply to paper-based 
print media; for instance, to use printed books 
and journals, you must generally borrow them 
from a library, purchase them or access them 
under Creative Commons or similar open- 
access platforms. 


12.5.3 Regulation and Monitoring 
of Computer Applications 
in Health Care 


In the mid-1990s, the U.S. Food and Drug 
Administration (FDA) held public meetings 
to discuss new methods and approaches to 
regulating clinical software systems as medi- 
cal devices. In response, a consortium of pro- 
fessional organizations related to health care 
information (AMIA, the Center for Health 
Care Information Management, the 
Computer-Based Patient Record Institute, the 
American Health Information Management 
Association, the Medical Library Association, 
the Association of Academic Health Science 
Libraries, and the American Nurses 
Association) drafted a position paper pub- 
lished in both summary format and as a lon- 
ger discussion with detailed background and 
explanation (Miller and Gardner 1997a, b). 
The position paper was subsequently endorsed 
by the boards of directors of all the organiza- 
tions (except the Center for Health Care 
Information Management) and by the 
American College of Physicians Board of 
Regents.

The consortium recommended the following:

- Recognition of four categories of clinical
  system risks and four classes of monitoring
  and regulatory actions that can be
  applied based on the level of risk in a given
  setting.

- Local oversight of clinical software systems,
  whenever possible, through the creation
  of autonomous software oversight
  committees, in a manner partially analogous
  to the institutional review boards
  that are federally mandated to oversee protection
  of human subjects in biomedical
  research. Experience with prototypical
  software-oversight committees at pilot
  sites should be gained before any national
  dissemination.

- Adoption by health care-information system
  developers of a code of good business
  practices.

- Recognition that budgetary, logistic, and
  other constraints limit the type and number
  of systems that the FDA can regulate
  effectively.

- Concentration of FDA regulation on
  those systems posing highest clinical risk,
  with limited opportunities for competent
  human intervention, and FDA exemption
  of most other clinical software systems.


The recommendations for combined local and 
FDA monitoring are summarized in
Table 12.1. We do not yet know whether
improved outcomes would occur if vendors 
were to give qualified (i.e., informatics- 
capable) institutional purchasers greater local 
control over system functionality. 

Section 618 of the 2012 Food and Drug
Administration Safety and Innovation Act
(FDASIA), Public Law 112-144, mandated a
new generation of oversight guidelines for
clinical software. Pursuant to the legislation,
the FDA, the Office of the National
Coordinator for Health Information
Technology, and the Federal Communications
Commission held public hearings and
conducted workshops. The ensuing April 2014
FDASIA Health IT report (FDASIA 2014)
met Congress’ requirements to propose “strategy
and recommendations on an appropriate,
risk-based regulatory framework pertaining
to health-information technology, including
mobile medical applications, that promotes
innovation, protects patient safety, and avoids
regulatory duplication.” The Report implemented
many recommendations from the
National Academy of Medicine’s 2012 report,
“Health IT and Patient Safety: Building Safer
Systems for Better Care” (IOM 2012a).

Table 12.1 Consortium recommendations for monitoring and regulating clinical software systems

| Variable | Regulatory class A | Regulatory class B | Regulatory class C | Regulatory class D |
|---|---|---|---|---|
| Supervision by FDA | Exempt from regulation | Excluded from regulation | Simple registration and postmarket surveillance required | Premarket approval and postmarket surveillance required |
| Local software oversight committee | Optional | Mandatory | Mandatory | Mandatory |
| Role of software oversight committee | Monitor locally | Monitor locally instead of monitoring by FDA | Monitor locally and report problems to FDA as appropriate | Assure adequate local monitoring without replicating FDA activity |
| Software risk category 0: Informational or generic systems (a) | All software in category | - | - | - |
| Software risk category 1: Patient-specific systems that provide low-risk assistance with clinical problems (b) | - | All software in category | - | - |
| Software risk category 2: Patient-specific systems that provide intermediate-risk support on clinical problems (c) | - | Locally developed or locally modified systems | Commercially developed systems that are not modified locally | - |
| Software risk category 3: High-risk, patient-specific systems (d) | - | Locally developed, noncommercial systems | - | Commercial systems |

Reproduced with permission from Miller and Gardner (1997a). © American College of Physicians
(a) Includes systems that provide factual content or simple, generic advice (such as “give flu vaccine to eligible
patients in mid-autumn”) and generic programs, such as spreadsheets and databases
(b) Systems that give simple advice (such as suggesting alternative diagnoses or therapies without stating preferences)
and give ample opportunity for users to ignore or override suggestions
(c) Systems that have higher clinical risk (such as those that generate diagnoses or therapies ranked by score) but
allow users to ignore or override suggestions easily; net risk is therefore intermediate
(d) Systems that have great clinical risk and give users little or no opportunity to intervene (such as a closed-loop
system that automatically regulates ventilator settings)
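
The consortium's mapping in Table 12.1, from software risk category and system origin to regulatory class, can be sketched as a small lookup. The Python below is one reading of the table rather than an official algorithm: the `Origin` enumeration and the function names are hypothetical, and locally modified commercial systems in risk category 3 are assumed to fall under class D because the table lists "commercial systems" there without qualification.

```python
from enum import Enum


class Origin(Enum):
    """Hypothetical labels for where a system came from."""
    LOCAL_NONCOMMERCIAL = "locally developed, noncommercial"
    COMMERCIAL_MODIFIED = "commercial, locally modified"
    COMMERCIAL_UNMODIFIED = "commercial, not modified locally"


# FDA role for each regulatory class, as listed in Table 12.1.
FDA_SUPERVISION = {
    "A": "exempt from regulation",
    "B": "excluded from regulation",
    "C": "simple registration and postmarket surveillance",
    "D": "premarket approval and postmarket surveillance",
}


def regulatory_class(risk_category, origin):
    """Map a software risk category (0-3) and an Origin to class A-D."""
    if risk_category == 0:
        return "A"  # informational or generic systems
    if risk_category == 1:
        return "B"  # low-risk patient-specific assistance
    if risk_category == 2:
        # Unmodified commercial systems are registered; others stay local.
        return "C" if origin is Origin.COMMERCIAL_UNMODIFIED else "B"
    if risk_category == 3:
        # Assumption: any commercial high-risk system is class D.
        return "B" if origin is Origin.LOCAL_NONCOMMERCIAL else "D"
    raise ValueError(f"risk category must be 0-3, got {risk_category}")


def oversight_committee_required(reg_class):
    """A local software oversight committee is optional only for class A."""
    return reg_class != "A"
```

For example, a commercially developed, unmodified system that ranks diagnoses by score (risk category 2) maps to class C under this reading, so `FDA_SUPERVISION["C"]` applies and a local oversight committee is mandatory.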


The FDASIA Health IT Report specified
three categories of risk based on system functionality,
rather than on software product category
or implementation platform
(FDASIA 2014). The functionality categories 
are: 

1. Administrative health IT functions. Non- 
exhaustive examples given in the Report 
include billing and claims processing, 
inventory management, and scheduling. 
The FDASIA Report categorizes those
functions as posing little or no risk to the
patient and as exempt from additional oversight.

2. Health management IT functions. Non- 
exhaustive examples cited in the Report 
include encounter documentation, elec- 
tronic access to clinical results, non-device- 
related clinical decision support, 
medication management, provider order 
entry, knowledge management, and elec- 
tronic communication, including health 
information exchange. The FDASIA 
Report asserts that the actual safety risks 
posed by this category are for the most 
part outweighed by their potential bene- 
fits, and require limited national-level 
oversight. Whereas the FDA previously 
played a major regulatory and monitoring 
role for applications with this category of 
functionality, the FDASIA Report trans- 
fers responsibility for such oversight to a 
collaboration between ONC and commer- 
cial vendors. 

3. Medical device health IT functions. Non- 
exhaustive examples listed in the Report 
include computer-assisted detection soft- 
ware, notification of real-time alarms from 
bedside monitors, and robotic surgery sys- 
tems. The FDA will maintain oversight 
responsibility for device-related clinical 
software. 


A given application or product may involve 
more than one of the functionality categories. 
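
Because one product can span several functionality categories, a set-like representation is a natural fit. The following Python sketch is illustrative only: the `Functionality` flag type, the `OVERSIGHT` wording, and `oversight_for` are hypothetical names that paraphrase the three categories described above.

```python
from enum import Flag, auto


class Functionality(Flag):
    """The three FDASIA functionality categories; members can be combined."""
    ADMINISTRATIVE = auto()
    HEALTH_MANAGEMENT = auto()
    MEDICAL_DEVICE = auto()


# Oversight arrangement for each category, paraphrased from the Report.
OVERSIGHT = {
    Functionality.ADMINISTRATIVE: "no additional oversight",
    Functionality.HEALTH_MANAGEMENT: "ONC in collaboration with vendors",
    Functionality.MEDICAL_DEVICE: "FDA",
}


def oversight_for(functions):
    """List the oversight arrangement for every category a product touches."""
    return [body for category, body in OVERSIGHT.items() if category in functions]
```

A product that combines scheduling with bedside-alarm notification would be described as `Functionality.ADMINISTRATIVE | Functionality.MEDICAL_DEVICE`, and `oversight_for` would then report both the exemption for its administrative functions and FDA oversight for its device functions.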

In concordance with the National 
Academy of Medicine 2012 recommendations 
(IOM 2012a), the FDASIA Report also cre- 
ated a public-private Health IT Safety Center, 
to be coordinated by ONC. It will promote 
innovations regarding patient safety and iden- 
tify interventions to improve safety, including 
education about best practices. 


12.5.4 Software Certification 
and Accreditation 


If, as above, (1) there is an ethical obligation 
to evaluate health information systems in the 
contexts in which they are being used, and if,
as we just saw, (2) there are good reasons to
consider the adoption of software oversight 
committees or something similar, then it is 
worthwhile to consider the ethical utility of 
efforts to review and endorse medical soft- 
ware and systems. 

Established in 2004, the Certification 
Commission for Health Information 
Technology, in collaboration with the Office 
of the National Coordinator for Health Information
Technology, assesses electronic health
records according to an array of criteria, in 
part to determine their success in contributing 
to “meaningful use.” These criteria address 
matters ranging from electronic provider 
order entry and electronic problem lists to 
decision support and access control (cf. 
Classen et al. 2007; Wright et al. 2009). The 
criteria, tests and test methods are developed 
in concert with the National Institute of 
Standards and Technology. Practices and 
institutions that want to receive government 
incentive payments must adopt certified elec- 
tronic health record technologies. 

Conceived under the American Recovery 
and Reinvestment Act, these processes aim to 
improve outcomes, safety and privacy. 
Whether they can accomplish this—as 
opposed to celebrate technology for its own 
sake—is an excellent source of debate 
(Hartzband and Groopman 2008). What 
should be uncontroversial is that any system
of regulation, review or certification must be
based on, and as a matter of process emphasize,
certain values. These might include,
among others, patient-centeredness, ethically 
optimized data management practices, and 
what we have here commended as the “stan- 
dard view,” that is, human beings and not 
machines practice medicine, nursing and 
psychology. 

The move to certification has unfortu- 
nately engendered precious little in the way of 
ethical analysis, however. To make any system 
of regulation, review or certification ethically 
credible, government and industry leaders 
must eventually make explicit that attention 
to ethics is a core component of their efforts. 

An ethical approach to certification of 
clinical applications should entail “in vivo”
(on the front lines of clinical care) as well as
“in vitro” (laboratory-based) testing (IOM
2012a). As stated in Sect. 12.2.2, “A computer
program should be used in clinical practice
only after appropriate evaluation of its
efficacy and the documentation that it per- 
forms its intended task at an acceptable cost 
in time and money.” Federal and other certifi- 
cation programs currently address, in vitro, 
whether certain pre-specified technical capa- 
bilities exist within a software application. 
They do not, however, determine whether a 
given software package will be usable at an 
affordable cost in time and money in vivo, 
post-installation. 

Almost all vendors’ comprehensive, 
institutional-level clinical information sys- 
tems now pass the minimal federal certifica- 
tion standards. Those certification standards 
test algorithmic functionality while neglecting 
to assess real-world clinical impacts post- 
installation. The 2012 National Academy of 
Medicine report on Health IT and Patient 
Safety noted, “poor usability ... is one of the 
single greatest threats to patient safety. ... 
Evaluation of the impact of health IT on 
usability and on cognitive workload is impor- 
tant to determine unintended consequences 
and the potential for distraction, delays in 
care, and increased workload in general. 
Usability guidelines and principles focused on 
improving safety need to be put into practice” 
(IOM 2012a, p. 81). The certification stan- 
dards also do not fully evaluate the accuracy 
or completeness of systems’ underlying infor- 
mation/knowledge bases. Nor do they evalu- 
ate information/knowledge base accuracy and 
maintainability over time. 

Many institutional-level systems are so 
expensive that they financially cripple the 
medical centers and clinicians’ offices that 
adopt them. Many sites experience substan- 
tial decreases in post-installation revenue 
(usually transient, lasting several months, but 
sometimes persistent). System-induced dis- 
ruptions of workflows diminish the number 
of patients who can be seen, and impair (at 
least temporarily) charge capture for billable 
services. The FDASIA Report’s minimal
oversight of decision support fits better with past
implementations, when academic system
developers could directly and efficiently
address local problems of system functionality.
Present-day commercial systems are inflexible,
opaque, and maintained by vendors from
a distance. The high sticker price of such sys- 
tems guarantees that, once purchased, institu- 
tions cannot afford to de-install and replace 
problematic systems. Government mandates 
to install such expensive, disruptive “certified” 
software systems appear to some as unethical. 
The certification process should be expanded 
to evaluate the pragmatic, local, post- 
installation aspects of system function “at an 
acceptable cost in time and money.” 

The diminished patient-care workflows 
engendered by cumbersome clinical software 
systems potentially increase the time and 
money costs of delivering quality healthcare — 
costs borne by patients, third-party payors, 
and the government. Clinicians often pay a 
higher price — beyond lost revenues. In their 
2017 commentary, “The HITECH Era in 
Retrospect,” Halamka and Tripathi stated, 
“we lost the hearts and minds of clinicians. ... 
We expected interoperability without first 
building the enabling tools. In a sense, we gave 
clinicians suboptimal cars, didn’t build roads, 
and then blamed them for not driving” 
(Halamka and Tripathi 2017). Verghese, Shah, 
and Harrington, in a 2018 JAMA Viewpoint, 
added: “The nationwide implementation of 
electronic medical records (EMRs) resulted in 
many unanticipated consequences ... the 
redundancy of the notes, the burden of alerts, 
and the overflowing inbox has contributed to 
... physician reports of symptoms of burnout. 
... Most EMRs serve their front-line users
quite poorly” (Verghese et al. 2018).

Installation of vendors’ massive, complex, 
institutional clinical software products creates 
additional ethical dilemmas. Post-installation, 
clinicians lack a clear or deep understanding 
of how system functions affect their patients’ 
care and safety. Institutions no longer possess 
the level of control/autonomy to change their 
systems locally, as they did decades ago when 
many academic medical centers had home- 
grown systems that they could manage and 
“evolve” at will. Prior to the advent of such 
large, complex clinical systems, clinicians 
directly responsible for a patient’s care
personally knew and supervised all important,
relevant decision-making. By contrast, after 
system installation, clinicians must obtain sub- 
stantial system-related technical training and 
expertise to be able to access or alter the clini- 
cal knowledge underlying a vendor system’s 
patient-specific recommendations. 

Even more opaque are the hidden mecha- 
nisms for how specific knowledge is brought 
to bear during clinical decision support. For 
example, a system might have a “patient is 
pregnant” indicator. That indicator may trig- 
ger warnings when someone orders medica- 
tions contra-indicated in pregnancy, or when 
physicians order radiological studies deemed 
unsafe for the condition. The ethical issue is 
that most vendor systems obscure the basis 
for how the “patient is pregnant” flag is set 
locally. For example, Hospital A might use the 
nurse’s admission intake interview form to set 
the pregnancy flag. The underlying informa- 
tion gathered by the nurse in such a setting 
might be the patient’s response to the question 
“Are you pregnant now?” The latter is inade- 
quate for patient safety. Hospital B might use 
the result of a patient’s beta-HCG test to set 
the pregnancy flag. While a more reliable indi- 
cator of pregnancy than hearsay from the 
patient, the beta-HCG test may not be ordered 
by every clinician in all relevant circum- 
stances; hence, that mechanism may also be 
imperfect for decision support. A physician 
practicing at both Hospital A and Hospital B 
would be unlikely to know that the flags have 
different meanings at each site. Yet all the cli- 
nician can see is the status of the flag. 

This ethical problem extends far beyond 
setting pregnancy flags. System vendors may 
claim to provide a wide range of patient- 
safety related decision support tools, but 
responsible care providers cannot trust 
system-generated advice when the triggers for 
decision support rules are potentially unreli- 
able and inaccessible. Tort law requires 
clinician-providers to uphold the standard of 
care for their patients. Ignorance of the basis 
for important system-initiated clinical advice 
is inconsistent with upholding the standard of 
care. Certifying agencies should require that 
clinical system vendors make the basis for
each decision-support rule transparent. Such
transparency should require that, without 
special training, a clinical user could easily 
(and on request during system use) determine 
the underlying logic and data supporting each 
instance of patient-specific advice. Systems 
should also enable display of the evidential 
basis for other, more general advice — infor- 
mation that might become outdated over time. 
Vendors must expose, in non-technical terms, 
how decision-support triggers are locally 
determined at each site. 
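
One way to picture the transparency requirement is a data structure in which every trigger carries its own provenance and every piece of advice can explain itself on request. The Python below is a minimal sketch under stated assumptions: `FlagValue`, `Advice`, and `check_order` are hypothetical names, no vendor system is known to work this way, and the single drug used in the usage note is only an illustration.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class FlagValue:
    """A decision-support trigger together with how it was set locally."""
    name: str
    value: bool
    source: str  # e.g. "nurse intake interview" or "beta-HCG result"
    site: str


@dataclass(frozen=True)
class Advice:
    """Patient-specific advice plus the rule and triggers behind it."""
    message: str
    rule: str  # plain-language statement of the rule logic
    triggers: Tuple[FlagValue, ...]

    def explain(self) -> str:
        """Describe, in non-technical terms, why this advice appeared."""
        lines = [f"Advice: {self.message}", f"Rule: {self.rule}"]
        for t in self.triggers:
            lines.append(
                f"Trigger: {t.name} = {t.value} (set from {t.source} at {t.site})"
            )
        return "\n".join(lines)


def check_order(drug: str, contraindicated: Set[str],
                pregnancy: FlagValue) -> Optional[Advice]:
    """Warn when a contraindicated drug is ordered and the flag is set."""
    if pregnancy.value and drug in contraindicated:
        return Advice(
            message=f"{drug} is contraindicated in pregnancy",
            rule="warn if the pregnancy flag is set and the drug is on the local list",
            triggers=(pregnancy,),
        )
    return None
```

With a flag such as `FlagValue("patient is pregnant", True, "nurse intake interview", "Hospital A")`, an order for isotretinoin checked against a local list containing that drug returns advice whose `explain()` output names the intake interview as the source, so a clinician who also practices at a site using beta-HCG results could see that the two flags do not mean the same thing.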


12.6 Summary and Conclusions 


Ethical issues are important in biomedical
informatics, and especially so in the clinical
arena. An initial ensemble of guiding principles,
or ethical criteria, has emerged to orient
decision making:

1. Specially trained human beings (e.g.,
licensed practitioners) remain, so far, best 
able to provide health care for other human 
beings. Computer software developers 
should strive to warn caregivers whenever 
it appears that a mistake is imminent. 
However, because clinical practice involves 
as many exceptions as rules, software sys- 
tems should not be allowed to overrule a 
clinician’s decision once a warning has 
been issued. 

2. Practitioners who use informatics tools 
should be clinically qualified and ade- 
quately trained in using the software 
products. 

3. The tools themselves should be carefully 
evaluated and validated, in vitro and 
in vivo. 

4. Health informatics tools and applications 
should be evaluated not only in terms of 
performance, including efficacy, but also in 
terms of their influences on institutions, 
institutional cultures, and workplace social 
forces. 

5. Ethical obligations should extend to sys- 
tem developers, maintainers, and supervi- 
sors as well as to clinician users. 

6. Education programs and security measures
should be considered essential for
protecting confidentiality and privacy
while improving appropriate access to per- 
sonal patient information. 

7. Adequate oversight should be maintained 
to optimize ethical use of electronic patient 
information for scientific and institutional 
research. 


New sciences and technologies always raise 
interesting and important ethical issues. Much 
the same is true for legal issues, although in 
the absence of precedent or legislation any 
legal analysis will remain vague. Similarly, 
important challenges confront people who are 
trying to determine the appropriate role for 
government in regulating health care soft- 
ware. The lack of clear public policy for such 
software underscores the importance of ethi- 
cal insight and education as the exciting new 
tools of biomedical and health informatics 
become more common.

patients’ problems. Journal of Medicine and
Philosophy. 15, 581-591. This contribution 
lays out the standard view of health informat- 
ics. This view holds, in part, that because only 
humans have the diverse skills necessary to 
practice medicine or nursing, machine intelli- 
gence should never override human clinicians. 

Miller, R. A., Schaffner, K. F., & Meisel, A. (1985b).
Ethical and legal issues related to the use of
computer programs in clinical medicine.
Annals of Internal Medicine, 102, 529-536.
This article constitutes a major early effort to 
identify and address ethical issues in informat- 
ics. By emphasizing the questions of appropri- 
ate use, confidentiality, and validation, among 
others, it sets the stage for all subsequent 
work. 


Questions for Discussion

1. What is meant by the “standard view” 
of appropriate use of medical 
information systems? Identify three 
key criteria for determining whether a 
particular use or user is appropriate. 

2. Can quality standards for system devel- 
opers and maintainers simultaneously 
safeguard against error and abuse and 
stimulate scientific progress? Explain 
your answers. Why is there an ethical 
obligation to adhere to a standard of 
care? 

3. Identify (a) two major threats to patient 
data confidentiality, and (b) policies or 
strategies that you propose for protect- 
ing confidentiality against these threats. 

4. Many prognoses by human beings are 
subjective and are based on faulty mem- 
ory or incomplete knowledge of previ- 
ous cases. What are the two drawbacks 
to using objective prognostic scoring
systems to determine whether to allo- 
cate care to individual patients? 

5. People who are educated about their 
illnesses tend to understand and to 
follow instructions, to ask insightful 
questions, and so on. How can the 
World Wide Web improve patient 
education? How, on the other hand, 
might Web access hurt traditional 
physician-patient and nurse-patient 
relationships? 


Acknowledgments Reid Cushman, PhD,
contributed to this chapter in an earlier edition.
His comments are gratefully acknowledged.


References 


Alpert, S. A. (1998). Health care information: Access, 
confidentiality, and good practice. In K. W. 
Goodman (Ed.), Ethics, computing, and medicine: 
Informatics and the transformation of health care 
(pp. 75-101). Cambridge: Cambridge University 
Press. 

Anderson, J. G., & Aydin, C. E. (1994). Overview: 
Theoretical perspectives and methodologies for the 
evaluation of health care information systems. In 
J. G. Anderson, C. E. Aydin, & S. J. Jay (Eds.), 
Evaluating health care information systems: Methods 
and applications (pp. 346-354). Thousand Oaks: 
Sage. 

Anderson, J. G., & Aydin, C. E. (1998). Evaluating med- 
ical information systems: Social contexts and ethical 
challenges. In K. W. Goodman (Ed.), Ethics, com- 
puting, and medicine: Informatics and the transfor- 
mation of health care (pp. 57-74). Cambridge: 
Cambridge University Press. 

Atreya, R. V., Smith, J. C., McCoy, A. B., Malin, B., & 
Miller, R. A. (2013). Reducing patient re- 
identification risk for laboratory results within 
research datasets. Journal of the American Medical 
Informatics Association, 20, 95-101. 

Benitez, K., & Malin, B. (2010). Evaluating re- 
identification risks with respect to the HIPAA pri- 
vacy rule. Journal of the American Medical 
Informatics Association, 17, 169-177. 

Blois, M. S. (1980). Judgement and computers. New 
England Journal of Medicine, 303, 192-197. 

Blumenthal, D. (2010). Launching HITECH. New 
England Journal of Medicine, 362(5), 382-385. 

Brennan, P. F., Downs, S., & Casper, G. (2010). Project
HealthDesign: Rethinking the power and potential
of personal health records. Journal of Biomedical
Informatics, 43(5 Suppl), S3-S5.

Brody, B. A. (1989). The ethics of using ICU scoring 
systems in individual patient management. Problems 
in Critical Care, 3, 662-670. 

Classen, D. C., Avery, A. J., & Bates, D. W. (2007). 
Evaluation and certification of computerized pro- 
vider order entry systems. Journal of the American 
Medical Informatics Association: JAMIA, 14(1), 
48-55. 

Curran, W. J., Stearns, B., & Kaplan, H. (1969). Privacy, 
confidentiality, and other legal considerations in the 
establishment of a centralized health-data system. 
New England Journal of Medicine, 281, 241-248. 

Cushman, R., Froomkin, M. A., Cava, A., Abril, P., & 
Goodman, K. W. (2010). Ethical, legal and social 
issues for personal health records and applications. 
Journal of Biomedical Informatics, 43, S51-S55. 

de Dombal, F. T. (1987). Ethical considerations con- 
cerning computers in medicine in the 1980s. Journal 
of Medical Ethics, 13, 179-184. 

Duda, R. O., & Shortliffe, E. H. (1983). Expert systems 
research. Science, 220, 261-268. 

FDA (U.S. Food and Drug Administration). (2011).
Infusion pump software safety research at FDA.
Retrieval 18 November 2020:
http://www.fda.gov/MedicalDevices/ProductsandMedicalProcedures/GeneralHospitalDevicesandSupplies/InfusionPumps/ucm202511.htm

FDASIA. (2014). FDASIA health IT report: Proposed
strategy and recommendations for a risk-based framework.
Retrieval 18 November 2020:
https://www.fda.gov/downloads/AboutFDA/CentersOffices/OfficeofMedicalProductsandTobacco/CDRH/CDRHReports/UCM391521.pdf

Friedman, C. P. (2009). A “fundamental theorem” of 
biomedical informatics. Journal of the American 
Medical Informatics Association, 16(2), 169-170. 

Geissbuhler, A. J., & Miller, R. A. (1997). Desiderata for 
product labeling of medical expert systems. 
International Journal of Medical Informatics, 47(3), 
153-163. 

Glaser, J. (2010). HITECH lays the foundation for more 
ambitious outcomes-based reimbursement. The 
American Journal of Managed Care, 16(12), SP19- 
SP23. 

Goodman, K. W. (1996). Ethics, genomics and informa- 
tion retrieval. Computers in Biology and Medicine, 
26, 223-229. 

Goodman, K. W. (1998). Outcomes, futility, and health 
policy research. In K. W. Goodman (Ed.), Ethics, 
computing, and medicine: Informatics and the trans- 
formation of health care (pp. 116-138). Cambridge: 
Cambridge University Press. 

Goodman, K. W. (2016a). Ethics, medicine, and informa- 
tion technology: Intelligent machines and the trans- 
formation of health care. Cambridge: Cambridge 
University Press. 

Goodman, K. W. (2017). Health information technol- 
ogy as a universal donor to bioethics education. 




422 K. W. Goodman and R. A. Miller 


Cambridge Quarterly of Healthcare Ethics, 26(2), 
342-347. 

Goodman, K. W., & Cava, A. (2008). Bioethics, business 
ethics, and science: Bioinformatics and the future of 
healthcare. Cambridge Quarterly of Healthcare 
Ethics, 17(4), 361-372. 

Goodman, K. W., Berner, E. S., Dente, M. A., Kaplan, 
B., Koppel, R., Rucker, D., Sands, D. Z., & 
Winkelstein, P. (2010). Challenges in ethics, safety, 
best practices, and oversight regarding HIT ven- 
dors, their customers, and patients. Journal of the 
American Medical Informatics Association, 18(1),
77-81. 

Halamka, J. D., & Tripathi, M. (2017). The HITECH 
era in retrospect. New England Journal of Medicine, 
377(10), 907-909.

Hartzband, P., & Groopman, J. (2008). Off the record— 
Avoiding the pitfalls of going electronic. New 
England Journal of Medicine, 358(16), 1656-1658. 

Holroyd-Leduc, J. M., Lorenzetti, D., Straus, S. E., 
Sykes, L., & Quan, H. (2011). The impact of the 
electronic medical record on structure, process and 
outcomes within primary care: A systematic review 
of the evidence. Journal of the American Medical 
Informatics Association, 18, 732-737.

Ienca, M., & Vayena, E. (2020). On the responsible use 
of digital data to tackle the COVID-19 pandemic. 
Nature Medicine, 26, 463-464. 

IOM (Institute of Medicine). (2012a). Health IT and 
patient safety: Building safer systems for better care. 
Washington, DC: The National Academies Press. 

Kaplan, B., & Harris-Salamone, K. D. (2009). Health IT 
success and failure: Recommendations from litera- 
ture and an AMIA workshop. Journal of the 
American Medical Informatics Association, 16, 291-299.

Knaus, W. A., Wagner, D. P., & Lynn, J. (1991). Short- 
term mortality predictions for critically ill hospital- 
ized adults: Science and ethics. Science, 254, 
389-394. 

Koppel, R., & Kreda, D. (2009). Health care informa- 
tion technology vendors’ “hold harmless” clause: 
Implications for patients and clinicians. JAMA, 301, 
1276-1278. 

Kuperman, G. J., & Gibson, R. F. (2003). Computer 
physician order entry: Benefits, costs, and issues. 
Annals of Internal Medicine, 139(1), 31-39. 

Lane, W. A. (1936). What the mouth reveals. New 
Health, 11, 34-35. 

Levis, J. (Ed.). (2014). HIT or miss: Lessons learned
from health information technology implementations. 
Chicago: American Health Information 
Management Association. 

Macklin, R. (1992). Privacy and control of genetic 
information. In G. J. Annas & S. Elias (Eds.), Gene 
mapping: Using law and ethics as guides (pp. 157- 
172). New York: Oxford University Press. 


Majchrowski, B. (2010). Medical software’s increasing 
impact on healthcare and technology management. 
Biomedical Instrumentation and Technology, 44(1), 
70-74. 

Malin, B., & Goodman, K. W. (2018). Between access 
and privacy: Challenges in sharing health data. IMIA
Yearbook of Medical Informatics, 27, 55-59. 

Malin, B., & Sweeney, L. (2004). How (not) to protect 
genomic data privacy in a distributed network: 
Using trail re-identification to evaluate and design 
anonymity protection systems. Journal of Biomedical 
Informatics, 37, 179-192. 

Malin, B., Loukides, G., Benitez, K., & Clayton, E. W. 
(2011). Identifiability in biobanks: Models, mea- 
sures, and mitigation strategies. Human Genetics, 
130, 383-392. 

McNeer, J. F., Wallace, A. G., Wagner, G. S., Starmer, 
C. F., & Rosati, R. A. (1975). The course of acute
myocardial infarction: Feasibility of early discharge 
of the uncomplicated patient. Circulation, 51, 410- 
413. 

Miller, R. A. (1989). Legal issues related to medical deci- 
sion support systems. International Journal of 
Clinical Monitoring and Computing, 6, 75-80. 

Miller, R. A. (1990a). Why the standard view is stan- 
dard: People, not machines, understand patients’ 
problems. The Journal of Medicine and Philosophy, 
15, 581-591. 

Miller, R. A. (1997). Predictive models for primary care- 
givers: Risky business? Annals of Internal Medicine, 
127(7), 565-567. 

Miller, R. A., & Gardner, R. M. (1997a). Summary rec- 
ommendations for the responsible monitoring and 
regulation of clinical software systems. Annals of 
Internal Medicine, 127(9), 842-845. 

Miller, R. A., & Gardner, R. M. (1997b). Recommen- 
dations for responsible monitoring and regulation of 
clinical software systems. Journal of the American 
Medical Informatics Association, 4, 442-457. 

Miller, R. A., & Goodman, K. W. (1998). Ethical chal- 
lenges in the use of decision-support software in 
clinical practice. In K. W. Goodman (Ed.), Ethics, 
computing, and medicine: Informatics and the trans- 
formation of health care (pp. 102-115). Cambridge: 
Cambridge University Press. 

Miller, R. A., & Miller, S. M. (2007). Legal and regula- 
tory issues related to the use of clinical software in 
health care delivery. In R. A. Greenes (Ed.), Clinical 
decision support: The road ahead (pp. 423-444). 
Boston: Elsevier. 

Miller, R. A., Schaffner, K. F., & Meisel, A. (1985a).
Ethical and legal issues related to the use of com- 
puter programs in clinical medicine. Annals of 
Internal Medicine, 102, 529-536. 

National Research Council. (1997). For the record: 
Protecting electronic health information. 
Washington, D.C.: National Academy Press. 


Ethics in Biomedical and Health Informatics: Users, Standards, and Outcomes 


Palfrey, J., & Gasser, U. (2010). Born digital: 
Understanding the first generation of digital natives. 
New York: Basic Books. 

Pugh, J., Pycroft, L., Sandberg, A., Aziz, T., & Savulescu, 
J. (2018). Brainjacking in deep brain stimulation 
and autonomy. Ethics and Information Technology, 
20(3), 219-232. 

Robertson, J. (2011). Insulin pumps, monitors vulnerable
to hacking. Associated Press via various media,
e.g., The Washington Times, August 4. Retrieval
November 17, 2020:
http://www.washingtontimes.com/news/2011/aug/4/insulin-pumps-monitors-vulnerable-to-hacking/?page=all

Sackner-Bernstein, J. (2017). Design of hack-resistant 
diabetes devices and disclosure of their cyber safety. 
Journal of Diabetes Science and Technology, 11(2), 
198-202. 

Shih, S. C., McCullough, C. M., Wang, J. J., Singer, J., & 
Parsons, A. S. (2011). Health information systems in 
small practices. Improving the delivery of clinical 
preventive services. American Journal of Preventive 
Medicine, 6, 603-609. 

Shortliffe, E. H. (1993). Doctors, patients, and comput- 
ers: Will information technology dehumanize 
health-care delivery? Proceedings of the American 
Philosophical Society, 137(3), 390-398. 

Shortliffe, E. H. (1994). Dehumanization of patient 
care. Are computers the problem or the solution? 
Journal of the American Medical Informatics
Association, 1, 76-78. 

Sittig, D. F., Krall, M., Kaalaas-Sittig, J., & Ash, J. S. 
(2005). Emotional aspects of computer-based pro- 
vider order entry: A qualitative study. Journal of the 
American Medical Informatics Association, 12(5), 
561-567. 




Sittig, D. F., & Singh, H. (2016). A socio-technical
approach to preventing, mitigating, and recovering 
from ransomware attacks. Applied Clinical 
Informatics, 7(2), 624-632. 

Slayton, T. B. (2018). Ransomware: The virus attacking 
the healthcare industry. Journal of Legal Medicine, 
38(2), 287-311. 

Sweeney, L. (1997). Weaving technology and policy 
together to maintain confidentiality. The Journal of 
Law, Medicine & Ethics, 25, 98-110.

Szolovits, P., & Pauker, S. G. (1979). Computers and 
clinical decision making: Whether, how much, and 
for whom? Proceedings of the IEEE, 67, 1224-1226. 

Tamersoy, A., Loukides, G., Nergiz, M. E., Saygin, Y., & 
Malin, B. (2012). Anonymization of longitudinal 
electronic medical records. IEEE Transactions on 
Information Technology in Biomedicine, 16, 413-423.

Truog, R. D., Mitchell, C., & Daley, G. Q. (2020). The 
toughest triage — Allocating ventilators in a pan- 
demic. New England Journal of Medicine, 382, 1973- 
1975. 

Tysver, D. A. (1997). Database legal protection. Bitlaw.
Retrieval 18 November 2020:
http://www.bitlaw.com/copyright/database.html

Verghese, A., Shah, N. H., & Harrington, R. A. (2018).
What this computer needs is a physician. Journal of
the American Medical Association, 319(1), 19-20.

Wright, A., Sittig, D. F., Ash, J. S., Sharma, S., Pang, 
J. E., & Middleton, B. (2009). Clinical decision sup- 
port capabilities of commercially-available clinical 
information systems. Journal of the American 
Medical Informatics Association, 16(5), 637-644. 

Youngner, S. J. (1988). Who defines futility? JAMA, 260, 
2094-2095. 


Evaluation of Biomedical 
and Health Information 
Resources 


Charles P. Friedman and Jeremy C. Wyatt 


Contents 


13.1 Introduction

13.2 Why Are Formal Evaluation Studies Needed?
13.2.1 Computing Artifacts Have Special Characteristics
13.2.2 The Special Issue of Safety

13.3 Two Universals of Evaluation
13.3.1 The Full Range of What Can Be Formally Studied
13.3.2 The Structure of All Evaluation Studies, Beginning with a Negotiation Phase

13.4 Deciding What to Study and What Type of Study to Do: Questions and Study Types
13.4.1 The Importance of Identifying Questions
13.4.2 Selecting a Study Type
13.4.3 Factors Distinguishing the Nine Study Types

13.5 Conducting Investigations: Collecting and Drawing Conclusions from Data
13.5.1 Two Grand Approaches to Study Design, Data Collection, and Analysis
13.5.2 Conduct of Objectivist Studies
13.5.3 Conduct of Subjectivist Studies

13.6 Communicating Evaluation Results

13.7 Conclusion: Evaluation as an Ethical and Scientific Imperative

Appendices
Appendix A: Two Evaluation Scenarios
Appendix B: Exemplary Evaluation Studies

References


© Springer Nature Switzerland AG 2021
E. H. Shortliffe, J. J. Cimino (eds.), Biomedical Informatics, https://doi.org/10.1007/978-3-030-58721-5_13




Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• Why are empirical studies based on the
methods of evaluation and technology
assessment important to the successful
implementation of information
resources to improve health?

• What challenges make studies in informatics
difficult to carry out? How are
these challenges addressed in practice?

• Why can all evaluations be classified as
empirical studies?

• What features do all evaluations have in
common?

• What are the key factors to take into
account as part of a process of deciding
what are the most important questions
to use to frame a study?

• What are the major assumptions underlying
objectivist and subjectivist
approaches to evaluation? What are the
strengths and weaknesses of each
approach?

• How does one distinguish measurement
and demonstration aspects of objectivist
studies, and why are both aspects
necessary? In the demonstration aspect
of objectivist studies, how are control
strategies used to draw inferences?

• What steps are followed in objectivist
and subjectivist studies? What techniques
are employed by investigators to
ensure rigor and credibility of their
findings?

• Why is communication between investigators
and stakeholders central to the
success of any evaluation?


13.1 Introduction 

Most people understand the term evalu- 
ation to mean an assessment of an orga- 
nized, purposeful activity. Evaluations are 
usually conducted to answer questions or in 
anticipation of the need to make decisions 
(Wyatt and Spiegelhalter 1990; Ammenwerth 
2015). Evaluations may be informal or for- 
mal, depending on the characteristics of the 
decision to be made and, particularly, how
much is at stake. But all activities labeled as 
evaluation involve the empirical process of 
collecting information that is relevant to the 
decision at hand. For example, when choosing 
a holiday destination, members of a family 
may informally ask friends which Hawaiian 
island they prefer and browse various websites 
including those that provide ratings of specific 
destinations. After factoring in costs and con- 
venience, the family reaches a decision. More 
formally, when a health care organization 
faces the choice of a new electronic health 
record system, the leadership will develop a 
plan to collect comparable data about com- 
peting systems, analyze the data according to 
the plan, and ultimately, through a predeter- 
mined process, make a decision. 

The field of biomedical and health infor- 
matics focuses on the collection, processing, 
and communication of health-related information
and the implementation of information resources
(usually consisting of digital technology
designed to interact with people) to facilitate
these activities.¹ These information
resources can collect, store, and process
data related to the health of individual per- 
sons (institutional or personal electronic 
health records), manage and reason about 
biomedical knowledge (knowledge acquisi- 
tion tools, knowledge bases, decision-support 
systems, and intelligent tutoring systems), and 
support activities related to public health (dis- 
ease registries and vital statistics, disease out- 
break detection and tracking). Thus, there is a 
vast range of biomedical and health informa- 
tion resources that can be foci of evaluation. 

1 In this chapter, we will use the terms “information
resource” and “information system” generally as
synonyms. However, “information system” applies
more specifically to applications of digital technology,
whereas a “resource” is a broad term that could,
for example, include informal collegial consultations.

Information resources have many different
aspects that can be studied (Friedman
and Wyatt 2005; Chap. 3). Where safety is
an issue, as it often is (Fox 1993; Black et al.
2011; Russ et al. 2014), we might focus on
inherent characteristics of the resource, asking
such questions as, “Are the code and
architecture compliant with current software 
engineering standards and practices?” or “Is 
the data structure the optimal choice for this 
type of application?” Clinicians and patients, 
however, might ask more pragmatic ques- 
tions such as, “Is the knowledge in this system 
completely up-to-date?” or “Who can access 
this information besides me?” Executives and 
public officials might wish to understand the 
effects of these resources on individuals and 
populations, asking questions such as, “Has 
this resource improved the quality of care?” 
or “What effects will a patient portal have on 
working relationships between practitioners 
and patients?” Thus, evaluation methods in 
biomedical informatics must address a wide 
range of issues, from technical characteris- 
tics of specific systems to systems’ effects on 
people and organizations. The outcomes or 
effects attributable to the use of health infor- 
mation resources will almost always be a func- 
tion of how individuals choose to use them, 
and the social, cultural, organizational, and 
economic context in which these uses take 
place (Lundsgaarde 1987). 

For these reasons, there is no formula for 
designing and executing evaluations; every 
evaluation, to some significant degree, must 
be custom-designed. A major factor shaping 
the design of evaluations is the decisions the 
evaluation is expected to inform. In the end, 
choices about what evaluation questions to 
pursue and how to collect and analyze data 
to pursue them are exquisitely sensitive to
each study’s special circumstances and con- 
strained by the resources that are available for 
it. Evaluation is very much the art of the pos- 
sible. But neither is evaluation an exercise in 
alchemy, pure intuition, or black magic. There 
exist many methods for evaluation that have 
stood the test of time and proved useful in 
practice. There is a literature on which methods
work and under what circumstances, and
there are numerous published examples of 
successful evaluation studies. In this chapter, 
we will introduce many of these methods, and 
present frameworks that guide the application 
of methods to specific decision problems and 
study settings. 


13.2 Why Are Formal Evaluation 
Studies Needed? 


13.2.1 Computing Artifacts Have Special Characteristics


Why are empirical studies of information 
resources needed at all? Why is it not possible, 
for example, to model (and thus predict) the 
performance of information resources and 
their impact on users, and thus save a lot of 
time and effort? The answer lies, to a great 
extent, in the complexity of computational 
artifacts and their use. For some disciplines, 
specification of the structure of an artifact 
allows one to predict how it will function, and 
engineers can even design new objects with 
known performance characteristics directly 
from functional requirements. Examples of 
such artifacts are elevators and conventional 
road bridges. The principles governing the 
behavior of materials and structures made of 
specific materials are sufficiently well under- 
stood that a new elevator can be designed to 
a set of performance characteristics with the 
expectation that it will perform exactly as 
predicted. Laboratory testing of models of 
these devices is rarely needed. Field testing of 
the artifact, once built, is conducted to reveal 
relatively minor anomalies, which can be rap- 
idly remedied, or to tune or optimize perfor- 
mance. However, when the object concerned 
is a computer-based resource, not an elevator, 
the story is different (Littlejohns et al. 2003). 
Software designers and engineers have theo- 
ries linking the structure to the function of 
only the most trivial computer-based resources 
(Somerville 2002). Because of the complexity 
of computer-based systems themselves, their 
position as part of a complex socio-technical 
system including the users and the organiza- 
tion in which they work, and the lack of a com- 
prehensive theory connecting structure and 
function, there is no way to know exactly how 
an information resource will perform until it 
is built and tested (Murray et al. 2004); and 
similarly there is no way to know that any revi- 
sions will bring about the desired effect until 
the next version of the resource is tested. It is 
also impossible to predict how even a perfectly 
functioning information resource will impact
user decisions or actions. 

In sum, the only practical way to deter- 
mine if a reasonably complex body of com- 
puter code does what it is intended to do is to 
test it in the laboratory and in the field. This 
testing can take many shapes and forms. The 
informal design, test, and revise activity that 
characterizes the development of all computer 
software is one such form of testing and results 
in software that usually functions as expected 
by the developers. More formal and exhaus- 
tive approaches to software design, verifica- 
tion and testing using synthetic test cases (e.g., 
Scott et al. 2011) and other approaches help to 
guarantee that the software will do what it was 
designed to do. Even these approaches, how- 
ever, do not guarantee the success of the soft- 
ware when put into the hands of the intended 
end-users. This requires more formal studies of 
the types that will be described in this chapter, 
which can be undertaken before, during, and 
after the initial development of an information 
resource. Such evaluation studies can guide 
further development; indicate if the resource is 
likely to be safe for use in real health care, pub- 
lic health, research, or educational settings; 
or elucidate if it has the potential to improve 
the professional performance of the resource 
users and the health of individuals and popu- 
lations. Many stakeholders wish to know if the 
resource, as actually used in practice, has had 
the intended beneficial effects. 
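
As a minimal illustration of testing with synthetic cases, the sketch below runs a hypothetical dose-alert rule against hand-built cases whose expected outputs are known in advance. The rule, its threshold, and the case values are assumptions made purely for illustration; they are not taken from Scott et al. (2011) or from any real clinical system.

```python
from dataclasses import dataclass

# Hypothetical rule under test: flag a daily dose above a weight-based limit.
MAX_MG_PER_KG_PER_DAY = 75.0  # illustrative threshold, not clinical guidance


def dose_alert(weight_kg: float, daily_dose_mg: float) -> bool:
    """Return True when the daily dose exceeds the per-kilogram limit."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    return daily_dose_mg / weight_kg > MAX_MG_PER_KG_PER_DAY


@dataclass(frozen=True)
class SyntheticCase:
    label: str
    weight_kg: float
    daily_dose_mg: float
    expected_alert: bool


# Synthetic cases chosen to probe typical values and the decision boundary.
CASES = [
    SyntheticCase("typical adult, safe dose", 70.0, 3000.0, False),
    SyntheticCase("exactly at limit", 40.0, 3000.0, False),
    SyntheticCase("just above limit", 40.0, 3000.1, True),
    SyntheticCase("small child, adult dose", 12.0, 3000.0, True),
]


def run_cases(cases):
    """Return labels of cases whose observed output differs from the expected one."""
    return [c.label for c in cases
            if dose_alert(c.weight_kg, c.daily_dose_mg) != c.expected_alert]


failures = run_cases(CASES)
print(f"{len(CASES) - len(failures)} of {len(CASES)} synthetic cases passed")
```

Passing every synthetic case shows only that the code matches its specification on those cases. As the paragraph above notes, it says nothing about how the resource will fare in the hands of its intended end-users.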

Many other writings elaborate on the 
points offered here. Some of the earliest include 
Spiegelhalter (1983) and Gaschnig et al. (1983) 
who discussed these phases of evaluation by 
drawing analogies from the evaluation of new 
drugs or the conventional software life cycle, 
respectively. Wasson et al. (1985) discussed the 
evaluation of clinical prediction rules together 
with some useful methodological standards 
that apply equally to information resources. 
Many other authors since then have described, 
with differing emphases, the evaluation of 
health care information resources, often focus- 
ing on decision-support tools, which pose 
some of the most extreme challenges. One 
relevant book (Friedman and Wyatt 2005) 
discusses the challenges posed by evaluation 
in biomedical informatics and offers a wide 
range of methods described in considerable
detail to help investigators explore and resolve 
these challenges. Other books have explored 
more technical, health technology assessment 
or organizational approaches to evaluation 
methods (Szczepura and Kankaanpaa 1996; 
van Gennip and Talmon 1995; Anderson et al. 
1994; Brender 2005; Harasevich and Pickering 
2017). 


13.2.2 The Special Issue of Safety 


Before disseminating any biomedical informa- 
tion resource that stores and communicates 
health data or knowledge and is designed 
to influence real-world practice or personal 
health decisions, it is important to verify that 
the resource is safe when used as intended. 
In the case of new drugs, European and US 
regulators have imposed a statutory duty on 
developers to perform extensive in vitro test- 
ing, and in vivo testing in animals, before 
any human receives a dose of the drug. Since 
2000, the safety of biomedical information 
resources has come increasingly into the spot- 
light (Rigby et al. 2001; Koppel et al. 2005). 
Accordingly, testing of information resources 
is now being considered, with governmen- 
tal agencies imposing risk-based regulatory 
frameworks and clearer classifications of med- 
ical devices (Slight and Bates 2014; FDASIA 
2014; EU Regulatory Framework 2018). For 
biomedical information resources, safety tests 
analogous to those required for drugs would 
include assessment of the accuracy of the data 
stored and retrieved, measuring the accuracy 
of any risk estimate or advice from a decision 
support system, determining whether and 
how easily end-users can employ the resource 
for its intended purposes, and estimating how 
often the resource furnishes misleading or 
incorrect information (Eminovic et al. 2004). 
It may be necessary to repeat these assess- 
ments following any substantial modifications 
to the information resource, as the correction 
of safety-related problems may itself generate 
new problems or uncover previously unrecog- 
nized ones. 
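
Two of the assessments listed above, the accuracy of risk estimates and the frequency of incorrect advice, can be expressed numerically. The sketch below uses the Brier score (the mean squared difference between predicted probability and observed outcome) and a Wilson score interval for the error proportion. The choice of these two metrics, the function names, and all of the sample numbers are illustrative assumptions, not methods prescribed in this chapter.

```python
import math


def brier_score(predicted, observed):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


def wilson_interval(errors, n, z=1.96):
    """Approximate 95% Wilson score interval for an error proportion."""
    if n <= 0 or not 0 <= errors <= n:
        raise ValueError("require 0 <= errors <= n and n > 0")
    p = errors / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Made-up data: risk estimates for five patients and whether the event occurred.
risks = [0.9, 0.2, 0.7, 0.1, 0.4]
events = [1, 0, 1, 0, 1]
print("Brier score:", round(brier_score(risks, events), 3))

# Made-up audit: 6 of 200 pieces of advice judged misleading or incorrect.
low, high = wilson_interval(6, 200)
print(f"Observed error rate {6 / 200:.1%}, interval {low:.1%} to {high:.1%}")
```

Reporting an interval rather than a single percentage makes the dependence on sample size visible, which matters when the assessment has to be repeated after each substantial modification.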

Determining if an information resource is 
safe and effective goes fundamentally to the 
process of evaluation we address in this chapter.
Almost all of the methodological issues
we raise apply to safety assessments. Casual 
assessments that fail to address these issues 
will not resolve the safety question, and will 
not reveal safety defects that can be remedied. 
Many of these issues are issues of sampling 
that we introduce in Sect. 15.4.2. For example,
the advice or other “output” generated by
most information resources depends critically 
on the quality and quantity of data available 
to it and on the manner in which the resource 
is used by patients or practitioners. Patients or
practitioners who are untrained, in a hurry, or 
exhausted at 3 A.M., are more likely to fail to 
enter key data that might lead to the resource 
generating misleading advice, or to fail to heed 
an alarm that is not adequately emphasized 
by the user interface. Coded data automati- 
cally entered into resources may be inaccurate, 
incomplete, or not coded in the manner antici- 
pated by the resource. Thus, to generate valid 
results, functional tests must put the resources 
in actual users’ hands under the most realistic 
conditions possible, or in the hands of people 
with similar knowledge, skills and experience 
if samples of intended users are not available. 
For example, a Facebook advertisement for
an app to help women detect when conception
was likely from body temperature readings
was withdrawn in the UK because the accuracy
quoted in publicity materials related to
ideal use rather than use in everyday practice
(BBC news story, 29 August 2018).²

Other safety issues are, from a methodolog- 
ical perspective, issues of measurement that we 
address in Sect. 13.4.2. For example, should
“usability” of an information resource be
determined by documenting that the resource
development process followed best practices
to inculcate usability, by asking end-users if they
believed the resource was usable, or by documenting
and studying their “click streams” to
determine if end-users actually navigated the
resource as the designers intended? There is no
single clear answer to this question (see Jakob
Nielsen’s invaluable resource on user testing³),
but we will see that all measurement processes 
have features that make their results more or 
less dependable and useful. We will also see 
that the measurement processes built into 
evaluation studies can themselves be designed 
to make the results of the studies more helpful 
to all stakeholders, including those focused on 
safety. 
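
One of the options mentioned above, studying end-users’ “click streams,” can be sketched as a comparison between the sequence of screens a user visited and the path the designers intended. The screen names, the intended path, and the decision to test for an in-order subsequence are all hypothetical, chosen only to make the idea concrete.

```python
def follows_intended_path(clicks, intended):
    """True if the intended steps occur in order within the click stream.

    Extra clicks between steps are allowed; missing or reordered steps are not.
    """
    remaining = iter(clicks)
    # "step in remaining" advances the iterator until the step is found,
    # so each later step must appear after the earlier ones.
    return all(step in remaining for step in intended)


def detour_count(clicks, intended):
    """Number of clicks beyond the minimum needed for the intended path."""
    return max(0, len(clicks) - len(intended))


# Hypothetical intended navigation for ordering a medication.
INTENDED = ["patient_list", "chart", "orders", "sign"]

# Hypothetical recorded sessions (one list of screen names per user).
sessions = {
    "user_a": ["patient_list", "chart", "orders", "sign"],
    "user_b": ["patient_list", "chart", "results", "chart", "orders", "sign"],
    "user_c": ["patient_list", "orders", "chart", "sign"],
}

for user, clicks in sessions.items():
    ok = follows_intended_path(clicks, INTENDED)
    print(user, "followed path" if ok else "deviated",
          "| extra clicks:", detour_count(clicks, INTENDED))
```

A measure like this captures only what users did, not why. It would normally be read alongside the other two options in the paragraph, process documentation and end-user reports, rather than in place of them.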


13.3 Two Universals of Evaluation 


13.3.1 The Full Range of What Can Be Formally Studied


Deciding what to study is fundamentally a 
process of winnowing down from a universe 
of potential questions to a parsimonious set 
of questions that can be realistically addressed 
given the priorities, time, and resources avail- 
able. This winnowing process can begin with 
the full range of what can potentially be stud- 
ied. To both ensure that the most important 
questions do get “on the table” and to help 
eliminate the less important ones, it can be 
useful to start with a comprehensive list. 

While experienced evaluators do not typically
begin study planning from this broadest perspective,
it is always helpful to have a broad
range of options in mind.

There are five major aspects of an infor- 
mation resource that can be studied: 

1. Need for the resource: In advance of any 
development, investigators can study the 
status quo absent the resource, includ- 
ing the nature of problems the resource 
is intended to address and how frequently 
these problems arise. (When an informa- 
tion resource is already deployed, the “sta- 
tus quo” might be the currently deployed 
resource, and the resource under study is 
a proposed replacement for it or enhance- 
ment to it.) 

2. Design and development process:
Investigators study the skills of the development
team, and the development methodologies
employed by the team, to understand if the
resulting resource is likely to function as intended.

3. Resource static structure: Here the focus
of the evaluation includes specifications,
flow charts, program code, and other representations
of the resource that can be
inspected without actually running it.

4. Resource usability and dynamic functions:
The focus is on whether the resource has
the potential to be beneficial: the degree
to which intended end-users can navigate
the resource and how it performs when it
is used in pilots prior to full deployment.

5. Resource use, effect and impact: Finally,
after deployment, the focus switches from
the resource itself to the extent of its use and
its effects on professional, patient or public
users, and on health care organizations.

2 https://www.bbc.co.uk/news/technology-45328965 (Accessed 11.20.19).

3 https://www.nngroup.com/articles/ (Accessed 11.20.19).


In a theoretically “complete” evaluation, 
sequential studies of a particular resource might 
address all of these aspects, over the life cycle 
of the resource. In the real world, however, it is 
difficult, and rarely necessary, to be so compre- 
hensive. Over the course of its development and 
deployment, a resource may be studied many 
times, with the studies in their totality touching 
on many or most of these aspects. Some aspects 
of an information resource will be studied infor- 
mally using anecdotal data collected via casual 
methods. Other aspects will be studied more 
formally in ways that are purposefully designed 
to inform specific development decisions and 
that involve systematic collection and analysis 
of data. Distinguishing those aspects that will 
be studied formally from those left for informal 
exploration is a challenging task facing all evalu- 
ators. 
13.3.2 The Structure of All Evaluation Studies, Beginning with a Negotiation Phase


If the list offered in the previous section can 
be seen as the universe of what can be studied, 
Fig. 13.1 can be used as a framework for
planning all evaluation studies. The first stage 
in any study is negotiation between the “inves- 
tigators” (or “evaluators”) who will be carry- 
ing out the study and the “stakeholders” who 
have interests in or otherwise will be concerned 
about the study results. Before a study can pro- 
ceed, the key stakeholders who are supporting 
the study financially and providing other essen- 
tial resources for it—such as the institution 
where the information resource is deployed— 
must be satisfied with the general plan. The 
negotiation phase identifies the broad aim and 
objectives of the study, what kinds of reports 
and other deliverables will result and by when, 
where the study personnel will be based, the 
resources available to conduct the study, and 
any constraints on what can be studied. When a 
study of an information resource is being con- 
ducted internally—that is, when all of the key 
stakeholders represent one organization that 
also employs the investigators—it is still very 
useful to have an internal negotiation to lay out 
details of the study. 

The results of the negotiation phase are 
expressed in a document, generally known as 
a contract between the evaluators and the key 
stakeholders. The contract guides the planning
and execution of the study and, in a very sig- 
nificant way, protects all parties from misun- 
derstandings about intent and execution. Like 
any contract, an evaluation contract can be 
changed later with consent of all parties. 


Fig. 13.1 Generic structure of all evaluation studies. Stages, with the sections that cover them: Complete Negotiation (13.3.2; yields the Contract) → Identify Questions (13.4.1) → Select Study Types (13.4.2) → Design and Conduct Investigation (13.5) → Communicate Results (13.6; informs Stakeholder Decisions)

Following the negotiation process and its
reflection in a contract, the planning of the 
evaluation proceeds in a sequence of logi- 
cal steps, starting with the formulation of 
specific questions to be addressed, then the 
selection of the type(s) of study that will be 
used, the investigation that entails the collec- 
tion and analysis of data, and ultimately the 
communication back to the stakeholders of 
the findings, which typically inform a range 
of decisions. Although Fig. 13.1 portrays
a one-way progression through this sequence 
of stages, in the real world of evaluation there 
are often detours and backtracks. 


13.4 Deciding What to Study and What Type of Study to Do: Questions and Study Types


13.4.1 The Importance of Identifying Questions


Once the study’s objective, scope and other 
applicable “ground rules” have been estab- 
lished, the real work of study planning 
can begin. The next step, as suggested by 
Fig. 13.1, is to convert the perspectives of
the concerned parties, and what these indi- 
viduals or groups want to know, into a finite, 
specific set of questions. It is important to rec- 
ognize that, for any evaluation setting that is 
interesting enough to merit formal evaluation, 
the number of potential questions is infinite. 
This essential step of identifying a tractable 
number of questions has a number of benefits: 
= It helps to crystallize thinking of both
investigators and key members of the
audience who are the stakeholders in the
evaluation.
= It guides the investigators and stakeholders
through the critical process of assigning
priority to certain issues and thus
productively narrowing the focus of a
study.
= It converts broad statements of aim (e.g.,
“to evaluate a new order communications
system”) into specific questions that can
potentially be answered (e.g., “What is the
impact of the order communications system
on how clinical staff spend their time, the
rate and severity of adverse drug events and
the length of patient stay?”).
= It allows different stakeholders in the
evaluation process (patients, professional
groups, managers) to see the extent to
which their own concerns are being
addressed, and to ensure that these feed
into the evaluation process.
= Most important, perhaps, it is hard if not
impossible to develop investigative
methods without first identifying questions,
or at least focused issues, for
exploration. The choice of methods
follows from the evaluation questions, not
from the novel technology powering the
information resource or the type of
resource being studied. Unfortunately,
some investigators choose to apply the
same set of methods to any study,
irrespective of the questions to be
addressed, or even to limit the evaluation
questions addressed to those compatible
with the methods they prefer. We do not
endorse this limiting approach.


Consider the distinction made earlier between 
informal evaluations that people undertake 
continuously as they make choices as part of 
their everyday personal or professional lives, 
and more formal evaluations that are planned 
and then executed according to that plan. In 
short, formal evaluations are those that conform
to the architecture of Fig. 13.1. In
these formal evaluations, the questions that 
actually get addressed survive a narrowing pro- 
cess that begins with a broad set of candidate 
questions. When starting a formal evaluation, 
therefore, a major decision is whom to consult 
to establish the questions that will get “on the 
table”, how to log and analyze their views, and 
what weight to place on each of these views. 
There is always a wide range of potential players
in any evaluation (see Box 13.1) and
there is no formula that defines whom to con- 
sult or in what order. Through this process, the 
investigators apply their common sense and, 
with experience, learn to follow their instincts. 
The only universal mistake is to fail to consult 
one or more of the key stakeholders, especially 
those paying for the study or those ultimately 
making the key decisions to be informed by
the evaluation. It is often useful to establish 
a group to advise and guide the evaluators, a 
group with broad representation that will help 
ensure that the study remains true to the interests
and preferences of the stakeholders. 

Through discussions with various stake- 
holder groups, the hard decisions regarding 
the questions to be addressed in the study are 
made. A significant challenge for investigators 
is the risk of getting swamped by detail, result- 
ing from the multiplicity of questions that can 
be asked in any study. To manage through the 
process, it is important to reflect on the major 
issues identified after each round of discus- 
sions with stakeholders, and then identify 
the questions that map to these issues. Where 
possible, keep questions at the same level of 
granularity. 

It is critical that the specific questions serv- 
ing as the beacon guiding the study be deter- 
mined and endorsed by all key stakeholders, 
before any significant decisions about the 
detailed design of the study are made. We 
will see later that evaluation questions can, in 
many circumstances, change over the course 
of a study; but that fact does not obviate the 
need to specify a set of questions at the outset.
Appendix A describes two evaluation
scenarios, and suggests some evaluation ques- 
tions that may be appropriate for each. 


Box 13.1 Some of the Potential Players in an Evaluation Study

= Those commissioning the evaluation
study, who will typically have questions
or decisions that rely on the data collected

= Those paying for the evaluation study

= Those paying for the development
and/or deployment of the information
resource

= End-users of the resource, who are
often providers of data for the study

= Developers of the resource and their
managers

= Care providers and their managers

= Staff responsible for resource implementation
and user training

= Information technology staff and
leaders in the organization where the
resource is deployed

= Senior managers in the organization
where the resource is deployed

= The patients whose care the resource
may directly or indirectly influence

= Staff in ancillary services whose workload
may be affected by resource
deployment, for example laboratory or
imaging departments following deployment
of a diagnostic decision support
system

= Quality improvement and safety professionals
in the organization in which
the resource is implemented


13.4.2 Selecting a Study Type 


After developing the list of evaluation ques- 
tions, the next step is to understand which 
study type(s) the evaluation questions natu- 
rally invoke. The study types we will introduce 
in this chapter are specific to the evaluation 
of information resources, and are particularly 
informative to the design of evaluation studies
in biomedical informatics. These study
types are described below and also summarized
in Table 13.1. The second column
of Table 13.1 links the study types to the
aspect of the resource that is studied, as
previously introduced in Sect. 13.3.1. Each
study type is likely to appeal to certain interests
of particular stakeholders, as suggested
in the rightmost column of the table. A wide
range of data collection and analysis methods,
as discussed later in Sect. 13.5, can be
used to answer the questions embraced by all
nine study types. Choice of a study type typically
does not constrain the methods that can
be used to collect and analyze data. We
will also see later in this chapter, specifically in
Sect. 13.5.2.2, that all of these study types
are what can be called demonstration studies,
in contrast to so-called measurement studies.
Finally, a set of studies, exemplifying many of
these study types, is introduced and described
in Appendix B.
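
The link between the five aspects of a resource (Sect. 13.3.1) and the nine study types can be made concrete with a small lookup. The following Python sketch is only an illustration: the pairs are transcribed from Table 13.1, while the names STUDY_TYPES and study_types_by_aspect are hypothetical and not part of the chapter.

```python
from collections import defaultdict

# (study type, aspect studied) pairs, transcribed from Table 13.1.
STUDY_TYPES = [
    ("Needs assessment", "Need for the resource"),
    ("Design validation", "Design and development process"),
    ("Structure validation", "Resource static structure"),
    ("Usability test", "Resource dynamic usability and function"),
    ("Laboratory function study", "Resource dynamic usability and function"),
    ("Field function study", "Resource dynamic usability and function"),
    ("Lab user effect study", "Resource effect and impact"),
    ("Field user effect study", "Resource effect and impact"),
    ("Problem impact study", "Resource effect and impact"),
]


def study_types_by_aspect(study_types=STUDY_TYPES):
    """Group study type names under the aspect of the resource they address."""
    grouped = defaultdict(list)
    for name, aspect in study_types:
        grouped[aspect].append(name)
    return dict(grouped)
```

An investigator who knows which aspect a question concerns could call study_types_by_aspect() to list the candidate study types for that aspect; the final choice still depends on the question itself.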
Table 13.1 Classification of demonstration study types by broad study question and the stakeholders most concerned

Study type | Aspect studied | Broad study question | Audience/stakeholders primarily interested in results
1. Needs assessment | Need for the resource | What is the problem? | Resource developers, funders of the resource
2. Design validation | Design and development process | Is the development method in accord with accepted practices? | Funders of the resource; professional and governmental certification agencies, e.g., Food and Drug Administration, Office of the National Coordinator for HIT
3. Structure validation | Resource static structure | Is the resource appropriately designed to function as intended? | Professional indemnity insurers, resource developers; professional and governmental certification agencies
4. Usability test | Resource dynamic usability and function | Can intended users navigate the resource so it carries out intended functions? | Resource developers, users, funders
5. Laboratory function study | Resource dynamic usability and function | Does the resource have the potential to be beneficial? | Resource developers, funders, users, academic community
6. Field function study | Resource dynamic usability and function | Does the resource have the potential to be beneficial in the real world? | Resource developers, funders, users
7. Lab user effect study | Resource effect and impact | Is the resource likely to change user behavior? | Resource developers and funders, users
8. Field user effect study | Resource effect and impact | Does the resource change actual user behavior in ways that are positive? | Resource users and stakeholders, resource purchasers and funders
9. Problem impact study | Resource effect and impact | Does the resource have a positive impact on the original problem? | The universe of stakeholders



1. Needs assessment studies seek to clarify
the information problem the resource is 
intended to solve. These studies take place 
before the resource is designed—usually 
in the setting where the resource is to be 
deployed, although simulated settings may 
sometimes be used. Ideally, the potential 
users of the resource will be studied while 
they work with real problems or cases, 
to understand better how information is 
used and managed, and to identify the 
causes and consequences of inadequate 
information flows. The investigator seeks 
to understand users’ skills, knowledge 


and attitudes, as well as how they make 
decisions or take actions. To ensure that 
developers have a clear model of how a 
proposed information resource will fit with 
working practices and structures, they may 
also need to study health care or research 
processes, team functioning, or relevant 
aspects of the larger organization in which 
work is done (Wyatt et al. 2010). 

2. Design validation studies focus on the
quality of the processes of information 
resource design and development, for 
example by asking experts to review these 
processes. The experts may review documents,
interview the development team,
compare the suitability of the software 
engineering methodology and program- 
ming tools used with others that are avail- 
able, and generally apply their expertise 
to identify potential shortcomings in the 
approach used to develop the software, as 
well as constructively to suggest how these 
shortcomings might be corrected. 


3. Structure validation studies address the
static form of the software, usually after
a first prototype has been developed. 
This type of study is most usefully per- 
formed by an expert or a team of experts 
with experience in developing software 
for the problem domain and concerned 
users. For these purposes, the investiga- 
tors need access to both summary and 
detailed documentation about the sys- 
tem architecture, the structure and func- 
tion of each module, and the interfaces 
among them. The expert might focus on 
the appropriateness of the algorithms 
that have been employed and check that 
they have been correctly implemented 
by examining the code and its docu- 
mentation. Experts might also exam- 
ine the data structures (e.g., whether 
they are appropriately normalized) and 
knowledge bases (e.g., whether they are 
evidence-based, up to date, and modelled 
in a format that will support the intended 
analyses or reasoning). Most of this will 
be done by inspection and discussion 
with the development team. Sometimes 
specialized software may be used to test 
the structure of the resource (Somerville 
10th edition 2015). 

Note that the study types listed up to 
this point do not require a functioning infor- 
mation resource. However, beginning with 
usability testing below, the study types 
require the existence of at least a function- 
ing prototype. 

4. Usability testing studies address whether
intended users can actually operate or
navigate the software, to determine
whether the resource has the potential to
be helpful to them (see also Chap. 5). In
this type of study, testing of a prototype
by typical users informs further development
and should improve its usability.
Although usability testing can be performed
by obtaining opinions of usability
experts who “test drive” the resource,
usability can also be tested by deploying
the resource in a laboratory or classroom
setting, introducing users to it, and then
allowing them either to navigate at will
and provide unstructured comments or to
attempt to complete some scripted tasks
(see extensive material by Nielsen,
www.useit.com). Data can be collected by the
computer itself, from the user, by a live 
observer, via audio or video capture of 
users’ actions and statements, or by spe- 
cialized instrumentation such as eye-track- 
ing tools. Many software developers have 
usability testing labs equipped with sophis- 
ticated measurement systems, staffed by 
experts in human computer interaction to 
carry out these studies—an indication of 
the importance increasingly attached to 
this type of study (Zhang et al. 2003; 
Saitwal et al. 2010). 


5. Laboratory function studies go beyond
usability to explore more specific aspects
of the information resource, such as the 
quality of data captured, the speed of 
communication, the validity of the 
calculations carried out, or the 
appropriateness of the results or advice 
given. These functions relate less to the 
basic usability of the resource and more to 
how the resource performs in relation to 
what it is trying to achieve for the user or 
the organization. When carrying out any 
kind of function testing, real or proxy 
users are employed. The study results 
depend crucially on what problems the 
users are asked to solve, so the “tasks” 
employed in these studies (e.g., case scenarios)
should correspond as closely as possible
to those to which the resource will be
applied in real working life. Such tasks vary
across a set of dimensions—for example, 
difficulty, problem domain, and urgency— 
so it is very important to employ in these 
studies a set of tasks that spans the range of 
these dimensions. 


6. Field function studies are a variant of laboratory
function testing in which the
resource is “pseudo-deployed” in a real
work place and employed by real users 
with real problems—but only up to a 
point. In field function tests, although the 
resource is used by real users with real 
tasks, the users have no immediate access 
to the output or results of their interaction 
with the resource that might influence 
their real decisions or actions, so no effects 
on these can occur. The output is recorded 
for later review by the investigators, and 
perhaps by the users themselves. 

Studies of the effect or impact of infor- 
mation resources on users and problems are in 
many ways the most demanding. As the focus 
of study moves from function testing, which 
is always hypothetical, to possible effects on 
health decisions or care processes, the conduct 
of research, or educational practice, there is 
often the need to establish cause and effect 
and to submit studies to external review. 

7. In laboratory user effect studies, simulated
user decisions or actions are studied. 
Practitioners employ the resource in a lab- 
oratory setting and are asked what they 
“would do” with the results or advice the 
resource generates, but no action is taken. 
Laboratory user effect studies can be con- 
ducted with prototype or released versions 
of the resource, outside the practice envi- 
ronment. Although such studies involve 
individuals who are representative of the 
“end-user” population, the primary results 
of the study derive from simulated actions, 
so the care of patients or conduct of 
research is not affected by a study of this 
type. An example is a study in which junior 
physicians viewed realistic prescribing sce- 
narios and interacted with a simulated pre- 
scribing tool while they were exposed to 
simulated prescribing alerts of various 
kinds and the rate of prescribing errors 
was measured (Scott et al. 2011). 

8. In a field user effect study, the actual
actions or decisions of the users of the 
resource are studied after the resource is 
formally deployed. This type of study pro- 
vides an opportunity to test whether the 
resource is actually used by the intended 
users, whether they obtain accurate and 


useful information from it, and whether 
this use affects their decisions and actions 
in significant ways. In field user effect stud- 
ies, the emphasis is on the behaviors and 
actions of users, and not the health out- 
comes or consequences of these behaviors. 
For example, one study examined the 
impact of SMS reminders on anti-retroviral 
medication adherence in Africans with 
HIV and showed a dramatic improvement 
(Lester et al. 2010). 


9. Problem impact studies are similar to field
user effect studies in many respects, but
differ profoundly in the questions that are 
the focus of exploration. Problem impact 
studies examine the extent to which the 
original health problem that motivated 
creation or deployment of the information 
resource has been addressed. Often this 
requires investigation that looks beyond 
the actions of care providers, researchers, 
or patients to examine the consequences 
of these actions. In the Lester study of 
SMS alerts (Lester et al. 2010), increased 
adherence to antiretroviral therapy (a user 
action) was also accompanied by improved 
viral load suppression. However, user 
effects cannot be assumed to engender 
problem impacts. For example, an 
information resource designed to reduce 
medication errors may affect the behavior 
of clinicians who employ the resource 
relative to those who do not, but for a vari- 
ety of reasons, the actual incidence of 
harmful medication episodes remains 
unchanged. In such an instance, clinical 
pharmacists who review orders may be 
catching and correcting these errors before 
patients are affected. In other examples, 
individuals may be motivated to exercise 
through interaction with a wearable infor- 
mation resource but fail to meet weight 
loss objectives because they cannot afford 
concomitant changes in their diets. In still 
other domains, an information resource 
may be widely used by researchers to 
access biomedical information, as deter- 
mined by a user effect study, but a subse- 
quent problem impact study may or may 
not reveal effects on scientific productivity. 
New educational technology may change
the ways students learn, but may or may 
not increase their performance on stan- 
dardized examinations. Problem impact 
studies, as well as user effect studies, will 
be sensitive to unintended consequences. 
Sometimes, the solution to the target 
problem creates other, unintended and 
unanticipated problems that can affect 
perceptions of success. As electronic mail 
became an almost universal mode of com- 
munication, almost no one anticipated the 
problems of “spam” or “phishing”. 
13.4.3 Factors Distinguishing the Nine Study Types


Table 13.2 further distinguishes the nine
study types, as described above, using a set of 
key differentiating factors discussed in detail 
in the paragraphs that follow. 


Table 13.2 Factors distinguishing the nine demonstration study types

Study type | Study setting | Version of the resource | Sampled users | Sampled tasks | What is observed
1. Needs assessment | Field | None, or pre-existing resource to be replaced | Anticipated resource users | Actual tasks | User skills, knowledge, decisions or actions; care processes, costs, team function or organization; patient outcomes
2. Design validation | Development lab | None | None | None | Quality of design method or team
3. Structure validation | Lab | Prototype or released version | None | None | Quality of resource structure, components, architecture
4. Usability test | Lab | Prototype or released version | Proxy, real users | Simulated, abstracted | Speed of use, user comments, completion of sample tasks
5. Laboratory function study | Lab | Prototype or released version | Proxy, real users | Simulated, abstracted | Speed and quality of data collected or displayed; accuracy of advice given...
6. Field function study | Field | Prototype or released version | Proxy, real users | Real | Speed and quality of data collected or displayed; accuracy of advice given...
7. Lab user effect study | Lab | Prototype or released version | Real users | Abstracted, real | Impact on user knowledge, simulated/pretend decisions or actions
8. Field user effect study | Field | Released version | Real users | Real | Extent and nature of resource use. Impact on user knowledge, real decisions, real actions
9. Problem impact study | Field | Released version | Real users | Real | Impact on targeted health status


The setting in which the study takes place Studies of the design process, the
resource structure, and many resource functions
are typically conducted outside the active
health care or decision environment, in a “laboratory”
setting. Studies to elucidate the need for
a resource and studies of its impact on users 
would usually take place in settings—known 
generically as the “field”—where health care 
practitioners, researchers, students, patients or 
administrators are making real choices in the 
real world. These studies can take place only in 
settings where the resource is available for use 
and where health care or health behavior activ- 
ities occur and/or where other important deci- 
sions are made. To an investigator planning 
such studies, an important consideration that 
determines the kind of study possible is the 
degree of access to resource users in the field 
setting. If, as a practical matter, access to the 
field setting is very limited, then several study 
types listed in Tables 13.1 and 13.2 are either
not possible, or the validity of the field studies 
that are possible will be reduced. 
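
As a rough illustration of how limited field access narrows the options, the sketch below encodes the study setting column of Table 13.2 and filters out the field-based study types. It is a simplification under stated assumptions: access is treated as all-or-nothing, whereas the text notes that limited access may merely reduce the validity of field studies. The names STUDY_SETTING and feasible_study_types are hypothetical.

```python
# Study setting for each study type, transcribed from Table 13.2.
STUDY_SETTING = {
    "Needs assessment": "field",
    "Design validation": "development lab",
    "Structure validation": "lab",
    "Usability test": "lab",
    "Laboratory function study": "lab",
    "Field function study": "field",
    "Lab user effect study": "lab",
    "Field user effect study": "field",
    "Problem impact study": "field",
}


def feasible_study_types(field_access, settings=STUDY_SETTING):
    """Return the study types that can be run given access to the field.

    Assumption for illustration only: without field access, every study
    type whose setting is "field" is treated as not possible.
    """
    return [
        name
        for name, setting in settings.items()
        if field_access or setting != "field"
    ]
```

With field_access=False the sketch returns only the five laboratory-based study types, which shows how much of the evaluation agenda depends on negotiating access during the first stage of a study.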


The version of the resource used For some 
kinds of studies, a simulated or prototype ver- 
sion of the resource may be sufficient (Scott 
et al. 2011; Russ et al. 2014), whereas for studies 
in which the resource is employed by intended 
users to support real decisions and actions, a 
fully robust and reliable version is needed (e.g., 
Lester et al. 2010). 


The sampled resource users Information 
resources nearly always function through 
interaction with one or more such “users” 
who bring to the interaction their own domain 
knowledge and knowledge of how to operate 
the resource. Exceptions might include closed 
loop control systems such as smart insulin 
pumps and pharmacy robots, but even in these 
cases, humans who set the parameters for the 
otherwise autonomous operations of these 
devices can be seen as a form of “users”. In 
some types of evaluation studies, the users of 
the resource are not the end users for whom 
the resource is ultimately designed, but are 
members of the development or evaluation 
teams, or other individuals who can be called 
“proxy users”, chosen because they are conve- 
niently available or because they are afford- 
able. (For example, senior medical students 


can sometimes be used as proxies for more 
experienced physicians.) In other types of 
studies, the users are sampled from the end- 
users for whom the resource is ultimately 
designed. The type of users employed gives 
shape to a study and can affect its results pro- 
foundly. The usability of a resource is easily 
overestimated if the “users” in a study are 
those who designed or are otherwise familiar 
with the resource. As another example, volun- 
teer users of a consumer-oriented resource 
such as a dieting app may be more motivated 
than the general population the resource is 
designed to benefit. 


The sampled tasks For function, effect, and 
impact studies, the users included in the study 
actually interact with the resource. This requires 
tasks, often clinical or scientific decision prob- 
lems, for the users to undertake. These tasks 
can be invented or simulated; they can be 
abstracted versions of real cases or problems, 
shortened to suit the specific purposes of the 
study; or they can be live cases or research 
problems as they present to resource users in 
their everyday work. Clearly, the kinds of tasks 
employed, and how they are sampled, in a 
study have serious implications for the study 
results and the conclusions that can be drawn 
from them. 


The observations that are made All evalua- 
tion studies entail observations that generate 
data that are subsequently analyzed to generate
the study results. As seen in Table 13.2,
many different kinds of observations can be 
made. 

In the paragraphs above we have intro- 
duced the term “sampled” for both tasks and 
users. It is important to establish that in real 
evaluation studies, tasks and users are always 
sampled from some real or hypothetical pop- 
ulation. Choosing appropriate methods to 
sample users and tasks is a major challenge in 
evaluation study design since it is never pos- 
sible, practical, or desirable to study everyone 
doing everything possible with an information 
resource. Sampling issues are addressed later 
in this chapter. 
13.5 Conducting Investigations:
Collecting and Drawing
Conclusions from Data


13.5.1 Two Grand Approaches
to Study Design, Data
Collection, and Analysis


Several authors have developed classifications,
or typologies, of evaluation methods
or approaches. Among the best is that developed
by Ernest House (1980). Even
though it is somewhat old, a major advantage
of House’s typology is that each approach is 
linked elegantly to an underlying philosophi- 
cal model, as detailed in his book. This clas- 
sification divides current practice into eight 
discrete approaches, four of which may be 
viewed as objectivist and four of which may 
be viewed as subjectivist. While the distinc- 
tions between the eight approaches House 
describes are beyond the scope of this chap- 
ter, the grand distinction between objectivist 
and subjectivist approaches is very important. 
Note that these approaches are not entitled 
“objective” and “subjective”, because those 
labels carry strong and fundamentally mis- 
leading connotations of scientific precision in 
the former case and of idiosyncratic impreci- 
sion in the latter. We will see in this section 
how both objectivist (often called quantita- 
tive) and subjectivist (often called qualitative) 
approaches find rigorous application across 
the range of study types described earlier. 

To appreciate the fundamental differ- 
ence between the approaches, it is necessary 
to address their very different philosophical 
roots. The objectivist approaches derive from 
a logical-positivist philosophical orienta- 
tion—the same orientation that underlies the 
classic experimental sciences. The major prem- 
ises underlying the objectivist approaches are 
as follows: 
• In general, the attributes of interest are
  properties of the resource under study, or
  the people interacting with it. More
  specifically, this premise suggests that the
  merit and worth of an information
  resource (the attributes of most interest
  in evaluation) can in principle be measured,
  with all observations yielding the
  same result. It also assumes that an
  investigator can measure these attributes
  without affecting how the resource under
  study functions or is used.

• Rational persons can and should agree
  on what attributes of a resource are
  important to measure and what results
  of these measurements would be
  identified as the most desirable, correct,
  or positive outcome. In informatics,
  making this assertion is tantamount to
  stating that perfection in resource or
  user performance can always be identified
  and that all rational individuals can be
  brought to consensus on what
  “perfection” is.

• Because numerical measurement allows
  precise statistical analysis of performance
  over time or performance in comparison
  with some alternative, numerical
  measurement is prima facie superior to a verbal
  description. Verbal, descriptive data
  (generally known as qualitative data) are
  thus useful in only preliminary studies to
  identify hypotheses for subsequent, more
  precise analysis using quantitative methods.

• Through these kinds of comparisons, it is
  possible to demonstrate to a reasonable
  degree that a resource is or is not superior
  to what it replaced, or to a competing
  resource.


Contrast these assumptions with the set of
assumptions that derives from an
intuitionist-pluralist or de-constructivist
philosophical position that spawns a set of
subjectivist approaches to evaluation:

• What is observed about a resource
  invariably depends in fundamental ways
  on the observer. Different observers of the
  same resource might legitimately come to
  different conclusions. Both can be objective
  in their appraisals even if they do not
  agree; it is not necessary that one is right
  and the other wrong. Important insight
  can derive from both, and from their
  juxtaposition.

• Merit and worth must be explored in
  context. The value of a resource emerges
  through study of the resource as it
  functions in a particular decision-making
  environment.

• Individuals and groups can legitimately
  hold different perspectives on what
  constitutes the most desirable outcome of
  introducing a resource into an environment.
  There is no reason to expect them to agree,
  and it may be counterproductive to even
  try to lead them to consensus. An important
  aspect of an evaluation would be to
  document the ways in which they disagree.

• Verbal description can be highly illuminating.
  Qualitative data are valuable, in and of
  themselves, and can lead to conclusions as
  convincing as those drawn from quantitative
  data. The value of qualitative data, therefore,
  goes far beyond that of identifying issues for
  later more “precise” exploration using
  quantitative methods.

• Evaluation should be viewed as an exercise
  in argument or rhetoric, rather than as a
  demonstration, because every study can
  “appear equivocal when subjected to
  serious scrutiny” (House 1980).


The approaches to evaluation that derive
from this subjectivist philosophical perspective
may seem strange, imprecise, and unscientific
when considered for the first time.
This perception stems in large part from
the widespread acceptance of the objectivist
worldview in biomedicine. Over the last
two decades, however, thanks to some early
high quality studies (e.g., Forsythe et al.
1992; Sheikh et al. 2011; Wright et al. 2015),
the importance and utility of these subjectivist
approaches in evaluation have been
established within biomedical informatics.
It is important for people trained in classic
experimental methods at least to understand,
and possibly even to embrace, the subjectivist
worldview if they are to conduct fully
informative evaluation studies.

Fig. 13.2 Generic structure depicting an
objectivist investigation


13.5.2 Conduct of Objectivist 
Studies 


Figure 15.2 expands the generic process for
conducting evaluation studies to illustrate the
steps involved in conducting an objectivist study.
Figure 13.2 illustrates the linear sequence in
which the investigation portion of an objectivist 
evaluation study is typically carried out. We will 
focus in this chapter on issues of study design 
that are most challenging in objectivist stud- 
ies and we will further focus on that subset of 
objectivist study designs which are comparative 
in nature. More details on the other aspects of 
objectivist studies are available in standard ref- 
erences on experimental design (Campbell and 
Stanley 1963; Rothman et al. 2008). 


13.5.2.1 Structure and Terminology 
of Comparative Studies 

Most objectivist evaluations make a comparison
of some type. For informatics, aspects of
performance of individuals, groups, or
organizations with the information resource are
often compared to those same aspects without the
resource, with some alternative resource, or with
an alternate design of the same resource. After
identifying a sample of participants for the
study, the investigator assigns each participant,
often randomly, to one condition or to a set of
conditions, and some outcomes of interest are
measured for each participant. The averaged values
of these outcomes are then compared across the
conditions. If all other factors are controlled,
either directly through the design of the study
or statistically through randomization, then
any measured difference in the averaged outcomes
can be attributed to the resource.

This relatively simple description of a
comparative study belies the many issues that
affect its design, execution, and ultimate
usefulness. To understand these issues, we
must first develop a precise terminology.

The participants in a study are the enti- 
ties about which/whom data are collected. It 
is key to emphasize that participants are often 
people—for example, care providers or recipi- 
ents—but also may be information resources, 
groups of people, or organizations. Because 
many of the activities in informatics are con- 
ducted in hierarchical settings with naturally 
occurring groups (a “physician’s patients”; 
the “researchers in a laboratory”), investiga- 
tors must, for a particular study, define the 
participants carefully and consistently. 

Variables are specific characteristics of 
the participants or the setting that either are 
measured purposefully by the investigator or 
are self-evident properties that do not require 
measurement. Some variables may take a con- 
tinuous range of values while others have a 
discrete set of levels, corresponding to each 
of the possible measured values. For example, 
in a hospital setting, physician members of a 
ward team can be classified as residents, fel- 
lows, or attending physicians. In this case, the 
variable “physician’s level of qualification” is 
said to have three discrete “levels”. 

The dependent variables are those vari- 
ables in the study that capture the outcomes 
of interest to the investigator. (For this rea- 
son, dependent variables are also called out- 
come variables.) A study may have one or 
more dependent variables. In a typical study,
the dependent variables will be computed, for 
each participant, as an average over a number 
of tasks. For example, clinicians’ diagnostic 
performance may be measured over a set of 
cases, or “tasks”, that provide a range of diag- 
nostic challenges. 

The independent variables are included 
in a study to explain the measured values 
of the dependent variables. For example, 
whether an information resource is avail- 
able, or not, to support certain clinical tasks 
could be the major independent variable in a 
study designed to evaluate the effects of that 
resource. 
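The terminology above can be made concrete with a small sketch. The data, participant names, and condition labels below are hypothetical, invented purely for illustration. The sketch computes the dependent variable for each participant as an average over tasks and then averages those values within each level of the independent variable (resource available or not).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations: (participant, condition, task, score).
# A score of 1 means a correct diagnosis on that task, 0 an incorrect one.
observations = [
    ("dr_a", "with_resource", "case1", 1),
    ("dr_a", "with_resource", "case2", 1),
    ("dr_a", "with_resource", "case3", 0),
    ("dr_b", "with_resource", "case1", 1),
    ("dr_b", "with_resource", "case2", 1),
    ("dr_b", "with_resource", "case3", 1),
    ("dr_c", "without_resource", "case1", 0),
    ("dr_c", "without_resource", "case2", 1),
    ("dr_c", "without_resource", "case3", 0),
    ("dr_d", "without_resource", "case1", 1),
    ("dr_d", "without_resource", "case2", 1),
    ("dr_d", "without_resource", "case3", 0),
]


def participant_means(rows):
    """Dependent variable per participant: mean score over that participant's tasks."""
    scores = defaultdict(list)
    condition_of = {}
    for participant, condition, _task, score in rows:
        scores[participant].append(score)
        condition_of[participant] = condition
    return {p: (condition_of[p], mean(s)) for p, s in scores.items()}


def condition_means(rows):
    """Average the per-participant values within each level of the independent variable."""
    by_condition = defaultdict(list)
    for condition, value in participant_means(rows).values():
        by_condition[condition].append(value)
    return {c: mean(v) for c, v in by_condition.items()}


if __name__ == "__main__":
    for condition, value in condition_means(observations).items():
        print(f"{condition}: {value:.2f}")
```

Averaging within participants first keeps the participant, rather than the individual task, as the unit that is compared across conditions.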

Measurement challenges almost always 
arise in the assessment of the outcome or 
dependent variable for a study. Often, for 
example, the dependent variable is some type 
of performance measure that invokes con- 
cerns about reliability (precision) and validity 
(accuracy) of measurement. The indepen- 
dent variables may also raise measurement 
challenges. When the independent variable is 
marital status, for example, the measurement 
problems are relatively straightforward. If the 
independent variable is an attitude or other 
“state of mind”, such as computer or health 
literacy, profound measurement challenges 
can arise. 


13.5.2.2 Issues of Measurement 


Measurement is the process of assigning a 
value corresponding to the presence, absence, 
or degree of a specific attribute in a specific 
object. When we speak specifically of measure- 
ment, it is customary to use the term “object” 
to refer to the entity on which measurements 
are made. Measurement usually results in 
either (1) the assignment of a numerical score 
representing the extent to which the attribute 
of interest is present in the object, or (2) the 
assignment of an object to a specific category. 
Taking and recording the temperature (attri- 
bute) of a person (object) is an example of the 
process of measurement. 

From the premises underlying objectivist
studies (see Sect. 13.5.1), it follows that
proper execution of such studies requires
careful and specific attention to methods of
measurement. It can never be assumed that
attributes of interest are measured without
error. Accurate and precise measurement 
must not be an afterthought and indeed, most 
scientific progress occurred once challenging 
measurement problems, such as measuring 
the speed of light or mass of an electron, were 
solved. Measurement is of particular impor- 
tance in biomedical informatics because, as 
a relatively young field, informatics does not 
have a well-established tradition of “variables 
worth measuring” or proven instruments for 
measuring them. By and large, people plan- 
ning studies in informatics are faced first with 
the task of deciding what to measure and then 
with that of developing their own measure- 
ment methods. For most researchers, these 
tasks prove to be harder and more time-con- 
suming than initially anticipated. 

We can underscore the importance of 
measurement by establishing a formal distin- 
ction between studies undertaken to develop 
methods for making measurements, which we 
call measurement studies, and the subsequent 
use of these methods to address questions of 
direct importance in informatics, which we 
call demonstration studies. Measurement 
studies seek to determine how accurately 
and precisely an attribute of interest can be 
measured in a population of objects. In an 
ideal objectivist measurement, which never 
actually occurs, all observers will agree on 
the result of the measurement. Any disagree- 
ment is therefore due to error, which should 
be minimized. The more agreement among 
observers or across observations, the better 
the measurement. Measurement procedures 
developed and validated through measure- 
ment studies provide researchers with the 
measurement instruments they need to con- 
duct demonstration studies that directly 
address questions of substantive and practi- 
cal concern to the stakeholders for an evalua- 
tion study. Once we know how accurately we 
can measure an attribute using a particular 
procedure and instrument, we can employ 
the measured values of this attribute as a 
variable in a demonstration study to draw 
inferences about the performance, percep- 
tions, or effects of an information resource. 
For example, once measurement studies have 
determined how accurately and precisely the
usability of a class of information resources
can be measured, using a specific measure- 
ment method, a subsequent demonstration 
study could explore which of two resources 
that are members of this class has greater 
usability. 
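As one illustration of what a measurement study might compute, the sketch below quantifies agreement between two observers who independently assign the same objects to categories. It uses Cohen's kappa, a commonly used chance-corrected agreement statistic; the chapter does not prescribe this statistic, so its use here is an assumption, and the ratings in the usage lines are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two observers rating the same objects."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Proportion of objects on which the two observers agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each observer's category frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used one identical category throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical usability judgements of eight screens by two observers.
    observer_1 = ["good", "good", "poor", "good", "poor", "poor", "good", "good"]
    observer_2 = ["good", "poor", "poor", "good", "poor", "good", "good", "good"]
    print(round(cohens_kappa(observer_1, observer_2), 2))
```

A value near 1 indicates that the observers agree far more often than chance would predict; a value near 0 indicates agreement no better than chance, which would suggest that the measurement method needs further work before it is used in a demonstration study.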

A detailed discussion of measurement 
methods and issues is beyond the scope of 
this chapter but these topics are discussed in 
the Friedman and Wyatt textbook previously 
referenced. The bottom line is that investi- 
gators should know that their measurement 
methods will be adequate before they col- 
lect data for their studies. If the measures 
to be used do not have an established track 
record it is necessary to perform a measure- 
ment study, involving data collection on a 
small scale, to establish the adequacy of all 
measurement procedures (e.g., Ramnarayan 
et al. 2003; Demiris et al. 2000). Even if the 
measurement procedures of interest do have 
a track record in a particular setting, they 
may not perform equally well in a different 
environment, so a further measurement study 
may still be necessary. Researchers should 
always ask themselves, “How good are my 
measures in this particular setting?” whenever 
they are planning a study, before they proceed 
to the demonstration phase. The importance 
of measurement studies for informatics was 
first explained by Michaelis and co-workers 
(1990) and later expanded by Friedman and 
Abbas (2003). A study by Scott et al. (2019) 
documented that informatics studies continue 
to underappreciate the importance of mea- 
surement issues. 

Whenever possible, investigators planning
demonstration studies should re-use established
measurement methods with a “track record”
rather than develop their own. Increasingly,
compendia of measurement instruments
specifically for health informatics are
available on the Internet.⁴


⁴ Examples include: https://www.gem-beta.org/public/home.aspx
and https://healthit.ahrq.gov/health-it-tools-and-resources/evaluation-resources/health-it-survey-compendium-search
(both accessed 11.20.18).
13.5.2.3 Sampling Strategies

Selection of Participants

The participants selected for objectivist stud- 
ies must resemble those to whom the evalu- 
ator and others responsible for the study 
wish to apply the results. For example, when 
attempting to quantify the likely impact of a 
clinical information resource on clinicians at 
large, there is no point in studying its effects 
on the clinicians who helped develop it, espe- 
cially if they built it, as they are likely to be 
much more familiar with the resource than 
average practitioners. Characteristics of clini- 
cal participants that typically need to be taken 
into account include age, experience, role, 
type of work environment, attitude toward 
digital information resources, and extent of 
their involvement in the development of the 
resource. Analogous factors would apply to 
patients or health care consumers as partici- 
pants. 


Volunteer Effect

A common bias in the selection of participants 
is the use of volunteers. It has been established 
in many areas that people who volunteer as 
participants, whether to complete question- 
naires, participate in psychology experiments, 
or test-drive new cars or other technologies, are 
atypical of the population at large (e.g., Pinsky 
et al. 2007). Although evaluations are often 
the “art of the possible”, and all participants 
in studies are ultimately volunteers in the sense 
that no one can or should be coerced to partic- 
ipate, it is important to take steps to make the 
study participants as representative as possible 
of the resource’s ultimate user community. A 
systematic approach to participant selection 
would first identify the full population of users 
and then sample from that population either 
randomly or sometimes purposively to be sure 
the sample includes participants with charac- 
teristics seen as essential to a thorough test of 
the resource. Once a sample is selected, follow- 
up to invitation letters and other mechanisms 
can achieve as close to 100% recruitment of 
the selected sample as possible. Relatively 
modest financial incentives can significantly
boost participation rates. 
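The systematic approach described above, in which the full user population is enumerated first and then sampled, can be sketched as follows. The population records and the `role` field are hypothetical. Stratified random sampling is shown as one possible way to guarantee that characteristics considered essential are represented; a truly purposive selection would instead rely on the investigator's judgment.

```python
import random


def simple_random_sample(population, k, seed=0):
    """Draw k members at random, without replacement, from the full population."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def stratified_sample(population, key, per_stratum, seed=0):
    """Draw per_stratum members at random from every stratum defined by key."""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(key(person), []).append(person)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Hypothetical user population: twelve clinicians in three roles.
population = [
    {"name": f"clinician_{i}", "role": role}
    for i, role in enumerate(["resident", "fellow", "attending"] * 4)
]

if __name__ == "__main__":
    for person in stratified_sample(population, lambda p: p["role"], per_stratum=2):
        print(person["name"], person["role"])
```

Fixing the seed makes the selection reproducible, which helps when the sampling procedure has to be documented in the study report.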


Number of Participants Needed

The financial investment required for an evalu- 
ation study depends critically on the number 
of participants needed. The required number 
in turn depends on the purpose and design of 
the study. In usability studies, discussed below, 
a great deal can be learned from a relatively 
small sample.⁵ In subjectivist studies,
participant selection can be a dynamic process where
study participants identify other participants. 
In objectivist user effect or problem impact 
studies, sample sizes are directed by the pre- 
cision of the answer required from the study 
and the risk investigators are willing to take of 
failing to detect a significant effect. (All other 
things being equal, the larger the sample size, 
the greater the likelihood of detecting an effect 
of a specified size using a predetermined cri- 
terion for statistical significance.) Statisticians 
can advise on this point and carry out power 
analyses that estimate the sample size required.
Sometimes, in order to recruit the required 
number of participants, an element of vol- 
unteer effect must be tolerated; often there is 
a trade-off between obtaining a sufficiently 
large sample and ensuring that the sample is 
representative. Also, the impact of sample size 
on effect detection is non-linear. The value of 
adding, say, 10 more representative partici- 
pants to a sample of 100 is far less than that of 
adding 10 more participants to a sample of 30. 
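A minimal sketch of the kind of power analysis a statistician might carry out is given below. It assumes a comparison of two independent proportions and uses the standard normal-approximation formula; the default significance criterion (0.05) and power (0.80) are conventional choices, not values taken from this chapter. A second function illustrates the non-linear effect of sample size: the standard error of a mean shrinks in proportion to one over the square root of n, so ten extra participants matter more at n = 30 than at n = 100.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect p1 versus p2 (two-sided)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


def se_reduction(n, extra):
    """Fractional drop in the standard error of a mean when n grows by extra."""
    return 1 - sqrt(n / (n + extra))


if __name__ == "__main__":
    # Hypothetical rates: an event occurring in 10% of control cases and 5% of
    # intervention cases.
    print("per group:", n_per_group(0.10, 0.05))
    print("gain from 30 -> 40:  ", round(se_reduction(30, 10), 3))
    print("gain from 100 -> 110:", round(se_reduction(100, 10), 3))
```

The formula ignores complications such as clustering of cases within participants, so it should be read as a first approximation rather than a substitute for statistical advice.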


⁵ See https://www.nngroup.com/articles/why-you-only-need-to-test-with-5-users/
(accessed 11.20.18).


Selection of Tasks

In the same way that participants must be
carefully selected to resemble the people
likely to use the information resource, any
tasks the participants complete in the study
must also resemble those that will generally be
encountered where the information resource
is deployed. Thus when evaluating a clinical
order-entry system intended for general
use, it would be unwise to use only complex
cases from, for example, a pediatric intensive
care setting. Although the order-entry
system might well be of considerable benefit
in intensive care cases, it is inappropriate to 
generalize results from such a limited sample 
to the full range of cases seen in ambulatory 
pediatrics. An instructive example is pro- 
vided by the study of Van Way et al. (1982) 
who developed a scoring system for diagnos- 
ing appendicitis and studied the resource’s 
accuracy using exclusively patients who had 
undergone surgery for suspected appendicitis. 
Studying this group of patients had the ben- 
efit of allowing the true cause of the abdomi- 
nal pain to be obtained with near certainty as 
a by-product of the surgery itself. However, in 
these patients who had all undergone surgery 
for suspected appendicitis the symptoms were 
more severe and the incidence of appendicitis 
was five to ten times higher than for the typi- 
cal patient for whom such a scoring system 
would be used. Thus, the accuracy obtained 
with postsurgical patients would be a poor 
estimate of the system’s accuracy in routine 
clinical use. 
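One part of the problem in the appendicitis example can be shown with a short calculation. The sketch below holds the sensitivity and specificity of a hypothetical scoring system fixed and varies only the proportion of patients who truly have the disease. The figures are invented for illustration and are not taken from Van Way et al. The sketch does not model the second issue raised above, that more severe symptoms in surgical patients may also change sensitivity and specificity themselves.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a patient flagged by the scoring system truly has the disease."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical scoring system with 90% sensitivity and 90% specificity.
    for label, prevalence in [("surgical sample", 0.50), ("routine use", 0.10)]:
        ppv = positive_predictive_value(0.90, 0.90, prevalence)
        print(f"{label}: prevalence {prevalence:.0%}, PPV {ppv:.0%}")
```

With a fivefold difference in prevalence, the same system goes from being right about nine in ten flagged patients to being right about only half of them, which is why results from a highly selected task sample generalize poorly.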

If the performance of an information 
resource is measured on a number of hand- 
picked tasks, the functions it performs may 
appear spuriously complete and its usability 
overestimated. This is especially likely if these 
cases are similar to, or even identical with, a 
“training” set of tasks used to develop or tune 
the information resource before the evaluation 
is carried out. When a statistical model that 
powers an information resource is carefully 
adjusted to achieve maximal performance on 
training data, this adjustment may worsen its 
accuracy on a fresh set of data due to a phe- 
nomenon called overfitting (Wasson 1985; 
Srivastava et al. 2014; Ravi et al. 2017). Thus, 
it is important to obtain a new set of tasks 
and evaluate performance on this new test set, 
a process called cross-validation. Sometimes 
developers omit tasks from a sample if they 
do not fall within the scope of the information 
resource, for example if the final diagnosis for a 
case is not represented in a diagnostic system’s 
knowledge base. This practice violates the prin- 
ciple that a test set should be representative of 
all tasks in which the information resource will 
be used, and will overestimate its accuracy in 
the real world. 
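The effect of tuning a model too closely to its training cases can be demonstrated with a small simulation. Everything in the sketch is hypothetical: the cases have a single numeric finding, the true rule is a simple threshold, and 20% of labels are flipped to mimic noise. The "model" is a nearest-neighbour lookup, chosen only because it reproduces its training cases exactly; it is not a method described in this chapter.

```python
import random


def make_cases(n, rng, noise=0.2):
    """Hypothetical cases: one finding x; the true rule is x > 0.5, with some labels flipped."""
    cases = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label
        cases.append((x, label))
    return cases


def memorizing_model(training):
    """1-nearest-neighbour: fitted so closely that it reproduces every training case."""
    def predict(x):
        return min(training, key=lambda case: abs(case[0] - x))[1]
    return predict


def accuracy(predict, cases):
    """Proportion of cases for which the prediction matches the recorded label."""
    return sum(predict(x) == label for x, label in cases) / len(cases)


rng = random.Random(42)
training_set = make_cases(200, rng)
test_set = make_cases(200, rng)  # fresh cases never seen during development
model = memorizing_model(training_set)
train_acc = accuracy(model, training_set)
test_acc = accuracy(model, test_set)

if __name__ == "__main__":
    print(f"accuracy on training cases: {train_acc:.2f}")
    print(f"accuracy on fresh test set: {test_acc:.2f}")
```

Accuracy on the training cases is perfect by construction, while accuracy on the fresh test set is lower, so only the second figure is a fair estimate of how the resource would perform on new tasks.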


13.5.2.4 Control Strategies 
in Comparative Studies 

One of the most challenging questions in 
quantitative comparative study design is how 
to obtain control (Liu et al. 2011). In the con- 
text of informatics, control mechanisms seek to 
account for all factors in a study environment 
that are not attributable to the information 
resource. In the following sections, we review 
a series of control strategies. We employ, as a 
running example of an information resource 
under study, a reminder system that prompts 
physicians to order prophylactic antibiotics for 
orthopedic patients to prevent postoperative 
infections. In this example, the intervention is 
the deployment of the reminder system; the 
participants are the physicians; and the tasks 
are the surgical cases. The dependent variables 
are physicians’ ordering of antibiotics (a user 
effect measure) and the rate of postoperative 
infections (a problem impact measure). As 
such, this is an example of a study which
straddles two of the types in Table 13.1.


Descriptive (Uncontrolled) Studies

In the simplest possible design, an uncon- 
trolled or descriptive study, we deploy the 
reminder system and then make our mea- 
surements. There is no independent variable 
as such. Suppose that we discover that the 
overall postoperative infection rate is 5% and 
that physicians order prophylactic antibiot- 
ics in 60% of orthopedic cases. Although we 
have two measured dependent variables, it is 
hard to draw meaningful conclusions from 
these results. Although the results might be
informative from a patient safety perspective,
it is not possible to draw any conclusions about
the effect of the resource.


Historically Controlled Experiments

As a first improvement to a descriptive study, 
from the perspective of control, consider a 
historically controlled experiment, sometimes 
called a before-after study (Wyatt & Wyatt 
2003). The investigator makes baseline measure- 
ments of antibiotic ordering and postoperative 
infection rates before the information resource 
is installed, and then makes the same measure- 
ments after the information resource is in rou- 
tine use. The independent variable is time and 
has two levels: before and after resource
installation. Let us say that, at baseline, the
postoperative infection rate was 10% and physicians
ordered prophylactic antibiotics in only 40% of
cases; but the post-intervention figures are 5%
and 60%, respectively (see Table 13.3).

The investigators may claim that the halv- 
ing of the infection rate can be safely ascribed 
to the information resource, especially because 
it was accompanied by a substantial improve- 
ment in physicians’ antibiotic prescribing. 
Many other factors might, however, have 
changed in the interim to cause these results, 
especially if there was a long interval between 
the baseline and post-intervention measure- 
ments. New staff could have been employed; 
the mix of patients could have changed; new 
prophylactic antibiotics may have been intro- 
duced; quality improvement meetings may 
have highlighted the infection problem and 
thus caused greater clinical awareness; and/or 
incentive programs may have been introduced 
to reward prescribing. Simply assuming that 
the reminder system alone caused the reduc- 
tion in infection rates is naive. 


Table 13.3 Results from a hypothetical
before-after study of the impact of reminders on
postoperative infection rates

                     Prescribing   Infection
                     rate (%)      rate (%)
Baseline             40            10
Post-intervention    60            5


Simultaneous Nonrandomized Controls

To address some of the problems with historical
controls, we might use simultaneous controls,
which require additional measurements
to be made with a new group of physicians
and their patients who are not influenced by
the prophylactic antibiotic reminder system
but who are subject to any other changes taking
place in the environment.

This study design would be a parallel
group comparative study with simultaneous
controls; if the physicians are given
the option to choose whether to use the
reminder system or not, it is a case control
study. Table 13.4 gives hypothetical
results of such a study, focusing on postoperative
infection rates as a single outcome
measure or dependent variable. The independent
variables are time and group, both
of which have two levels.

Table 13.4 illustrates improvement in
the group where reminders were available, 
but no improvement—indeed a slight deter- 
ioration—where no reminders were available. 
This design provides suggestive evidence of an 
improvement that is most likely to be due to 
the reminder system. 
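The reasoning behind this reading of Table 13.4 can be written out as simple arithmetic. The sketch below uses the hypothetical rates from the table and subtracts the change seen in the control group from the change seen in the reminder group. The name "difference-in-differences" is a label borrowed from the wider evaluation literature and is not a term used in this chapter.

```python
# Hypothetical postoperative infection rates (%) from Table 13.4.
rates = {
    "reminder": {"baseline": 10, "post": 5},
    "control": {"baseline": 10, "post": 11},
}


def change(group):
    """Change in infection rate from baseline to post-intervention, in percentage points."""
    return rates[group]["post"] - rates[group]["baseline"]


def difference_in_differences():
    """Change in the reminder group beyond whatever happened in the control group."""
    return change("reminder") - change("control")


if __name__ == "__main__":
    print("reminder group change:", change("reminder"))
    print("control group change: ", change("control"))
    print("difference in differences:", difference_in_differences())
```

The control group's change stands in for everything else that happened in the environment during the study period, which is exactly the information a before-after study lacks.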

However, even though the controls in this 
example are simultaneous, attribution of the 
effect to the information resources remains 
refutable because there may be some system- 
atic, unknown difference between the clinicians 
and/or patients in the two groups. For example, 
if the two groups comprised the patients and 
clinicians in two adjacent wards, the difference 
in the infection rates could be attributable to 
differences between the wards. Perhaps hospi- 
tal-staffing levels improved in some wards but 
not in others, or there was cross infection by 
a multiple-resistant organism only among the 
patients in the control ward. To overcome such 
criticisms, we could try to measure everything 
that happens to every patient in both wards and 
to build complete profiles of all staff to rule out 
systematic differences. Even then, attribution 
of the effect to the information resources would 
be vulnerable to the accusation that some vari- 
able that we did not measure—and did not 
even know about—explains the difference. An 
alternative strategy is to make the intervention 
and control groups statistically comparable by 
randomizing them. 


Table 13.4 Results of a hypothetical
non-randomized parallel group study of
reminders and postoperative infection rates

                          Reminder    Control
                          group (%)   group (%)
Baseline rate             10          10
Post-intervention rate    5           11


Simultaneous Randomized Controls

The crucial problem in the previous example is 
that, although the controls were simultaneous, 
there may have been systematic, unmeasured 
differences between them and the participants 
receiving the intervention (Liu and Wyatt 2011). 
A simple and effective way of removing sys- 
tematic differences, whether due to known or 
unknown factors, is to randomly assign partici- 
pants to control or intervention groups. Thus, we 
could randomly allocate one-half of the physi- 
cians to receive the antibiotic reminders and the 
remaining physicians to work as they did before. 
We would then measure and compare postop- 
erative infection rates in patients managed by 
physicians in the reminder and control groups. 
Provided that the physicians care only for their 
assigned patients, any difference that is statisti- 
cally “significant” (conventionally, a result that 
is statistically determined to have a probability 
of 0.05 or less of occurring by chance) can be 
attributed reliably to the reminders. 
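Random allocation of the kind described above takes only a few lines. The sketch below shuffles a list of hypothetical physician identifiers and splits it in half; the group labels and the optional `seed` parameter are illustrative choices. Real trials often use more elaborate schemes (for example, allocation concealed from the recruiting staff), which are not shown here.

```python
import random


def randomize(participants, seed=None):
    """Randomly allocate participants to reminder and control groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"reminder": shuffled[:half], "control": shuffled[half:]}


if __name__ == "__main__":
    physicians = [f"physician_{i:02d}" for i in range(1, 11)]  # hypothetical identifiers
    groups = randomize(physicians, seed=2024)
    for name, members in groups.items():
        print(name, members)
```

Because chance alone decides who receives the reminders, known and unknown physician characteristics are expected to balance out across the two groups as the number of participants grows.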

Table 13.5 shows the hypothetical
results of such a study. The baseline infec- 
tion rates in the patients managed by the two 
groups of physicians are similar, as we would 
expect, because the patients were allocated 
to the groups by chance. There is a greater 
reduction in infection rates in patients of 
reminder physicians compared with those 
of control physicians. Because strict random 
assignment means that there was no system- 
atic difference in physician or patient charac- 
teristics between groups, the only systematic 
difference between the two groups of patients
is receipt of reminders by their physicians. 
Provided that the number of patients is 
large enough to provide a sufficient number of 
events (post op infections) for these results to 
be statistically significant (about 250 infections 
observed overall), we would conclude with 
some confidence that providing physicians with 
reminders caused the reduction in infection 
rates. The small reduction, from baseline to 
installation, in infection rates in control cases 
is not unexpected, even in a perfectly random- 
ized study. It could reflect changes in practice 
policy that affected both groups, some cross- 
talk among physicians who work in the same 
setting, or the Hawthorne Effect (whereby 
people’s performance often improves when it 
is studied). These phenomena occur in the real 
world of evaluation and should be expected. 
However, because the pre-post difference in the 
reminder group was larger, an effect due to the 
information resource is likely to have occurred. 
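To show what "statistically significant" means in practice, the sketch below applies a two-proportion z-test to post-intervention infection counts. The counts are hypothetical and are not the figures of Table 13.5. The test also treats patients as independent observations; in the design described above physicians, not patients, are randomized, so a real analysis would need to account for patients being clustered within physicians.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided z-test for a difference between two independent proportions."""
    p_a, p_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no events, or all events, in both groups
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: infections among patients of reminder and control physicians.
    z, p = two_proportion_z_test(events_a=50, n_a=1000, events_b=100, n_b=1000)
    print(f"z = {z:.2f}, p = {p:.5f}")
```

A p-value below the conventional 0.05 threshold would be reported as significant, but the size of the difference and its confidence interval usually tell stakeholders more than the p-value alone.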


13.5.2.5 Drawing Conclusions 
from Observational Data: 
Real World Evidence 


Demonstration studies using a planned,
prospective data collection process share a number
of challenges, including their cost, inevitable
delays in setting up the study and recruiting
participants, and the concern that, because of the
volunteer effect or other biases, the results will
not confidently generalize to typical resource
users, patients, or care settings. The use of
so-called observational data from routinely
generated patient care records, often linked to
administrative or other data sources, could
overcome many of these limitations and offer a
more economical and faster method to address 
evaluation questions related to information 
resources. Ideally, such studies can be performed 
retrospectively with all necessary data drawn 
from existing data repositories and no further 
data collection required. These methods give 
rise to what is coming to be called Real World 
Evidence (RWE). As ever-increasing amounts 
of routinely collected data become available in 
coded digital forms, there is growing interest in 
RWE (Sherman et al. 2016). 

RWE methods are increasingly important 
but also have important limitations. In the 
absence of the kinds of controls described in
the previous section of this chapter, it is difficult
to attribute any observed user effects or prob- 
lem impacts to a specific cause. The analysis 
of observational data typically results in a 
pattern of correlations among the variables 
included in an analysis. This pattern of corre- 
lations must be interpreted with great care. For 
example, if Factor A (for example, extent of 
use of a decision support system) is correlated 
with outcome O (for example, fewer medica- 
tion errors), A is not necessarily a direct cause 
of O. It may be the case that some Factor B 
(for example, clinical workload), which was 
not included in the study but which is corre- 
lated with A, is the true cause of O. There is, 
however, no way to know this, because Factor 
B was not included in the data set used for the 
study. This phenomenon is known as unmea- 
sured confounding, and is the primary source 
of concern when putative causal conclusions 
are drawn from observational data. 
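
A small simulation can make the Factor A, Factor B, and outcome O example concrete. In the sketch below, which uses invented data and assumed effect sizes, workload (B) lowers use of the decision support system (A) and raises medication errors (O), while A has no effect on O at all. The correlation between A and O is nevertheless strong until B is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Factor B: clinical workload (standardized, simulated).
workload = [rng.gauss(0, 1) for _ in range(n)]
# Factor A: decision support use falls as workload rises; no link to errors.
dss_use = [-b + rng.gauss(0, 1) for b in workload]
# Outcome O: medication errors rise with workload only.
errors = [b + rng.gauss(0, 1) for b in workload]

corr_ignoring_b = pearson(dss_use, errors)

# Remove the contribution of B using the coefficients assumed above.
# In a real study this step is impossible when B was never recorded.
dss_use_adj = [a + b for a, b in zip(dss_use, workload)]
errors_adj = [o - b for o, b in zip(errors, workload)]
corr_adjusting_for_b = pearson(dss_use_adj, errors_adj)

print(f"corr(A, O) ignoring B:      {corr_ignoring_b:.2f}")
print(f"corr(A, O) adjusting for B: {corr_adjusting_for_b:.2f}")
```

An analyst who sees only A and O would observe that heavier use of the system goes with fewer errors and might wrongly credit the system.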

Other concerns arise with the quality of 
data drawn from documentation of routine 
care. Data entered by care providers them- 
selves under pressure of time will not be 
expected to be as accurate and precise as data 
entered by an undistracted research assistant 
paid to be a careful observer. Also, incompleteness
in observational data may not be randomly
distributed, so simply increasing the
sample size or the range of included variables 
may make matters worse, not better. A further 
consideration is confounding by indication, in 
which the (usually unconscious) prejudice of 
clinicians leads to biased prescribing or recom- 
mendations to patients to use new, expensive 
or risky apps, online consultations or other 
digital services for those patients with the 
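best, or the worst, prognosis. In these cases,
a straightforward analysis will misestimate
effectiveness (overestimating it when patients with the best
prognosis are favored, underestimating it when those with the
worst prognosis are), so a technique such as propensity
scoring is needed (McMurry et al. 2015).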
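
One common way to use a propensity score is inverse probability weighting, sketched below. This is an illustration under simplifying assumptions, not the procedure of the cited paper: there is a single binary covariate (prognosis), so the propensity score is simply the proportion of patients in each prognosis group who used the app. All counts are invented, and they are built so that the app has no true effect.

```python
from collections import defaultdict


def make_records():
    """Build hypothetical (good_prognosis, used_app, good_outcome) records."""
    # (good_prognosis, used_app, number with good outcome, number in cell)
    cells = [
        (True, True, 64, 80),
        (True, False, 16, 20),
        (False, True, 8, 20),
        (False, False, 32, 80),
    ]
    records = []
    for prognosis, used_app, n_good, n_total in cells:
        records += [(prognosis, used_app, True)] * n_good
        records += [(prognosis, used_app, False)] * (n_total - n_good)
    return records


def naive_difference(records):
    """Unadjusted difference in good-outcome rate, users minus non-users."""
    treated = [y for _, t, y in records if t]
    untreated = [y for _, t, y in records if not t]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


def propensity_by_stratum(records):
    """Estimate P(used_app | prognosis) as a within-stratum proportion."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [treated, total]
    for x, t, _ in records:
        counts[x][0] += int(t)
        counts[x][1] += 1
    return {x: treated / total for x, (treated, total) in counts.items()}


def ipw_difference(records):
    """Difference in good-outcome rate after inverse probability weighting."""
    ps = propensity_by_stratum(records)
    sums = {True: [0.0, 0.0], False: [0.0, 0.0]}  # arm -> [sum w*y, sum w]
    for x, t, y in records:
        weight = 1 / ps[x] if t else 1 / (1 - ps[x])
        sums[t][0] += weight * y
        sums[t][1] += weight
    return sums[True][0] / sums[True][1] - sums[False][0] / sums[False][1]


records = make_records()
print(f"Naive difference:    {naive_difference(records):.2f}")
print(f"Weighted difference: {ipw_difference(records):.2f}")
```

Because clinicians in this invented data set steer the app toward patients with a good prognosis, the naive comparison shows an apparent benefit that disappears after weighting. The adjustment works only for covariates that were recorded; it cannot correct for unmeasured confounding.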

Methods from the field of econometrics 
can be very helpful for establishing causal rela- 
tionships between variables in observational 
studies. One approach is instrumental vari- 
able (IV) methods that attempt to identify an 
“instrument” that usually determines whether 
a treatment is given (Davey Smith et al. 2007). 
A clinical example might arise when trying to 
estimate the benefit of bone marrow transplant
(BMT) in acute myeloid leukemia in
children without performing a randomized 
trial. It can be argued that one can compare 
mortality in this condition between children 
with and without a living sibling, since nearly 
all children with this leukemia and a live sibling
will get a BMT and those without a sibling
are much less likely to have a successful
transplant. Since the presence of a living sib- 
ling is unrelated to whether a child has the dis- 
ease, this kind of observational study might 
be almost as informative as a randomized trial
for determining whether BMT is effective [3]. 
An informatics example would be if some dia- 
betic patients are covered for online consulta- 
tions by their health insurance while others 
are not. As long as we are satisfied that there 
are no systematic differences in disease sever- 
ity or treatment adherence between the two 
patient groups, we could use the IV method 
to estimate the impact of online consultation 
on diabetes control and progression, assum- 
ing that those who are covered to use online 
consultations will usually take that option. 
However, the major challenge in designing 
an IV study is to identify an instrument that 
fulfils the following essential criteria: it usually
dictates whether the intervention is given,
does not affect the outcome except via the
intervention, and is not correlated with other
factors that affect the outcome (Streeter et al. 2017; Gray et al. 2019).
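
For a binary instrument, the simplest IV calculation is the Wald estimator: the difference in mean outcome between the two instrument groups divided by the difference in the proportion who received the intervention. The sketch below applies it to the online consultation example. The data, the HbA1c outcome, and the function name are assumptions made for illustration, and the text above does not prescribe this particular estimator.

```python
from statistics import mean


def wald_estimate(rows):
    """Wald IV estimate from a list of (instrument, treated, outcome) rows.

    instrument and treated are coded 0/1; outcome is numeric.
    """
    outcome_gap = (mean(y for z, _, y in rows if z == 1)
                   - mean(y for z, _, y in rows if z == 0))
    uptake_gap = (mean(d for z, d, _ in rows if z == 1)
                  - mean(d for z, d, _ in rows if z == 0))
    return outcome_gap / uptake_gap


# Hypothetical data: z = insurance covers online consultation,
# d = patient used online consultation, y = HbA1c (%) at follow-up.
rows = (
    [(1, 1, 7.5)] * 80 + [(1, 0, 8.0)] * 20 +
    [(0, 1, 7.5)] * 10 + [(0, 0, 8.0)] * 90
)

estimate = wald_estimate(rows)
print(f"Estimated effect of online consultation on HbA1c: {estimate:.2f}")
```

The denominator shows why the instrument must usually dictate whether the intervention is given: if coverage barely changed uptake, the denominator would be close to zero and the estimate would be unstable.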

In the age of “big data”, observational 
studies generating Real World Evidence will 
become increasingly important, but from the 
perspective of evaluation in informatics, they 
will apply only to field studies, and principally 
to user effect and problem impact studies. The 
seven other study types will remain largely reli- 
ant on prospective methods, although prospec- 
tive studies too can benefit from the existence 
of data marts by incorporating already-avail- 
able data whenever possible. As data quality 
improves, data marts become more compre- 
hensive, and methods to establish causation 
gain increased sophistication, RWE methods 
will continue to mature. For studies addressing 
health outcomes--what this chapter refers to 
as problem impact studies--it is almost inevi- 
table that Real World Evidence approaches will 
assume an important place alongside randomized
designs and in the best possible scenario,
the two will complement each other.

Fig. 13.3 Generic structure depicting a
subjectivist investigation


13.5.3 Conduct of Subjectivist 
Studies 


The objectivist comparative approaches to 
evaluation, described in the previous section, 
are useful for addressing some, but not all, of 
the interesting and important questions that 
challenge investigators in medical informat- 
ics. The subjectivist approaches described in 
this section address the problem of evaluation 
from a very different set of premises. They 
use different but equally rigorous methods. 
Figure 13.3 expands the generic process
for conducting evaluation studies to illustrate 
the stages involved in conducting a subjectiv- 
ist study and emphasizes the “iterative loop” 
of data collection, analysis and reflection as 
the major distinguishing characteristic of a 
subjectivist investigation. Another distinctive 
feature of subjectivist studies is an immersion 
in the environment where the resource has 
been or will be deployed. Because subjectiv- 
ist approaches may be less familiar to readers, 
we describe subjectivist studies in more detail 
than we did their objectivist counterparts. 


13.5.3.1 The Rationale
for Subjectivist Studies

Subjectivist methods enable us to address
the deeper questions that arise in informatics:
the detailed, individualistic “whys”
and “according to whoms” in addition to
the aggregate “whethers” and “whats.” 
Subjectivist approaches seek to represent the 
viewpoints of people who are users of the 
resource or are otherwise significant partici- 
pants in the environment where the resource 
operates. The goal is illumination rather than 
judgment. The investigators seek to build an 
argument that promotes deeper understand- 
ing of the information resource or environ- 
ment of which it is a part. The methods used 
derive largely from ethnography (Forsythe 
1992; Ventres et al. 2006; Pope et al. 2013). The 
investigators immerse themselves physically in 
the environment (the “field”) where the infor- 
mation resource is or will be operational, and 
collect data primarily through observations, 
interviews, or reviews of documents. The 
designs—or data-collection plans—of these 
studies are not rigidly predetermined and do 
not unfold in a fixed sequence. They develop 
dynamically and nonlinearly, as the investiga- 
tors’ experience in the field accumulates. 


13.5.3.2 A Rigorous, but Different, 
Methodology 

These subjectivist approaches to evaluation, 
like their objectivist counterparts, are empiri- 
cal methods. Although it is easy to focus 
only on their differences, these two broad 
classes of evaluation approaches share many 
features. In all empirical studies, for exam- 
ple, data are collected with great care; the 
investigators are always aware of what they 
are doing and why. The data are then com- 
piled, interpreted, and ultimately reported. 
Investigators keep records of their procedures,
and these records are open to audit by
the investigators themselves or by individuals 
outside the study team. The principal inves- 
tigator or evaluation-team leader is under an 
almost sacred scientific obligation to report 
the study methods. Failure to do so will inval- 
idate a study. Both classes of approaches also 
share a dependence on theories that guide 
investigators to explanations of the observed 
phenomena, as well as a dependence on the
pertinent literature such as published studies 
that address similar phenomena or similar 
settings. In both approaches, there are rules 
of good practice that are generally accepted; 
it is therefore possible to distinguish a “good” 
study from a bad one. 

There are, however, fundamental differ- 
ences between objectivist and subjectivist 
approaches. First, subjectivist studies are 
emergent in design. Objectivist studies typi- 
cally begin with a set of hypotheses or specific 
questions, and with a plan for addressing each 
member of this set. The investigator assumes 
that, barring major unforeseen developments, 
the plan will be followed exactly. Deviation 
would be seen as a potential source of bias. For 
example, an objectivist investigator who sees 
negative results emerging from the exploration 
of a particular question or use of a particular 
measurement instrument might be inclined to 
change strategies in hope of obtaining more 
positive findings. In contrast, subjectivist stud- 
ies typically begin with general orienting issues 
that stimulate the early stages of investiga- 
tion. Through these initial investigations, the 
important questions for further study emerge. 
The subjectivist investigator is willing, at virtu- 
ally any point, to adjust future aspects of the 
study in light of the most recent information 
obtained, while carefully recording what was
changed and why. Subjectivist investigators
tend to be incrementalists; they thoughtfully
change their plans as necessary from
day to day and have a high tolerance for ambiguity
and uncertainty. In this respect, they are
much like good software developers. Also like 
software developers, subjectivist investigators 
must develop the ability to recognize when a 
project is finished, when further benefit can 
be obtained only at too great a cost in time,
money, or work. 

A second feature of subjectivist studies 
is a naturalistic orientation: a reluctance to 
manipulate the setting of the study, which in 
informatics is typically the environment into 
which the information resource is introduced. 
Subjectivist studies do not alter the environ- 
ment to study it. Control groups, placebos, 
purposeful altering of information resources 
to create contrasting interventions, and other 
techniques that are central to the construction 
of objectivist studies typically are not used. 
Subjectivist studies will, however, employ 
quantitative data for descriptive purposes and 
may offer quantitative comparisons when the 
research setting offers a “natural experiment” 
where such comparisons can be made without 
deliberate manipulation. For example, when 
physicians and nurses both use a clinical sys- 
tem to enter orders, the differing experiences 
of the two professional groups offer a natural 
basis for comparison. Subjectivist researchers 
are opportunists where pertinent information 
is concerned; they will use what they see as 
the best information available to illuminate a 
question under investigation. 

A third important distinguishing feature 
of subjectivist studies is that their end prod- 
uct is a report written in narrative prose. 
While these reports may be lengthier than 
the statistical reports from objectivist studies, 
no technical understanding of quantitative 
research methodology is required to compre- 
hend them. Results of subjectivist studies are 
therefore accessible—and may even be enter- 
taining—to a broad community in a way that 
results of objectivist studies are not. Reports 
of subjectivist studies seek to engage their 
audience. 


13.5.3.3 Natural History 
of a Subjectivist Study 

Figure 13.3 illustrates the stages that characterize
a subjectivist study (see also Chap.
9 in Friedman and Wyatt 2005). These stages 
constitute a general sequence, but, as we men- 
tioned, subjectivist investigators must always 
be prepared to revise their thinking and pos- 
sibly to return to earlier stages in light of new 
data or insights resulting from its analysis.
Backtracking is a legitimate step in this model.

1. Negotiation of the ground rules of the study: 
The understanding between the study 
team and the persons commissioning a 
study should embrace the general aims of 
the study; the kinds of methods to be used; 
access to various sources of information, 
including health care providers, patients, 
and various documents; and the format 
for interim and final reports. The aims of 
the study may be formulated in a set of ini- 
tial orienting questions. Ideally, this under- 
standing will be expressed in a 
memorandum of understanding, analo- 
gous to a contract. 

2. Immersion into the environment: At this
stage, the investigators begin spending 
time in the work environment. Their activ- 
ities range from formal introductions to 
informal conversations, or to silent pres- 
ence at meetings and other events. 
Investigators use the generic term field to 
refer to the setting, which may be multiple 
physical locations, where the work under 
study is carried out. Trust and openness 
between the investigators and the people in 
the field are essential elements of subjec- 
tivist studies to ensure full and candid 
exchange of information. 

Even as immersion is taking place, 
the investigator is already collecting data 
to sharpen the initial questions or issues 
guiding the study. Early discussions with 
people in the field, and other activities pri- 
marily targeted toward immersion, inevita- 
bly begin to shape the investigators’ views. 
Almost from the outset, the investigator is 
typically addressing several aspects of the 
study simultaneously. 

3. Iterative loop: At this point, the procedural 
structure of the study becomes akin to an 
iterative loop, as the investigator engages 
in cycles of data collection, analysis and 
reflection, “member checking”, and reor- 
ganization. Data collection involves inter- 
view, observation, document analysis, and 
other methods. Data are collected on 
planned occasions, as well as serendipi- 
tously and spontaneously. The data are 
recorded carefully and are interpreted in
the context of what is already known.
Analysis and reflection entail the contem- 
plation of the new findings during each 
cycle of the loop. Member checking is the 
sharing of the investigators’ emerging
thoughts and beliefs with the participants 
themselves. Reorganization results in a 
revised agenda for data collection in the 
next cycle of the loop. 

Although each cycle within the itera- 
tive loop is depicted as unidirectional, this 
representation is misleading. Net progress 
through the loop is clockwise, as shown in 
Fig. 13.3, but backward steps are natural
and inevitable. They are not reflective
of mistakes or errors. An investigator may, 
after conducting a series of interviews and 
studying what participants have said, decide 
to speak again with multiple participants to 
clarify their positions on a particular issue.

4. Communicate results: Subjectivist studies
tend to have a multi-staged reporting and 
communication process. The first draft of 
the study report should itself be viewed as 
a research instrument. By sharing this 
report with a variety of individuals, the 
investigator obtains a major check on the 
validity of the findings. Typically, reac- 
tions to the preliminary report will gener- 
ate useful clarifications and a general 
sharpening of the study findings. Because 
the report usually includes a prose narra- 
tive, it is vitally important that it be well 
written in language understandable by all 
intended audiences. Circulation of the 
report in draft, for comments by the 
intended recipients, can ensure that the 
final document communicates as 
intended. Use of anonymous quotations 
from interviews and documents makes a 
report highly vivid and meaningful to 
readers. 

The final report, once completed, 
should be distributed as negotiated in the 
original memorandum of understanding. 
Distribution is often accompanied by 
“meet the investigator” sessions that allow 
interested persons to ask the author of the 
report to expand or explain what has been 
written. 
13.5.3.4 Subjectivist Data-Collection
and Data-Analysis Methods

What data-collection strategies are in the
subjectivist researcher’s tool kit? There are 
several, and they are typically used in combi- 
nation. We shall discuss each one, assuming a 
typical setting for a subjectivist study in bio- 
medical informatics: the introduction of an 
information resource into patient care activi- 
ties in a hospital. 


Observation

The investigators typically immerse them- 
selves into the setting under study in one of 
two ways. The investigator may act purely 
as a detached observer, becoming a trusted 
and unobtrusive feature of the environment 
but not a participant in the day-to-day work 
and thus reliant on multiple “informants” as 
sources of information. True to the natural- 
istic feature of this kind of study, great care 
is taken to diminish the possibility that the 
presence of the observer will skew any work 
activities or that the observer will be rejected 
outright by the team. An alternative approach 
is participant observation, where the investi- 
gator becomes a member of the work team. 
Participant observation is more difficult to 
engineer; it may require the investigator to 
have specialized training in the study domain. 
It is time consuming but can give the investiga- 
tor a more vivid impression of life in the work 
environment. During both kinds of observa- 
tion, data accrue continuously. These data are 
qualitative and may be of several varieties: 
statements by health care providers, patients, 
family members, administrative staff, and oth- 
ers; gestures and other nonverbal expressions 
of these same individuals; and characteristics 
of the physical setting that seem to affect the 
delivery of health care. 


Interviews

Subjectivist studies rely heavily on interviews. 
Formal interviews are occasions where both 
the investigator and interviewee are aware that 
the answers to questions are being recorded 
(on paper or digitally) for direct contribution 
to the evaluation study. Formal interviews vary 
in their degree of structure. At one extreme 
is the unstructured interview, where there 
are no predetermined questions. Between
the extremes is the semi-structured interview,
where the investigator specifies in advance a
set of topics that he/she would like to address,
but is flexible as to the order in which these
topics are addressed, and is open to discussion
of topics not on the pre-specified list. At
the other extreme is the structured interview, 
with a schedule of questions that are always 
presented in the same words and in the same 
order. In general, the unstructured and
semi-structured interviews are preferred in subjectivist
research. Informal interviews (spontaneous
discussions between the investigators
and members of a team that occur during
routine observation) are also part of the data
collection process. Informal interviews are 
invariably considered a source of important 
data. Group interviews, akin to focus groups, 
may also be employed (e.g., Haddow et al. 
2011). Group interviews are very efficient 
ways to reach large numbers of participants, 
but investigators should not assume that indi- 
vidual participants will express in a group set- 
ting the same sentiments they will express if 
interviewed one-on-one. 

Sampling also enters into the interview 
process. There are usually more participants 
to interview than resources to conduct the interviews.
Unlike in objectivist studies, where random 
sampling is a form of gold standard to inform 
statistical attributions of effects, subjectiv- 
ist studies employ more purposeful sampling 
strategies. Investigators might actively seek 
interviewees they suspect to have unique or 
particularly insightful or influential opinions. 
They might remain in more frequent contact 
with key informants who, for various reasons, 
have the most insight into what is happening. 


Document and Artifact Analysis

Every project produces a trail of papers and 
other artifacts. These include patient charts, 
the various versions of an information 
resource and its documentation, memoranda 
prepared by the project team, perhaps a car- 
toon hung on an office door. Unlike the day- 
to-day events of health care, these artifacts 
do not change once created or introduced. 
With appropriate permissions negotiated in 
advance, they can be examined retrospectively 
and referred to repeatedly, as necessary, over
the course of a study. Also included under 
this heading are unobtrusive measures, which 
are the records accrued as part of the routine 
use of the information resource. They include, 
for example, user log files of an information 
resource. Data from these measures are often 
quantifiable and analyzed quantitatively even 
though the overall study design is qualitative 
in nature. 
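
As an illustration of how an unobtrusive measure can be analyzed quantitatively, the sketch below counts order-entry events by professional role in a small log extract. The log format, column names, and entries are hypothetical; real systems record usage in their own formats.

```python
import csv
import io
from collections import Counter

# Hypothetical log extract; real log formats vary by system.
LOG_TEXT = """timestamp,role,action
2019-11-18T08:02:11,physician,order_entered
2019-11-18T08:05:40,nurse,order_entered
2019-11-18T08:07:03,physician,order_cancelled
2019-11-18T08:09:27,physician,order_entered
2019-11-18T08:15:52,nurse,chart_viewed
"""


def count_actions_by_role(log_text, action):
    """Count log entries for one action type, grouped by user role."""
    reader = csv.DictReader(io.StringIO(log_text))
    return Counter(row["role"] for row in reader if row["action"] == action)


orders = count_actions_by_role(LOG_TEXT, "order_entered")
for role, count in sorted(orders.items()):
    print(role, count)
```

Counts like these are descriptive. In a subjectivist study they would typically prompt further observation or interviews rather than stand as findings on their own.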


Anything Else That Seems Useful

Subjectivist investigators are supreme opportunists.
As questions of importance to a study
emerge, the investigators will collect any infor- 
mation that they perceive as bearing on these 
questions. This data collection could include 
clinical chart reviews, questionnaires, tests, 
simulated patients, and other methods more 
commonly associated with the objectivist 
approaches. 

When to end data collection is another 
challenge in otherwise open-ended subjectivist
studies. “Saturation” is an important principle
to help investigators know when to stop.
Stated simply, a data collection process is sat- 
urated when it becomes evident that, as more 
data are collected, no new findings or insights 
are emerging. 
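
The saturation principle can be expressed as a simple stopping rule, sketched below. The rule of declaring saturation after three consecutive sessions with no new theme is an assumption chosen for illustration; the text gives no numerical threshold, and in practice the decision rests on the investigators' judgment. The theme labels are invented.

```python
def saturation_point(themes_per_session, patience=3):
    """Return the 1-based session at which saturation is declared, or None.

    Saturation is declared once `patience` consecutive sessions have
    contributed no theme that was not already seen.
    """
    seen = set()
    sessions_without_new = 0
    for index, themes in enumerate(themes_per_session, start=1):
        new_themes = set(themes) - seen
        seen |= new_themes
        sessions_without_new = 0 if new_themes else sessions_without_new + 1
        if sessions_without_new >= patience:
            return index
    return None


# Hypothetical themes coded from six successive interviews.
sessions = [
    {"alert fatigue", "workarounds"},
    {"workarounds", "training gaps"},
    {"alert fatigue"},
    {"training gaps"},
    {"workarounds"},
    {"alert fatigue"},
]

print(saturation_point(sessions))
```

A rule like this only tracks whether coded themes repeat. It cannot tell whether the investigators have been asking the right people or the right questions.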


Analysis of Subjectivist Data

There are many alternative procedures for 
analysis of qualitative data. In general terms, 
the investigator looks for insights, themes or 
trends emerging from several different sources. 
He/she collates individual statements and 
observations by theme, as well as by source. 
Investigators typically use software especially 
designed to facilitate analysis of qualitative 
data.[6] Because they allow electronic recording
of the data while the investigator is “in the
field”, tablets, smartphone apps and other
hand-held devices are changing the way
subjectivist research is carried out.

[6] Examples include: Atlas.ti: https://atlasti.com/
(Accessed November 18, 2019) and NVivo:
https://www.qsrinternational.com/nvivo/home
(Accessed November 18, 2019).

The subjectivist analysis process is fluid, 
with analytic goals shifting as the study 
matures. At an early stage, the goal is primar- 
ily to focus the questions that themselves will 
be the targets of further data elicitation. At 
the later stages of study, the primary goal 
is to organize data that address these ques- 
tions into specific themes, interpretations, and 
explanations. Conclusions derive credibility 
from a process of “triangulation”, which is the 
degree to which information from different 
independent sources generates the same theme
or points to the same conclusion. Subjectivist
analysis also employs a strategy known as 
“member checking” whereby investigators 
take preliminary conclusions back to the per- 
sons in the setting under study, asking if these 
conclusions make sense, and if not, why not. 
In subjectivist investigation, unlike objectivist 
studies, the agenda is never completely closed. 
The investigator is constantly on the alert for 
new information that can require a significant 
reorganization of the findings and conclu- 
sions that have been drawn to date. 
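
Collating coded statements by theme and by source, and then checking which themes are supported by more than one kind of source, can be sketched as follows. The themes, source types, and the threshold of two independent sources are assumptions for illustration. Dedicated qualitative analysis software offers far richer support, and a count of sources is no substitute for the investigator's interpretation.

```python
from collections import defaultdict


def sources_per_theme(coded_items):
    """Map each theme to the set of source types in which it appeared."""
    by_theme = defaultdict(set)
    for theme, source_type in coded_items:
        by_theme[theme].add(source_type)
    return dict(by_theme)


def triangulated_themes(coded_items, min_sources=2):
    """Themes supported by at least `min_sources` distinct source types."""
    grouped = sources_per_theme(coded_items)
    return sorted(t for t, srcs in grouped.items() if len(srcs) >= min_sources)


# Hypothetical coded items: (theme, type of source).
coded_items = [
    ("alert fatigue", "interview"),
    ("alert fatigue", "observation"),
    ("alert fatigue", "log file"),
    ("slow login", "interview"),
    ("slow login", "interview"),
]

print(triangulated_themes(coded_items))
```

Here "slow login" appears twice but only in interviews, so it is not treated as triangulated. It would be a natural topic to raise during member checking or to look for in later observation.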


13.6 Communicating Evaluation 
Results 


Once any study, qualitative or quantitative, 
is complete, the results need to be commu- 
nicated to the stakeholders and others who 
might be interested. In many ways, commu- 
nication of evaluation results, a term we pre- 
fer over “reporting”, is the most challenging 
aspect of evaluation. Elementary theory tells 
us that, in general, successful communication 
requires a sender, one or more recipients, and 
a channel linking them, along with a mes- 
sage that travels along this channel (Ong and 
Coiera 2011). 

Seen from this perspective, successful 
communication of evaluation results is chal- 
lenging in several respects. It requires that 
the recipient of the message actually receive 
it. That is, for evaluations, the recipient must 
read the written report or attend the meet- 
ing intended to convey evaluation results. 
For this reason, the investigator is invariably
challenged to create a report the stakeholders 
will want to read or to choreograph a meeting 
they will be motivated to attend. Successful 
communication also requires that the recipi- 
ent understand the message, which challenges 
investigators to draft written documents at the 
right reading level, with audience-appropriate 
technical detail. Sometimes there must be sev- 
eral different forms of the written report to 
match several different audiences. Overall, we 
encourage investigators to recognize that their 
obligation to communicate does not end with 
the submission of a written document com- 
prising their technical evaluation report. The 
report is one channel for communication, not 
an end in itself. 

Depending on the nature, number, and 
location of the recipients—and permissions 
which have been obtained or written into 
evaluation agreements--many options exist 
for communicating the results of a study, 
including: 
- Written reports
  - Document(s) prepared for specific audience(s)
  - Internal newsletter article
  - Published journal article, with appropriate permissions
  - Monograph, picture album, or book
- One-to-one or small group meetings
  - With stakeholders or specific stakeholder groups
  - With the general public, if appropriate
- Formal oral presentations
  - To groups of project stakeholders
  - Conference presentation with a poster or published paper in proceedings
  - To external meetings or seminars
- Internet
  - Project Web site or blog
  - Web “chat”, forum or Twitter feed to socialize results
  - Online preprint
  - Internet based journal
- Other
  - Video or podcast describing the study and information resource
  - Interview with a journalist on newspaper, TV, radio

A written, textual report is not the sole
medium for communicating evaluation results. 
Verbal, graphical, or multimedia approaches 
can be helpful as ways to enhance communi- 
cation with specific audiences. Another useful 
strategy is to hold a “town meeting” to discuss 
a traditional written report after it has been 
released. Photographs or videos can portray 
the work setting for a study, the people in the 
setting, and the people using the resource. If 
appropriate permissions are obtained, these 
images—whether included as part of a writ- 
ten report, shown at a town meeting, or placed 
on a Web site—can be worth many thousands 
of words. The same may be true for recorded 
statements of resource users. If made avail- 
able, with permission, as part of a multime- 
dia report, the voices of the participants can 
convey a feeling behind the words that can 
enhance the credibility of the investigator’s 
conclusions (Fig. 13.4).

In addition to the varying formats for 
communication described above, investigators 
have other decisions to make after the data 
collection and analysis phases of a study are 
complete. One key decision is what personal 
role they will adopt after the formal investiga- 
tive aspects of the work are complete. They 
may elect only to communicate the results, but 
they may also choose to persuade stakehold- 
ers to take specific actions in response to the 
study results, and perhaps even assist in the 
implementation of these actions. This raises a 
key question: Is the role of an evaluator sim- 
ply to record and communicate study findings 
and then to move on to the next study, or is 
it also to engage with the study stakeholders 
and help them change how they work as a 
result of the study? 

To answer this question about the role of 
an evaluator, we need to understand that an 
evaluation study, particularly a successful one, 
has the potential to trigger a series of events, 
starting with the communication of study 
results, but then including interpretation, 
recommendation, and even implementation. 
Some evaluators—perhaps enthused by the 
clarity of their results and an opportunity to 
use them to improve health care, biomedical 
research, or education—prefer to go beyond 
reporting the results and conclusions to making
recommendations.

Fig. 13.4 A picture is worth 1000 words: in the
report of a study to establish the need for an electronic
patient record, a casual photograph like this may prove
much more persuasive than a table of data or paragraphs
of prose

The dilemma often
faced by evaluators is whether to retain their 
scientific detachment and merely report the 
study results, or to stay engaged somewhat 
longer. Investigators who choose to remain 
may become engaged in helping the stake- 
holders interpret what the results mean, guid- 
ing them in reaching decisions and perhaps 
even in implementing the actions decided 
upon. The longer they stay, the greater the 
extent to which evaluators must leave behind 
their scientific detachment and take on a 
role more commonly associated with change 
agents (Lunenburg 2010). Some confounding 
of these roles is inevitable when the evalua- 
tion is performed by individuals within the 
organization that developed the information
resource under study. There is no hard-and-fast
rule for deciding on the most appropriate
role for the evaluator; the most important ini- 
tial realization for investigators is that the dif- 
ferent options exist and that a decision among 
them must inevitably be made. 


13.7 Conclusion: Evaluation 
as an Ethical and Scientific 
Imperative 


Evaluation takes place, either formally or infor- 
mally, throughout the resource development 
cycle: from defining the need to monitoring 
the continuing impact of a resource once it is 
deployed (Stead et al. 1994). We have seen in 
this chapter that different issues are explored, 
at different degrees of intensity, at each stage of 
resource development. For meaningful evaluation
to occur, adequate amounts of each must be allocated
to these studies when time and money are
budgeted for a development effort. Evaluation 
cannot be left to the end of a project. While 
formal evaluations, as we have described them 
here, are still seen as optional for resources of 
the types that are the foci of biomedical and 
health informatics, the increasing complexity 
and prevalence of these resources have raised 
concerns about their safety and effectiveness 
when used in the real world (e.g., Koppel et al. 
2005). For the moment, we would argue that 
formal evaluations, using the range of methods 
described in this chapter, are mandated by the 
professional ethics of biomedical informatics as 
an applied scientific discipline (see Chap. 12).

Formal evaluations of biomedical infor- 
mation resources may someday be a statu- 
tory or regulatory requirement in many or all 
parts of the world, as they are already for new 
drugs or medical devices. If and when that 
day comes, the wide variety of questions to 
be addressed and the diversity of legitimate 
methods available to address those questions, 
as described in this chapter, will make it dif- 
ficult to describe with exactitude how these 
studies should be done. There have been some 
published academic checklists or guidelines
describing things to study and report in such
studies (Talmon et al. 2009), but this is a 
bridge to be crossed in the future. We express 
the hope that writers of such guidelines and 
regulations will not overprescribe the methods 
to be used, while insisting on rigor in draw- 
ing conclusions from data collected using 
study designs thoughtfully matched to care- 
fully identified questions. We hope the reader 
has learned from this chapter that rigor in 
evaluation is achievable in many ways, that 
information resources raise unique challenges 
when they are the foci for evaluation, and that 
overly rigid prescription of evaluation methods,
however well intentioned, could defeat
its own purpose. However, it
is also clear that the intensity of the evalua- 
tion effort should be closely matched to the 
resource’s maturity (Stead et al. 1994). The 
UK Medical Research Council’s Framework
for Complex Interventions (Campbell et al.
2000) and a more recent variation intended
for digital interventions (Murray et al. 2016)
both point out that it is unwise to conduct an
expensive user-effect field trial of an informa- 
tion resource that is barely complete, is still 
in prototype form, may evolve considerably 
before taking its final shape, or is so early in its 
development that it may fail because program- 
ming bugs have not been eliminated. 

We believe that readers of this chapter will 
to varying degrees be critical appraisers of, 
participants in, and/or conductors of evalu- 
ation studies. In playing any or all of these 
roles, it is important to recognize that evalu- 
ation sits at the junction where the art of the 
possible, given the complexity of informat- 
ics interventions, meets the rigor of scientific 
method drawn from the objectivist and subjectivist
traditions.

Appendices


Appendix A: Two Evaluation 
Scenarios 


Here we introduce two scenarios that collec- 
tively capture many of the dilemmas facing 
those planning and conducting evaluations in 
biomedical informatics: 

1. A prototype information resource has 
been developed, but its usability and 
potential for benefit need to be assessed 
prior to deployment; 

2. A commercial resource has been deployed 
across a large enterprise, and there is need 
to understand its impact on users as well 
as on the organization. 


These scenarios do not address the full scope 
of evaluations in biomedical informatics, but 
they cover much of what evaluators do in practice. For each,
we introduce sets of evaluation questions that 
frequently arise and examine the dilemmas 
that investigators face in the design and execu- 
tion of evaluation studies. 


Scenario 1: A Prototype Information
Resource Has Been Developed, but Its
Usability and Potential for Benefit Need
to Be Assessed Prior to Deployment
The primary evaluation issue here is the 
upcoming decision to continue with the 
development of the prototype informa- 
tion resource. Validation of the design and 
structure of the resource will have been con- 
ducted, either formally or informally, but not 
yet a usability study. If this looks promising, 
a laboratory evaluation of key functions is 
also advised before making the substantial 
investment required to turn a promising 
prototype into a system that is stable and 
likely to bring more benefits than problems 
to users in the field. Here, typical questions 
will include: 
• Who are the target users, and what are
  their background skills and knowledge?
• Does the resource make sense to target
  users?
• Following a brief introduction, can
  target users navigate themselves around
  important parts of the resource?
• Can target users carry out a selection
  of relevant tasks using the resource, in
  reasonable time and with reasonable
  accuracy?
• What user characteristics correlate with
  the ability to use the resource and achieve
  fast, accurate performance with it?
• What other kinds of people can use it
  safely?
• How can the layout, design, wording,
  menus, etc. be improved?
• Is there a long learning curve? What user
  training needs are there?
• How much ongoing help will users
  require once they are initially trained?
• What concerns do users have about the
  system (e.g., accuracy, privacy, effect on
  their jobs, other side effects)?
• Based on the performance of prototypes
  in users’ hands, does the resource have the
  potential to meet user needs?


These questions fall within the scope of
the usability and laboratory function testing
approaches listed in Table 15.1. A
wide range of techniques, borrowed from
the human-computer interaction field and
employing both objectivist and subjectivist
approaches, can be used, including:

• Seeking the views of potential users after
  both a demonstration of the resource and
  a hands-on exploration. Methods such as
  focus groups may be very useful to
  identify not only immediate problems
  with the software and how it might be
  improved, but also potential broader
  concerns and unexpected issues that may
  include user privacy and long-term issues
  around user training and working
  relationships.
• Studying users while they carry out a list
  of pre-designed tasks using the information
  resource. Methods for studying users
  include watching over their shoulder;
  video observation (sometimes with several
  video cameras per user); think-aloud
  protocols (asking the user to verbalize
  their impressions as they navigate and use
  the system); and automatic logging of
  keystrokes, navigation paths, and time to
  complete tasks.
• Use of validated questionnaires to capture
  user impressions, often before and after an
  experience with the system, one example
  being the Telemedicine Preparedness
  questionnaire (Demiris et al. 2000).
• Specific techniques to explore how users
  might improve the layout or design of the
  software. For example, to help understand
  what users think of as a “logical” menu
  structure for an information resource,
  investigators can use a card sorting
  technique. This entails listing each function
  available on all the menus on a separate
  card and then asking users to sort these
  cards into several piles according to which
  function seems to go with which
  (www.useit.com).


Depending on the aim of a usability study, 
it may suffice to employ a small number 
of potential users. Nielsen has shown that, 
if the aim is to identify only major soft- 
ware faults, the proportion identified rises 
quickly up to about 5 or 6 users then much 
more slowly to plateau at about 15-20 users 
(Nielsen 1994). Five users will often identify 
80% of software problems. However, investi- 
gators conducting such small studies, useful 
though they may be for software develop- 
ment, cannot then expect to publish them 
in a scientific journal. The achievement in 
this case is having found answers to a very 
specific question about a specific software 
prototype. This kind of local reality test is 
unlikely to appeal to the editors or readers 
of a journal. By contrast, the results of for- 
mal laboratory function studies, that typi- 
cally employ more users, are more amenable 
to journal publication. 
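
The diminishing returns described above can be sketched with the problem-discovery model commonly attributed to Nielsen and Landauer, in which the expected proportion of problems found by n users is 1 - (1 - p)^n. The function name and the per-user detection probability p = 0.31 are assumptions made for illustration; they are not figures from this chapter, and real values vary from project to project.

```python
def proportion_found(n_users: int, p_detect: float = 0.31) -> float:
    """Expected share of usability problems found by n_users testers.

    Assumes each problem is detected independently by every user with the
    same probability p_detect (an illustrative simplification).
    """
    if n_users < 0 or not 0.0 <= p_detect <= 1.0:
        raise ValueError("n_users must be >= 0 and p_detect must be in [0, 1]")
    return 1.0 - (1.0 - p_detect) ** n_users


if __name__ == "__main__":
    # Show how quickly the curve flattens as more users are added
    for n in (1, 3, 5, 10, 15, 20):
        print(f"{n:2d} users: {proportion_found(n):.0%} of problems expected")
```

Under these assumptions the curve rises steeply for the first handful of users and flattens thereafter, which is the pattern the text describes; a lower assumed p would shift the plateau toward larger samples.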


Scenario 2: A Commercial Resource Has
Been Deployed Across a Large Enterprise,
and There Is Need to Understand Its
Impact on Users as Well as on the Organization

The types of evaluation questions that arise
here include:

• On what fraction of the occasions when the
  resource could have been used was it
  actually used?
• Who uses it, why, are these the intended
  users, and are they satisfied with it?
• Does using the resource influence
  information/communication flows?
• Does using the resource influence their
  knowledge or skills?
• Does using the resource improve their
  work?
• For clinical information resources, does
  using the resource change outcomes for
  patients?
• How does the resource influence the whole
  organization and relevant subunits?
• Do the overall benefits and costs or risks
  differ for specific groups of users,
  departments, or the whole organization?
• How much does the resource really cost
  the organization?
• Should the organization keep the resource
  as it is, improve it, or replace it?
• How can the resource be improved, at
  what cost, and what benefits would result?


To each of the above questions, one can add: 
“Why, or why not?”, to get a broader under- 
standing of what is happening as a result of 
use of the resource. 

This evaluation scenario, suggesting a 
problem impact study, is often what people 
think of first when the concept of evalua- 
tion is introduced. However, we have seen in 
this chapter that it is one of many evalua- 
tion scenarios, arising relatively late in the life 
cycle of an information resource. When these 
impact-oriented evaluations are undertaken, 
they usually result from a realization by stake- 
holders, who have invested significantly in an 
information resource, that the benefits of the 
resource are uncertain and there is need to jus- 
tify recurring costs. These stakeholders usu- 
ally vary in the kind of evaluation methods 
that will convince them about the impacts that 
the resource is or is not having. Many such 
stakeholders will wish to see quantified indices
of benefits or harms from the resource, for
example the number of users and daily uses, 
the amount the resource improves productiv- 
ity or reduces costs, or perhaps other benefits 
such as reduced waiting times to perform key 
tasks or procedures, lengths of hospital stay 
or occurrence of adverse events. Such data are 
collected through objectivist studies as dis- 
cussed earlier. Other stakeholders may prefer 
to see evidence of perceived benefit and posi- 
tive views of staff, in which case staff surveys, 
focus groups and unstructured interviews may 
prove the best evaluation methods. Often, a 
combination of many methods is necessary 
to extend the investigation from understand- 
ing what impact the resource has to why this 
impact occurs, or fails to occur.

If the investigator is pursuing objectiv- 
ist methods, deciding which of the possible 
effect variables to include in an impact study 
and developing ways to measure them can be 
the most challenging aspect of an evaluation 
study design. (These and related issues receive 
the attention of five full chapters of a textbook 
by the authors of this chapter (Friedman and 
Wyatt 2005).) Investigators usually wish to 
limit the number of effect measures employed 
in a study for many reasons: limited evalua- 
tion resources, to minimize manipulation of 
the practice environment, and to avoid sta- 
tistical analytical problems that result from a 
large number of measures. 

Effect or impact studies can also use sub- 
jectivist approaches to allow the most relevant 
“effect” issues to emerge over time and with 
increasingly deep immersion into the study 
environment. This emergent feature of sub- 
jectivist work obviates the need to decide in 
advance which effect variables to explore, and 
is considered by proponents of subjectivist 
approaches to be among their major advan- 
tages. 

In health care particularly, every interven- 
tion carries some risk, which must be judged 
in comparison to the risks of doing nothing 
or of providing an alternative intervention. It 
is difficult to decide whether an information 
resource is an improvement unless the perfor- 
mance of the current decision-takers is also 
measured in a comparison-based evaluation. 
For example, if physicians’ decisions are to
become more accurate following introduction
of a decision-support tool, the resource needs
to be “right” when the user would usually
be “wrong.” This could mean that the tool’s
error rate is lower than that of the physician,
that its errors occur in different cases, or that
its errors are of a different kind or less serious
than those of the clinician. Otherwise, the tool
may introduce new errors when the clinician
follows its advice even though that advice
is incorrect, a phenomenon known as
“automation bias” (Goddard et al. 2012).

For effect studies, it is often important to 
know something about how the practitioners 
carry out their work prior to the introduction 
of the information resource. Suitable measures 
include the accuracy, timing, and confidence 
level of their decisions and the amount of 
information they require before making a deci- 
sion. Although data for such a study can some- 
times be collected by using abstracts of cases 
or problems in a laboratory setting (Fig.
15.2), these studies inevitably raise questions
of generalization to the real world. We observe 
here one of many trade-offs that occur in the 
design of evaluation studies. Although control 
over the mix of cases possible in a laboratory 
study can lead to a more precise estimate of 
practitioner decision making, ultimately it 
may prove better to conduct a baseline study 
while the individuals are doing real work in a 
real practice setting. Often this audit of current 
decisions and actions provides useful input to 
the design of the information resource, and a 
reference against which resource performance 
may later be compared. 

When conducting problem impact stud- 
ies in health care settings, investigators can 
sometimes save themselves much time and 
effort without sacrificing validity by measur- 
ing effect in terms of certain health care pro- 
cesses rather than patient outcomes, in other 
words by employing a user effect study as a 
proxy for a problem impact study. For exam- 
ple, measuring the mortality or complication 
rate in patients with heart attacks requires 
data collection from hundreds of patients, as 
complications and death are (fortunately) rare 
events. However, as long as large, rigorous 
trials or meta-analyses have determined that 
a certain procedure (e.g., giving heart attack 


patients streptokinase within 24 h) correlates
closely with the desired patient outcome, it is
perfectly valid to measure the rate of performing
this procedure as a “surrogate” for
the desired outcome. Mant and Hicks demonstrated
that measuring the quality of care
by quantifying a key process in this way may 
require one tenth as many patients as measur- 
ing outcomes (Mant and Hicks 1995). 
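
The sample-size advantage of a process measure can be illustrated with the standard normal-approximation formula for comparing two independent proportions. This is a minimal sketch: the function name `n_per_group`, the significance level, the power, and all event rates below are hypothetical values chosen for illustration, and they are not intended to reproduce the figures reported by Mant and Hicks.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate patients needed per group to detect p1 versus p2.

    Uses the normal-approximation formula for a two-sided comparison of
    two independent proportions.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical rates, chosen only for illustration
outcome_n = n_per_group(0.10, 0.08)  # a rare outcome: 10% versus 8%
process_n = n_per_group(0.60, 0.75)  # a common process: 60% versus 75%

print(f"outcome measure: {outcome_n} patients per group")
print(f"process measure: {process_n} patients per group")
```

Because the process event is common and the assumed difference between groups is larger, far fewer patients are needed to detect it than to detect a small change in a rare outcome.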


Appendix B: Exemplary Evaluation 
Studies 


In this appendix, we briefly summarize stud- 
ies that align with many of the study types 
described in Tables 13.1 and 13.2.


Usability Study: Assessing Performance of an
Electronic Health Record Using Cognitive Task
Analysis.

Saitwal et al. (2010) is a pure usability test- 
ing study that evaluates the Armed Forces 
Health Longitudinal Technology Application 
EHR using a cognitive task analysis approach, 
referred to as Goals, Operators, Methods, and 
Selection rules (GOMS). Specifically, authors 
evaluated the system response time and the 
complexity of the graphical user interface 
(GUI) when completing a set of 14 prototypi- 
cal tasks using the EHR. Authors paid spe- 
cial attention to inter-rater reliability of the 
two evaluators using GOMS to analyze the 
GUI of the system through task completion. 
Each task was broken down into a series of 
steps, with the intent to determine the per- 
cent of steps classified as “mental operators”. 
Execution time was then calculated for each 
step and summed to obtain a total time for 
task completion. 
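
The calculation described here (break a task into steps, determine the share of steps that are mental operators, and sum the per-step times) can be sketched as follows. The operator codes and durations are assumptions loosely based on commonly quoted Keystroke-Level Model estimates, and the example task is hypothetical; none of these values are taken from Saitwal et al.

```python
# Assumed operator durations in seconds (illustrative, not from the study)
OPERATOR_SECONDS = {
    "K": 0.2,   # press a key or button
    "P": 1.1,   # point with the mouse
    "H": 0.4,   # move hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}
MENTAL_OPERATORS = {"M"}


def analyze_task(steps: list[str]) -> tuple[float, float]:
    """Return (total execution time in seconds, percent of mental steps)."""
    if not steps:
        raise ValueError("a task needs at least one step")
    total = sum(OPERATOR_SECONDS[step] for step in steps)
    mental = sum(1 for step in steps if step in MENTAL_OPERATORS)
    return total, 100.0 * mental / len(steps)


# Hypothetical task: think, point at a field, click, think, move to the
# keyboard, then type four characters
task = ["M", "P", "K", "M", "H", "K", "K", "K", "K"]
total_seconds, percent_mental = analyze_task(task)
print(f"total time: {total_seconds:.2f} s, mental steps: {percent_mental:.0f}%")
```

Repeating this for each prototypical task gives a per-task time estimate and a rough indicator of cognitive load that can be compared across interface designs.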


Lab Function Study: Diagnostic inaccuracy of
smartphone applications for melanoma
detection.

Wolf et al. (2013) conducted an evaluation 
study of smartphone applications capable 
of detecting melanoma and sought to deter- 
mine the diagnostic inaccuracy. The study is 
exemplary of a lab function study and complements
the Beaudoin et al. (2016) study
described below because study authors paid
special attention to measuring application 
function in a lab setting using digital clinical 
images with a previous diagnosis obtained via 
histologic analysis by a dermatopathologist. 
Authors employed a comparative analysis 
between four different smartphone applica- 
tions and assessed the sensitivity, positive pre- 
dictive value, and negative predictive value of 
each application compared to histologic diag- 
nosis. Rather than focus on the function in a 
real health care setting with real users, authors 
were interested in facilitating decision-making 
as to which applications performed best under 
controlled conditions. 
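
The three measures named above can be computed from a 2x2 table that cross-classifies an application's output against the histologic reference diagnosis. The sketch below uses a hypothetical function name and hypothetical counts; it does not use data from Wolf et al.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Sensitivity, PPV, and NPV from counts in a 2x2 table.

    tp/fn: reference-positive cases the tool flagged / missed
    fp/tn: reference-negative cases the tool flagged / cleared
    """
    if min(tp + fn, tp + fp, tn + fn) == 0:
        raise ValueError("each metric needs a non-zero denominator")
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "ppv": tp / (tp + fp),          # share of positive calls that are correct
        "npv": tn / (tn + fn),          # share of negative calls that are correct
    }


# Hypothetical counts for one application
metrics = diagnostic_metrics(tp=42, fp=30, fn=18, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Note that the predictive values depend on the proportion of true cases in the image set, so values obtained under controlled conditions may not carry over to a setting with a different case mix.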


Field Function Study: Evaluation of a machine
learning capability for a clinical decision support
system to enhance antimicrobial stewardship
programs.

Beaudoin et al. (2016) conducted an 
observational study to evaluate the func- 
tion of a combined clinical decision support 
system (antimicrobial prescription surveil- 
lance system (APSS)) and a learning module 
for antimicrobial stewardship pharmacists 
in a Canadian university hospital system. 
Authors developed a rule-based machine 
learning module designed from expert phar- 
macist recommendations which triggers alerts 
for inappropriate prescribing of piperacil- 
lin-tazobactam. The combined system was 
deployed to pharmacists and outputs were 
studied prospectively over a five-week period 
within the hospital system. Analyses assessed 
accuracy, positive predictive value, and sensi- 
tivity of the combined system, the individual 
learning module, and the APSS compared 
to the pharmacist opinion. This is an exem- 
plary field function study because authors are 
evaluating the ability of the combined rule- 
based learning module and APSS to detect 
inappropriate prescribing in the field with real 
patients. 


Lab User Effect Study: Applying human factors
principles to alert design increases efficiency and
reduces prescribing errors in a scenario-based
simulation.

Russ et al. (2014) describe a study evaluating
the redesign of alerts using human factors
principles and their influence on prescribing
by providers. The study is exemplary of a 
lab user effect study because it analyzed fre- 
quency of prescribing errors by providers, 
and it was conducted in a simulated environ- 
ment (the Human-Computer Interaction and 
Simulation Laboratory in a Veterans Affairs 
Medical Center). Authors were particularly 
interested in three types of alerts: drug-drug
interaction, drug-allergy, and drug-disease.
Three scenarios were developed for this study
that included 19 possible alerts. These alerts
were intended to include both familiar and
unfamiliar alerts for prescribers. Authors used a crossover
design with a two-week “washout period” 
for participants to complete both original 
and redesigned alerts to reduce contamina- 
tion in repeated measures. Special attention 
was paid to a repeated measures comparative 
analysis of the influence of original versus 
redesigned alerts on outcomes of perceived 
workload and prescribing errors. Authors 
also employed elements of usability testing 
during this study, such as assessing learn- 
ability, efficiency, satisfaction and usability 
errors. 


Field User Effect Study: Reminders to physicians
from an introspective computer medical record:
A two-year randomized trial.

McDonald et al. (1984) conducted a two- 
year randomized controlled trial to evaluate 
the effects of a computer-stored medical 
record system which reminds physicians 
about actions needed for patients prior to a 
patient encounter. This study most closely 
aligns with a field user effect study for the 
attention to behavior change in preven- 
tive care delivery associated with use of 
the information resource, and is exemplary 
because its rigorous design accounts for 
the hierarchical nature of clinicians work- 
ing in teams without having to manipulate 
the practice environment. Randomization 
occurs at the clustered team level and analy- 
ses were performed at both the cluster and 
individual levels. The study did include 
problem impact metrics; however, no significant
changes were observed in these outcomes
during the study.


Field User Effect Study: Electronic health
records and health care quality over time in a
federally qualified health center.

Kern et al. (2015) conducted a three-year 
comparative study across six sites of a fed- 
erally qualified health center in New York 
to analyze the association between implementation
of an electronic health record
(EHR) and quality of care delivery as mea- 
sured by change in compliance with Stage 1 
Meaningful Use quality measures. This study 
is an exemplary field user effect study for its 
attention to measures of clinician behav- 
ior in care delivery through test/screening 
ordering using the EHR and explicit use of 
statistical analysis techniques to account for 
repeated measures on patients over time. The 
study also includes two problem impact metrics
(change in HbA1c and LDL cholesterol)
analyzed over the study period; however, the 
study intent was primarily focused on clini- 
cian ordering behavior. 


Problem Impact Study: Effects of a mobile
phone short message service on antiretroviral
treatment adherence in Kenya (WelTel Kenya1):
A randomised trial.

Lester et al. (2010) is an exemplar for 
problem impact studies. Authors conducted 
a randomized controlled trial to measure 
improvement in patient adherence to anti- 
retroviral therapy (ART) and suppression of 
viral load following receipt of mobile phone 
communications with health care workers. The 
study randomized patients to the intervention 
group (receiving mobile phone messages from 
healthcare workers) or to the control group 
(standard care). Outcomes were clearly identi- 
fied and focused on behavioral effects (drug 
adherence) and an overall intent to measure 
the extent that improvements in adherence 
influenced patient health status (viral load). 
The special attention to randomization and 
use of effect size metrics for analysis are 
critical components to measuring the overall 
impact of mobile phone communications on 
patient health. 


Suggested Reading
Ammenwerth, E., & Rigby, M. (Eds.). (2016).
Amsterdam: IOS Press. This work includes an
extensive exploration of evaluation methods 
pertinent to health informatics. 

Anderson, J. G., & Aydin, C. E. (2005). Evaluating 
the organizational impact of health care infor- 
mation systems. New York: Springer. This is 
an excellent edited volume that covers a wide 
range of methodological and substantive 
approaches to evaluation in informatics. 

Brender, J. (2006). Handbook for evaluation for 
health informatics. Burlington: Elsevier 
Academic Press. Along with the Friedman and 
Wyatt text cited below, one of few textbooks 
available that focuses on evaluation in health 
informatics. 

Cohen, P. R. (1995). Empirical methods for artifi- 
cial intelligence. Cambridge, MA: MIT Press. 
This is a nicely written, detailed book that is 
focused on evaluation of artificial intelligence 
applications, not necessarily those operating 
in medical domains. It emphasizes objectivist 
methods and could serve as a basic statistics 
course for computer science students. 

Fink, A. (2004). Evaluation fundamentals: Insights 
into the outcomes, effectiveness, and quality of 
health programs (2nd ed.). Thousand Oaks: 
Sage Publications. A popular text that dis- 
cusses evaluation in the general domain of 
health. 

Friedman, C. P., & Wyatt, J. C. (2006). Evaluation 
methods in biomedical informatics. New York: 
Springer. This is the book on which the cur- 
rent chapter is based. It offers expanded dis- 
cussion of almost all issues and concepts 
raised in the current chapter. 

Jain, R. (1991). The art of computer systems per- 
formance analysis: Techniques for experimental 
design, measurement, simulation, and model- 
ling. New York: Wiley. This work offers a tech- 
nical discussion of a range of objectivist 
methods used to study computer systems. The 
scope is broader than Cohen’s book (1995) 
described earlier. It contains many case stud- 
ies and examples and assumes knowledge of 
basic statistics. 

Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic 
inquiry. Thousand Oaks: Sage Publications. 
This is a classic book on subjectivist methods. 
The work is very rigorous but also very easy to 
read. Because it does not focus on medical
domains or information systems, readers must
make their own extrapolations. 

Rossi, P. H., Lipsey, M. W., & Freeman, H. E. 
(2004). Evaluation: A systematic approach (7th
ed.). Thousand Oaks: Sage Publications. This 
is a valuable textbook on evaluation, empha- 
sizing objectivist methods, and is very well 
written. It is generic in scope, and the reader 
must relate the content to biomedical infor- 
matics. There are several excellent chapters 
addressing pragmatic issues of evaluation. 
These nicely complement the chapters on sta- 
tistics and formal study designs. 


Questions for Discussion
1. Associate each of the following hypo- 
thetical evaluation scenarios with one 
or more of the nine types of studies 
listed in Table 13.1. Note that some
scenarios may include more than one 
type of study. 

(a) An order communication system 
is implemented in a small hospital. 
Changes in laboratory workload 
are assessed. 

(b) The developers of the order commu- 
nication system recruit five potential 
users to help them assess how read- 
ily each of the main functions can be 
accessed from the opening screen and 
how long it takes users to complete 
them. 

(c) A study team performs a thorough 
analysis of the information required 
by psychiatrists to whom patients 
are referred by a community social 
worker. 

(d) A biomedical informatics expert is 
asked for her opinion about a PhD 
project on a new bioinformatics 
algorithm. She requests copies of 
the student’s code and documenta- 
tion for review. 

(e) A new intensive care unit system 
is implemented alongside manual 
paper charting for a month. At the 
end of this time, the quality of the 
computer-derived data and data 
recorded on the paper charts is compared.
A panel of intensive care
experts is asked to identify, inde- 
pendently, episodes of hypotension 
from each data set. 

(f) A biomedical informatics professor 
is invited to join the steering group 
for a series of apps to support peo- 
ple living with diabetes. The only 
documentation available to critique 
at the first meeting is a statement of 
the project goal, description of the 
planned development method, and 
the advertisements and job descrip- 
tions for team members. 

(g) Developers invite educationalists to 
test a prototype of a computer-aided 
learning system as part of a user-
centered design workshop.

(h) A program is devised that generates 
a predicted 24-h blood glucose pro- 
file using seven clinical parameters. 
Another program uses this profile and 
other patient data to advise on insu- 
lin dosages. Diabetologists are asked 
to prescribe insulin for a series of 
“paper patients” given the 24-h profile 
alone, and then again after seeing the 
computer-generated advice. They are 
also asked their opinion of the advice. 

(i) A program to generate alerts to pre- 
vent drug interactions is installed in 
a geriatric clinic that already has a 
computer-based medical record 
system. Rates of clinically signifi- 
cant drug interactions are com- 
pared before and after installation 
of the alerting program. 


2. Choose any alternative area of biomedicine
(e.g., drug trials) as a point of
comparison, and list at least four fac- 
tors that make studies in biomedical 
informatics more difficult to conduct 
successfully than in that area. Given 
these difficulties, discuss whether it 
is worthwhile to conduct empirical 
studies in biomedical informatics or 
whether we should use intuition or the
marketplace as the primary indicators
of the value of an information resource. 

3. Assume that you run a philanthropic 
organization that supports biomedi- 
cal informatics. In investing the scarce 
resources of your organization, you have 
to choose between funding a new system 
or resource development, or funding 
empirical studies of resources already 
developed. What would you choose? 
How would you justify your decision? 

4. To what extent is it possible to be cer- 
tain how effective a medical informatics 
resource really is? What are the most 
important criteria of effectiveness? 

5. Do you believe that independent, unbi- 
ased observers of the same behavior or 
outcome should agree on the quality of 
that outcome? 

6. Many of the evaluation approaches 
assert that a single unbiased observer 
is a legitimate source of information 
in an evaluation, even if that observ- 
er’s data or judgments are unsubstan- 
tiated by other people. Give examples 
drawn from our society where we vest 
important decisions in a single expe- 
rienced and presumed impartial indi- 
vidual. 

7. Do you agree with the statement that all 
evaluations appear equivocal when sub- 
jected to serious scrutiny? Explain your 
answer.

Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• What is the definition of an EHR?
• What are the functional components of
  an EHR?
• What are the benefits of an EHR?
• What are some of the impediments
  to development, configuration, and use
  of an EHR?


14.1 What Is an Electronic Health 
Record? 


The preceding chapters introduced the con- 
ceptual basis for the field of biomedical infor- 
matics, including the use of patient data in 
clinical practice and research. The chapters in 
this section cover the various technologies, 
systems, and approaches of biomedical infor- 
matics in practice. This chapter focuses on the 
patient record and associated systems, com- 
monly referred to as the patient’s chart, medi- 
cal record, or health record. In particular, we 
define and examine the use of electronic 
health records (EHRs),¹ discuss their purpose 
and functional components, potential benefits 
and costs, and describe current challenges and 
opportunities in their dissemination, optimal 
use, and innovation. 


1 The terms “electronic health record” (EHR) and 
“electronic health record system” (EHRS or EHR 
system) are often used interchangeably, with no gen- 
erally agreed upon distinction. The terms “elec- 
tronic medical record” (EMR) and “electronic 
medical record system” (EMRS or EMR system) 
have also been used but many have moved towards 
using the term “health” versus “medical” as these 
systems are being used broadly across the contin- 
uum of care by a multitude of roles; again, the dis- 
tinctions between EMR and EMRS are not 
generally agreed upon. The term “computer-based 
record system” has also been used in the past. In this 
chapter, and throughout the book, we will use the 
term “electronic health record”, with the acronyms 
“EHR” and “EHRs” to designate the singular and 
plural forms, respectively. 


14.1.1 Purpose of a Patient Record 


Stanley Reiser (1991) wrote that the purpose 
of a patient record is “to recall observations, 
to inform others, to instruct students, to gain 
knowledge, to monitor performance, and to 
justify interventions.” The many uses 
described in this statement, although diverse, 
have a single goal—to leverage patient data 
and information within the record to care for 
patients and to further health sciences, includ- 
ing the conduct of research and public health 
activities that address population health. 
Traditionally, the patient record was on paper 
and almost exclusively used by providers, 
nurses, and other care team members to docu- 
ment and facilitate the care of patients. The 
paper record contained clinical notes docu- 
menting assessments, decision-making, and 
care rendered; paper charts also often con- 
tained diagnostic results (e.g., laboratory and 
imaging results); physiologic information 
(e.g., vital signs); and patient care orders. With 
the increased digitization of health care, mod- 
ern EHRs have become increasingly ubiqui- 
tous; they are designed not only for patient 
care but also to facilitate broader value-added 
functions and views of patient information, 
providing much more than a static view of 
events. 


14.1.2 EHR Overview 


An EHR is an electronic repository of 
maintained information about an individual’s 
health status and health care. It includes 
functionality that enables the provision of 
care, and its information is stored to serve 
the multiple legitimate uses and users of the 
record. While 
traditional patient records have been illness- 
focused, health care is evolving to encourage 
health care providers to focus on the contin- 
uum of health and health care from wellness 
to illness and recovery. 

As a result of this shift and increasing 
system interoperability of EHRs, many 
anticipate that EHRs will increasingly carry 
a much greater portion of a person’s 
health-related 
information from a wide range of sources 
over their lifetime (e.g., diagnostic images, 
intensive care electrophysiologic monitoring, 
patient recorded biometrics, genomic infor- 
mation). Today, the Department of Veterans 
Affairs (VA) has already committed to keep- 
ing existing electronic patient data for 75 years. 
In many cases, EHRs incorporate or integrate 
with a range of additional multimedia sources, 
such as radiology images and echocardio- 
graphic video loops. 

EHRs also often include active tools, with a 
wide range of features, that are used to 
manage patients. These include information 
management tools that provide clinical 
reminders and alerts, dynamic tracking and 
trending that can often be personalized, 
linkages with knowledge sources for clinical 
decision support (CDS; see Chap. 26), and 
analysis of aggregate data both for care 
management and for research. The EHR also 
helps users organize, interpret, and react to 
patient data. Examples of tools provided in 
current EHRs are discussed in Sect. 14.3. As 
such, EHRs need to be able to provide 
different views and presentations of patient 
data to meet the needs of various user types 
and patient care contexts, along with 
associated functionality to serve patient care 
and the various secondary EHR uses described 
in Chap. 2. 

EHRs can also analyze a patient’s record, 
call attention to trends and dangerous condi- 
tions, and suggest corrective actions much like 
an airplane flight control information system. 
Another powerful aspect of EHRs is their 
ability to organize data at both the 
individual and the population level: one view 
can facilitate care for a single patient, 
while another can cover a population of 
patients to assist with care management 
decisions or to answer epidemiologic 
questions. 

One advantage of EHRs is the availability of 
information entry controls and capabilities. 
In addition to increased legibility compared 
to paper, EHRs can increase the quality of 
data by applying validity checks as data are 
being entered, such as checks for 
typographical errors and checks of dosing 
ranges for medications. EHRs can require data 
entry in specified fields, conditional on the 
value of other fields. As such, EHRs not only 
store data, but can also conditionally enforce 
the capture of certain data elements. This 
enforcement power should be used judiciously, 
however. Requiring the entry of unavailable 
data (e.g., the age of an unidentified patient 
receiving care during an emergency trauma), 
especially during order entry, could prevent 
the clinician from completing an important 
order needed for clinical care (Strom et al. 
2010). 
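
As a minimal sketch of conditional field enforcement that does not block urgent orders, the function below reports which fields are still missing. The field names (drug, dose_mg, weight_based, weight_kg, patient_age) and the emergency flag are hypothetical and are not taken from any particular EHR.

```python
from typing import Any, Dict, List


def missing_required_fields(order: Dict[str, Any], emergency: bool = False) -> List[str]:
    """Return the names of fields that must be filled before the order is accepted."""
    required = ["drug", "dose_mg"]
    # Conditional requirement: weight is needed only for weight-based dosing.
    if order.get("weight_based"):
        required.append("weight_kg")
    # Age may be unknown for an unidentified trauma patient, so it is
    # requested in routine care but never blocks an emergency order.
    if not emergency:
        required.append("patient_age")
    # A field counts as missing when it is absent, None, or an empty string.
    return [name for name in required if order.get(name) in (None, "")]
```

Called on a routine order that lacks an age, the function returns that one field name; with emergency=True the same order passes, which mirrors the judicious use of enforcement described above.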

The degree to which a particular EHR 
achieves its intended value depends on several 
factors: 


Comprehensiveness of information. Does the 
system include information from all 
organizations and clinicians who participated 
in a patient’s care and from all settings 
where care was delivered (e.g., office 
practice, hospital, homecare, care 
coordination, virtual care)? Does it include 
the full spectrum of clinical data, including 
clinical notes, laboratory test results, 
medication details, images, and 
patient-reported outcomes, including those 
collected by validated patient 
self-assessments? Increasingly, genetic data 
(including both germline and somatic tumor 
data) will become key to clinical care and 
EHRs. Incorporation of genetic data (including 
its interpretation(s) and analysis model(s)) 
will bring important data management and 
knowledge management issues, including data 
storage (e.g., 1100 patients for 1 year 
requiring 2 terabytes (Burykin 2011)) and the 
evolving interpretation(s) and computational 
analyses of these data (see Chap. 30). 


Duration of use and retention of data. EHRs 
gain value over time through accumulation of 
a greater proportion of the patients’ medical 
history. A record system with 5 years of 
patient data will be more valuable than one 
with only the last month’s records. Retention 
of medical records follows, at minimum, the 
statute of limitations for medical 
malpractice, which is set by state law. EHR 
records can be archived on various systems or 
maintained for ongoing access. 


Degree of structure of data. Narrative notes 
stored in EHRs have an advantage over their 
paper predecessors in that they can be 
searched by word or pattern. The success of 
such searches is limited by the sophistication 
of the EHR’s text mining and search 
capabilities and by the quality of the user’s 
search criteria, particularly since medical 
terms are often expressed in variable 
language, including abbreviations. 

A great deal of valuable information is 
contained in clinical notes, which are mostly 
recorded as narrative. Though often time 
consuming and cumbersome, one way to obtain 
structured data from clinical notes is to ask 
the clinician to enter key information through 
structured forms that restrict data entry to a 
controlled, standards-based vocabulary. 
Powerful automated techniques like natural 
language processing can also be increasingly 
leveraged to extract clinical data from notes 
(see Chap. 9). 
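
To illustrate why word or pattern search over narrative notes depends on handling variable language, the sketch below expands a query term into known variants before searching. The VARIANTS table is a hypothetical stand-in for a curated terminology, and the helper names are assumptions made for this example.

```python
import re
from typing import Dict, List

# Hypothetical variant table; a real system would draw on a curated terminology.
VARIANTS: Dict[str, List[str]] = {
    "myocardial infarction": ["myocardial infarction", "heart attack", "MI"],
}


def build_pattern(term: str) -> "re.Pattern[str]":
    """Compile a case-insensitive, whole-word pattern covering all known variants."""
    variants = VARIANTS.get(term.lower(), [term])
    alternation = "|".join(re.escape(v) for v in variants)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


def search_notes(notes: Dict[str, str], term: str) -> List[str]:
    """Return the ids of notes whose text mentions the term or one of its variants."""
    pattern = build_pattern(term)
    return [note_id for note_id, text in notes.items() if pattern.search(text)]
```

A search of this kind finds the abbreviation "MI" as well as "heart attack", but it does not recognize negation (e.g., "denies heart attack") or context, which is one reason more capable natural language processing is needed.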


Ubiquity of access. With today’s secure 
networks and other distributed technologies, 
clinicians and patients can access a patient’s 
EHR from geographically distributed sites. 
Paper records are far less accessible because 
they can be in only one place, with at most 
one user, at any point in time. Previously, 
completing discharge summaries and signing 
orders with paper records, or borrowing 
records from medical record departments for 
administrative or research purposes, was 
logistically challenging. In contrast, EHRs 
are ubiquitously available to users for these 
and other purposes. 

In some cases, the collective data about one 
patient from independent care systems can also 
be accessed through health information 
exchanges (see Chap. 17). Such availability 
can also support health care continuity during 
disasters. Brown et al. (2007) found a “stark 
contrast” between the care that VA and non-VA 
patients received after Hurricane Katrina, 
because appropriate and uninterrupted care was 
supported by nationwide access to the 
comprehensive VA EHRs. EHRs not only make data 
more accessible to authorized users, but they 
also provide the benefit of greater control 
over data and user access and improved 
enforcement of applicable privacy regulations 
as required by the Health Insurance 
Portability and Accountability Act (HIPAA) 
(see Chap. 31). 

System challenges can be experienced 
from several perspectives: 


Challenges with system use and training. 
Physicians and other key personnel have 
to take time from their work to learn how to 
use the system. Furthermore, clinical work- 
flows have to be re-designed in order to utilize 
the EHR effectively. It is increasingly appreci- 
ated that EHR usability needs to be improved 
so that clinicians can easily document and 
provide patient care. 


Readability of clinical notes. One of the 
advantages of EHRs is the ease with which text 
can be entered compared to paper notes. 
Unfortunately, cutting and pasting text and 
inserting other results (Sect. 14.3.1.2) can 
lead to what is referred to as “note bloat”: 
documentation that is voluminous compared to 
paper records, which often makes EHR notes 
more difficult to review efficiently. 


System failures and ensuring adequate 
redundancy and security. Computer-based 
systems have the potential for catastrophic 
failures that could cause extended 
unavailability of patients’ computer records. 
To combat this, EHRs today often run across 
distributed technologies offering redundancy, 
such as parallel operation at geographically 
separate sites and hot failover, in which 
separate computer systems run synchronously 
with the primary system and can take over near 
instantaneously if it fails. Yet, nothing 
provides complete protection; contingency 
plans with downtime procedures must be 
developed for handling brief (even planned) or 
longer system outages. Also, cybersecurity is 
increasingly a concern that health care 
systems are working to improve as attacks 
increase in sophistication and frequency (see 
Chap. 18). 


14.2 Historical Perspective: 
Development of EHRs 


The initial development of automated sys- 
tems in health care was stimulated by regula- 
tory and reimbursement requirements. Early 
health care systems in the inpatient setting 
provided charge capture functionality to meet 
billing requirements in a fee-for-service 
environment. 

The Flexner report on medical education 
was the first formal statement made about the 
function and contents of the medical record 
(Flexner 1910). In advocating a scientific 
approach to medical education, the Flexner 
report also encouraged physicians to keep a 
patient-oriented medical record. Three years 
earlier, Dr. Henry Plummer initiated the “unit 
record” for the Mayo Clinic (including its St. 
Mary’s Hospital), placing all the patient’s vis- 
its and types of information in a single folder. 
This innovation represented the first longitu- 
dinal medical record (Melton 3rd 1996). The 
Presbyterian Hospital (New York) adopted 
the unit record for its inpatient and outpatient 
care in 1916, studying the effect of the unit 
record on length of stay and quality of care 
(Openchowski 1925) and writing a series of 
letters and books about the unit record that 
disseminated the approach around the nation 
(Lamb 1955). 

The first record we could find of a 
computer-based medical record was a short 
newspaper article describing a new “electronic 
brain” used by the Michigan Hospital Service 
to replace punched and file index cards and to 
track hospital and medical records (Brain 
1956). The first operational EHRs emerged in 
the early 1970s. Some started as outpatient 
systems, including Costar from Massachusetts 
General Hospital (Grossman et al. 1973; 
Barnett et al. 1979; Barnett 1984), RMRS from 
the Regenstrief Institute (McDonald 1973; 
McDonald et al. 1975; McDonald et al. 1999), 
the system at Duke University (Stead 1977; 
Stead & Hammond 1988), STOR (Simborg and 
Whiting-O’Keefe III 1981), and others (see the 
outpatient EHR review by Kuhn et al. 1984). 
Other systems began on the inpatient side, 
including HELP (Warner 1972) and Lockheed’s 
hospital information system (HIS) at El Camino 
Hospital, which became operational in 1971 
(Coffey 1979). 

Weed’s problem-oriented medical record 
book (POMR) (1968) shaped medical think- 
ing about both manual and automated medi- 
cal records. His computer-based inpatient 
system followed (Schultz et al. 1971). Morris 
Collen, who also pioneered the multiphasic 
screening system (1969), wrote a readable 500- 
page history of medical informatics (1995) 
that provides rich details about these early 
medical records systems, as does a three- 
decade summary of computer-based medical 
record research projects from the U.S. Agency 
for Health Care Policy and Research 
(AHCPR) (Fitzmaurice et al. 2002). 

EHRs can provide Clinical Decision Support 
(CDS) by suggesting needed actions based on 
the patient data they carry. A few early 
systems, including HELP (Warner 1972; Pryor 
1988) and the RMRS (McDonald 1973, 1976), 
offered CDS as part of their initial design. 
Other early EHRs added CDS capability as 
they grew: the Columbia University system 
(Johnson et al. 1991; Hripcsak et al. 1999), the 
CCC (Center for Clinical Computing) system 
at Beth Israel Deaconess Medical Center 
(Rind et al. 1994; Slack and Bleich 1999; 
Bleich et al. 1985; Halamka and Safran 1998), 
and others (Giuse and Mickish 1996; Teich 
et al. 1999; Cheung et al. 2001; Duncan et al. 
2001; Brown et al. 2003). 

Since those early years, hundreds of 
commercial vendors have emerged to supply 
outpatient practice computer systems, and a 
few dozen have offered inpatient systems. 
However, as hospital systems merged into 
ever-larger aggregations and pulled in office 
practices through acquisitions, the boundaries 
between outpatient and inpatient computer 
systems have blurred and the number of health 
care system EHR vendors has shrunk 
considerably. Specialized clinical information 
systems that cover special needs within large 
and complex health care systems, including 
medical and imaging areas, nevertheless 
continue. Today, a majority of health systems 
in the US use a select few EHRs provided by 
major health information technology (IT) 
vendors. 


14.3 Functional Components 
of an EHR 


An EHR is not simply a recording of the 
patient’s clinical state but also has linkages 
and functional tools to facilitate communica- 
tion and decision-making. We summarize the 
components of a comprehensive EHR and 
illustrate functionality with examples from 
systems currently in use. The functional com- 
ponents are: 

1. Patient data capture, aggregation, and 
   review 
2. Computerized provider order entry 
3. Clinical decision support 
4. Access to knowledge resources 
5. Care team and patient communication 
6. Billing and coding 


Increasingly, requirements around certified 
Health IT for EHRs are resulting in systems 
with specific functionality and system 
behavior. Certified Health IT system 
requirements have led to EHRs being more 
standards-based and interoperable (2015 
Edition Certification Regulations, 170.315²). 


14.3.1 Patient Data Capture, 
Aggregation, and Review 


Providing an integrated view of relevant 
patient data and having functionality to enter 
and supplement patient data are overarching 
EHR goals. However, EHRs may miss certain 
patient data, including (1) patient data exist- 
ing only on old paper records, (2) data from 
care provided outside of the current organiza- 
tion (e.g., unconnected office practices, free- 
standing radiology centers, home-health 
agencies, nursing homes), and (3) differences 
in data representation despite electronic and 
organizational links. The last of these can 
result from different EHR vendors, different 
implementations of a given vendor’s system at 
different institutions, and unwillingness to 
share data. Though some progress has been made 
in sharing EHR data among institutions, 
especially when institutions have a common EHR 
vendor, sharing of EHR data remains 
challenging. For example, institutions may 
choose different sets of functionality from 
vendors, employ different business rules, and 
use different codes for identifying tests, 
measurements, and treatments. 


2 Certification of Health IT, Testing Process & Test 
Methods, 2015 Edition Test Method. 
https://www.healthit.gov/topic/certification-ehrs/2015-edition-test-method 
(Accessed 6/4/2020). 


14.3.1.1 Data Integration and Standards 


An integrated, mature EHR accommodates a 
broad spectrum of data types ranging from 
text to numbers and from signals (e.g., EKG 
waveform) to images as well as increasingly 
audio and video. More complex data such as 
radiology images are usually delivered for 
human viewing via the DICOM³ standard or 
general commercial imaging standards such as 
JPEG⁴ or motion JPEG, which is used for 
cardiac echocardiograms (see Chap. 12). 
Figure 14.1 shows an example screenshot of 
the WorldVistA CPRS EHR, which integrates a 
variety of text data and images into a 
patient report data screen, including 
demographics, a detailed list of the 
patient’s procedures, a DICOM chest x-ray 
image, and a JPEG photo of 
a skin lesion. Other tabs in the system provide 
links to problems, medications, orders, notes, 
consults, discharge summary, and labs. 

In addition to challenges with EHR clini- 
cal data exchange, another important chal- 
lenge in the US to the construction of an 
integrated view of patient data between sys- 
tems is the lack of a national patient identi- 
fier. Because each organization assigns its 
own medical record number, a receiving orga- 
nization cannot directly map a local medical 
record number from an external care 
organization to its own. Linking algorithms 
for identity management of patients are 
typically based on name, birth date, and other 
patient characteristics such as address and 
employment. The performance of these 
algorithms must be monitored for data 
integrity issues and errors, and associated 
processes must be created to manage patient 
identity in cases where the algorithms are not 
sufficient to adjudicate potential matches 
(Just et al. 2016; Zech et al. 2016). 


Fig. 14.1 A screenshot of the combined WorldVistA 
Computer Based Patient Record System (CPRS) and 
ISI Imaging system. These systems are derived from the 
Department of Veterans Affairs VistA and VistA Imaging 
systems (http://www.va.gov/vista_monograph/). The 
figure shows how clinical images can be presented with 
laboratory test results, medications, notes and other 
relevant clinical information in a single longitudinal 
medical record. (Source: Courtesy of WorldVistA 
(worldvista.org) and ISI Group (www.isigp.com), 2012) 


3 Digital Imaging and Communications in Medicine, 
https://www.dicomstandard.org/ (Accessed 6/4/2020). 

4 JPEG from Wikipedia, the free encyclopedia, 
http://en.wikipedia.org/wiki/JPEG (Accessed 6/4/2020). 

5 Certification of Health IT, Testing Process & Test 
Methods, 2015 Edition Test Method. 
https://www.healthit.gov/topic/certification-ehrs/2015-edition-test-method 
(Accessed 6/4/2020). 

6 Logical Observation Identifiers Names and Codes 
(LOINC®) from Regenstrief. http://loinc.org/ 
(Accessed 6/4/2020). 


One of the more significant barriers today to 
the integration of health record data from 
different organizations is the use of local 
and idiosyncratic identifiers to label 
observations and coded observation values, 
recapitulating the Babel story. However, those 
barriers are shrinking as Health IT 
regulations⁵ and institutions adopt 
terminology standards, including LOINC⁶ for 
observations, questions, variables, and 
assessments (McDonald et al. 2003; Vreeman et 
al. 2010); SNOMED CT⁷ (Wang et al. 2002) for 
diagnoses, symptoms, findings, organisms and 
answers; UCUM⁸ for computable units of 
measure; and RxNorm⁹,¹⁰ for clinical drug 
names, ingredients, and orderable drug names 
for various purposes (see also Chaps. 8 and 
31). Supporting this trend are laboratory 
instrument vendors, which are beginning to 
specify the LOINC codes to use for each of the 
test results that their instruments 
generate.¹¹ 
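
As a rough illustration of the demographic linking algorithms described earlier in this section (matching on name, birth date, and other characteristics), the sketch below scores agreement between two records and routes uncertain pairs to manual review. The fields, weights, and thresholds are illustrative assumptions only; production systems typically use tuned probabilistic methods and monitored adjudication processes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    family_name: str
    given_name: str
    birth_date: str   # ISO 8601 date, e.g. "1950-03-14"
    postal_code: str


def _norm(text: str) -> str:
    """Lower-case the text and drop everything except letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def match_score(a: PatientRecord, b: PatientRecord) -> int:
    """Sum illustrative weights for each demographic field on which the records agree."""
    score = 0
    score += 4 if _norm(a.family_name) == _norm(b.family_name) else 0
    score += 2 if _norm(a.given_name) == _norm(b.given_name) else 0
    score += 5 if a.birth_date == b.birth_date else 0
    score += 1 if _norm(a.postal_code) == _norm(b.postal_code) else 0
    return score


def classify(score: int, match_at: int = 11, review_at: int = 7) -> str:
    """Map a score to 'match', 'review' (manual adjudication), or 'non-match'."""
    if score >= match_at:
        return "match"
    return "review" if score >= review_at else "non-match"
```

The middle "review" band corresponds to the cases where an algorithm alone is not sufficient to adjudicate a potential match and a human process is needed.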

As healthcare providers consolidate and bring 
together EHR data or implement a new EHR, 
organizations have used a number of approaches 
to load EHRs with pre-existing patient data. 
One approach is to interface the EHR with the 
available electronic sources (e.g., dictation 
service, pharmacy system, and laboratory 
information system) and load data from these 
sources for a pre-specified length of time 
(e.g., 12 months). A second approach is to 
abstract select data (e.g., key laboratory 
results, the problem lists, and active 
medications) and enter those data, either 
automatically or by hand, into the new EHR 
prior to each patient’s visit for a period of 
time. The third approach is to scan and store 
1-2 years of the old paper records or to 
produce electronic “printout” versions (e.g., 
as Portable Document Format [PDF]) of content 
stored in the preceding EHR. This approach can 
be applied to any kind of document, including 
handwritten records that predate the EHR 
installation. Optical Character Recognition 
(OCR) capability is built into most document 
scanners today, and converts typed text within 
scanned documents to computer understandable 
text with 98-99% character accuracy, which can 
make this content potentially searchable. 


Fig. 14.2 A block diagram of multiple-source-data 
systems that contribute patient data, which ultimately 
reside in a computerized patient record (CPR). The 
database interface, commonly called an interface engine, 
performs a number of functions. It may simply be a 
router of information to the central database. It may 
also provide more intelligent filtering, translating, and 
alerting functions, as it does at Columbia University 
Medical Center. (Source: Courtesy of Columbia 
University Medical Center, New York) 


7 SNOMED Clinical Terms® (SNOMED CT®) Five-step 
briefing. https://www.snomed.org/snomed-ct/ 
(Accessed 6/4/2020). 

8 The Unified Code for Units of Measure. 
http://unitsofmeasure.org/ (Accessed 6/4/2020). 

9 RxNorm Overview. 
http://www.nlm.nih.gov/research/umls/rxnorm/overview.html 
(Accessed 6/4/2020). 

10 RxTerms. 
https://wwwcf.nlm.nih.gov/umlslicense/rxtermApp/rxTerm.cfm 
(Accessed 6/4/2020). 

11 https://ivdconnectivity.org/fda-encourages-livd 
(Accessed 6/4/2020). 


Today, most clinical data sources and EHRs can 
send and receive clinical content as version 
2.x Health Level 7 (HL7)¹² messages. Most 
organizations use interface engines for HL7 
messages and integration platforms (either 
part of the integration engine or a separate 
technology platform) that can support other 
data formats to send, receive, and, when 
necessary, translate the format of and the 
codes within exchanged data (see Chap. 17); 
Fig. 14.2 shows an example of an architecture 
to integrate data from multiple source 
systems. The Columbia University Medical 
Center computerized patient record (CPR) 
interface depicted in this diagram not only 
provides message-handling capability but can 
also automatically translate codes from the 
external source to the preferred codes of the 
receiving EHR. And although many vendors now 
offer single systems that serve “all” needs, 
they never escape the need for standards-based 
data exchange from ancillary systems (e.g., 
EKG carts, cardiology systems, radiology 
imaging systems, anesthesia systems, off-site 
laboratories, community pharmacies and 
external collaborating health systems). At 
least one high-capability open-source 
interface engine, NextGen Connect (formerly 
Mirth Connect),¹³,¹⁴ is available and used 
relatively widely for data exchange. 


12 Health Level Seven International, 
http://www.hl7.org/ (Accessed 6/4/2020). 
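
The code translation that an interface engine may perform can be sketched as follows for a single HL7 version 2.x OBX (observation) segment. This is a simplified illustration, not the Columbia or NextGen Connect implementation: the local codes in LOCAL_TO_LOINC are invented, and a real engine would also honor the delimiters declared in the MSH segment and HL7 escape sequences.

```python
from typing import Dict, Tuple

# Hypothetical local-to-LOINC map; the local codes are invented for illustration.
LOCAL_TO_LOINC: Dict[str, Tuple[str, str]] = {
    "K": ("2823-3", "Potassium [Moles/volume] in Serum or Plasma"),
    "GLU": ("2345-7", "Glucose [Mass/volume] in Serum or Plasma"),
}


def translate_obx(segment: str) -> str:
    """Rewrite OBX-3 (the observation identifier) from a local code to LOINC."""
    fields = segment.split("|")
    if fields[0] != "OBX" or len(fields) < 4:
        return segment                      # route other segments unchanged
    local_code = fields[3].split("^")[0]    # OBX-3 is code^text^coding system
    if local_code in LOCAL_TO_LOINC:
        loinc, name = LOCAL_TO_LOINC[local_code]
        fields[3] = f"{loinc}^{name}^LN"    # "LN" designates LOINC in HL7 v2
    return "|".join(fields)
```

Segments that are not OBX, or whose local code has no mapping, pass through untouched, which reflects the simple routing role; the mapped case reflects the more intelligent translating role.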

HL7 Fast Healthcare Interoperability Resources 
(FHIR) is an elegant application-programming 
interface for exchanging clinical data (see 
Chap. 17) with a recognizable heritage from 
V2. In 2018, it was embraced by a surge of 
large organizations, including Apple (Apple 
Health), Microsoft, and Google (Mandl et al. 
2019), many federal agencies (CMS, ONC, the 
Veterans Administration), major EHR vendors, 
and health insurance companies. Most of them 
are also using the related SMART on FHIR App 
specification (Mandel et al. 2016), with which 
users can develop Apps designed to access data 
within EHRs from outside of that EHR. 
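
To give a flavor of what an outside App does with FHIR, the sketch below builds a search URL for one patient's observations with a given LOINC code and summarizes a returned Observation resource. The base URL passed in is a placeholder assumption, no network call is made, and authorization (which SMART on FHIR handles) is omitted.

```python
import json
from urllib.parse import urlencode


def observation_search_url(base_url: str, patient_id: str, loinc_code: str) -> str:
    """Build a FHIR search URL for one patient's observations with a LOINC code."""
    # A FHIR token search parameter has the form system|code.
    params = {"patient": patient_id, "code": f"http://loinc.org|{loinc_code}"}
    return f"{base_url}/Observation?{urlencode(params)}"


def summarize_observation(resource_json: str) -> str:
    """Extract the first code, the value, and the unit from an Observation resource."""
    obs = json.loads(resource_json)
    coding = obs["code"]["coding"][0]
    quantity = obs["valueQuantity"]
    return f'{coding["code"]}: {quantity["value"]} {quantity["unit"]}'
```

The summarizer assumes a numeric result carried in valueQuantity; observations with coded or text values would need other branches.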


14.3.1.2 Clinician Data Entry 


Clinical data may be entered as narrative free- 
text, as codes, or as a combination of the two. 
Trade-offs exist between the use of codes and 
narrative text. The major advantage of struc- 
tured data is that it makes the data “under- 
standable” to the computer and thus enables 
selective retrieval, clinical research, quality 
improvement, and clinical operations man- 
agement. The coding of diagnoses, allergies, 
problems, orders, and medications is of par- 
ticular importance for these purposes. 

13 https://github.com/nextgenhealthcare (Accessed 
6/4/2020). 

14 NextGen Connect Integration Engine. 
https://www.nextgen.com/products-and-services/integration-engine 
and https://github.com/nextgenhealthcare (Both 
accessed 6/4/2020). 

Because of the chance of errors with the hand 
entry of data, EHRs apply validity checks 
scrupulously. A number of different kinds of 
checks apply to clinical data (Schwartz et al. 
1985). Range checks can detect or prevent 
entry of values that are out of range (e.g., a 
serum potassium level of 50.0 mmol/L, which is 
impossibly far outside the normal range of 
3.5-5.0 mmol/L). Pattern checks, including 
regular expressions, can verify that the 
entered data have a required pattern (e.g., 
the three digits, hyphen, and four digits of a 
local telephone number). Range and pattern 
checks, among others, can be implemented using 
standard browser features. Computed checks can 
verify that values have the correct 
mathematical relationship (e.g., white blood 
cell differential counts, reported as 
percentages, must sum to 100). Consistency 
checks can detect errors by comparing entered 
data (e.g., the recording of cancer of the 
prostate as the diagnosis for a female 
patient). Delta checks warn of large and 
unlikely differences between the values of a 
new result and of the previous observations 
(e.g., a recorded weight that changes by 
100 lbs. in 2 weeks). Spelling checks verify 
the spelling of individual words; software for 
accomplishing this is embedded in standard web 
browsers. 
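
The check types described above can each be sketched in a few lines. The function names, the plausibility limits passed to the range check, and the tolerance on the differential sum are assumptions chosen for illustration rather than values from any specific EHR.

```python
import re
from typing import Dict

PHONE_PATTERN = re.compile(r"\d{3}-\d{4}")   # three digits, hyphen, four digits


def range_check(value: float, low: float, high: float) -> bool:
    """True when the value lies within the plausible limits."""
    return low <= value <= high


def pattern_check(text: str) -> bool:
    """True when the text has the form of a local telephone number."""
    return PHONE_PATTERN.fullmatch(text) is not None


def computed_check(differential: Dict[str, float], tolerance: float = 0.5) -> bool:
    """True when differential percentages sum to 100 within a rounding tolerance."""
    return abs(sum(differential.values()) - 100.0) <= tolerance


def consistency_check(sex: str, diagnosis: str) -> bool:
    """False when the diagnosis conflicts with the recorded sex."""
    male_only = {"cancer of the prostate"}
    return not (sex == "female" and diagnosis.lower() in male_only)


def delta_check(previous: float, current: float, max_change: float) -> bool:
    """True when the change from the previous value is within the allowed delta."""
    return abs(current - previous) <= max_change
```

Each function returns True when the entry passes, so a data entry form could run all applicable checks and warn or block only on the failures.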

Clinician-gathered patient information 
requires special comment because it presents 
one of the most difficult challenges to EHR 
developers and users. Physicians spend at 
least 20% of their time documenting the 
clinical encounter (Gottschalk and Flocke 
2005; Hollingsworth et al. 1998). The burden 
has risen over time for several reasons 
(Poissant et al. 2005). EHRs tend to require 
far more data entry than the pre-existing 
manual systems. Many studies suggest that the 
EHR functions taken together may consume up to 
1-2 hours of the physician’s free time per 
clinic day (Sinsky et al. 2016; McDonald et 
al. 2014). In one study, the computer system 
was a primary cause of clinician 
dissatisfaction and a reason clinicians gave 
for leaving military medicine (Edgar 2009). In 
addition, EHR documentation requirements have 
repeatedly been cited as a significant cause 
of physician burnout (Gardner et al. 2019). 
EHRs tend to forbid simple narrative text 
responses, so providers have to dig through 
menus to find the coded term that expresses 
their intended meaning. Billing requirements 
and fear of malpractice have fueled the demand 
for ever more data entry. Adding to providers’ 
data entry pain is the fact that EHR user 
interfaces can be clumsy and non-intuitive. 

The requirement that providers enter all of 
their findings, impressions and plans into 
their note and orders into the computer comes 
from a dictum that says the person who creates 
the content should enter it. This dictum often 
makes sense for prescriptions, orders, and 
perhaps diagnoses and procedure orders, 
because immediate provider entry during the 
course of care makes diagnostic testing, 
treatments and checkout more efficient and 
provides crucial grist for CDS. The 
justification for direct entry of visit notes 
by clinicians is weaker because its time cost 
to physicians is high and the information is 
not a prerequisite to the checkout process. 

Today, clinical notes can be entered into 
the EHR via one of three general mecha- 
nisms: (1) transcription or use of speech rec- 
ognition systems to convert spoken word to 
dictated or written notes, (2) clinic staff 
(scribes) who transfer the text or codes of 
providers’ written or spoken content into the 
computer system, and (3) direct data entry by 
physicians themselves (potentially facilitated 
by electronic templates or macros). 

Free-form narrative entry—by typing, dicta- 
tion, or speech recognition—allows clinicians to 
express important clinical information in the 
most natural manner. When clinicians commu- 
nicate in narrative, they naturally prioritize find- 
ings and leave much information implicit. For 
example, an experienced clinician often leaves 
out “pertinent negatives” (i.e., findings that the 
patient does not have but that nevertheless 
inform the decision-making process) knowing 
that the clinician who reads the record will inter- 
pret them properly to be absent. The result is 
usually a more concise history with a high sig- 
nal-to-noise ratio that not only shortens the data 
capture time but also lessens the cognitive bur- 
den on the reading clinician. Weir and colleagues 
present compelling evidence about these advan- 
tages, especially when narrative is focused and 
vivid, and emphasize that too much information 
interferes with inter-provider communication 
(Weir et al. 2011). 

Most EHRs let physicians cut and paste 
notes from previous visits and other sources or 
even have automated functionality to “bring 
forward” content from previous visits. For 
example, a physician can cut and paste parts of 
a visit note into a letter to a referring physician 
and into an admission note, a most appropriate 
use of this capability. However, when overused,
this can cause ‘note bloat.’ In addition, without
proper attention to detail, users may copy
information that is no longer pertinent or true,
or lose context, especially with time expressions
(e.g., “yesterday”). Studies report high rates of
duplication, with more than 50% of text carried
over from previous notes
(Wrenn et al. 2010; Zhang et al. 2014).
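The share of a note that was carried forward from an earlier note can be estimated with a simple sequence comparison. The following is a minimal sketch and not the method used in the cited studies: it assumes notes are plain strings, compares them word by word, and the function name `duplicated_fraction` and the sample notes are made up for illustration.

```python
from difflib import SequenceMatcher


def duplicated_fraction(previous_note: str, current_note: str) -> float:
    """Return the share of words in current_note (0.0 to 1.0) that fall in
    blocks also found, in the same order, in previous_note."""
    prev_words = previous_note.split()
    curr_words = current_note.split()
    if not curr_words:
        return 0.0
    # autojunk=False keeps frequent words from being ignored in long notes.
    matcher = SequenceMatcher(a=prev_words, b=curr_words, autojunk=False)
    copied = sum(block.size for block in matcher.get_matching_blocks())
    return copied / len(curr_words)


# Hypothetical notes; "yesterday" is copied forward and silently changes meaning.
previous = "Patient reports chest pain since yesterday . Lungs clear ."
current = (
    "Patient reports chest pain since yesterday . Lungs clear . "
    "Started aspirin ."
)
print(f"{duplicated_fraction(previous, current):.0%} of the current note is carried forward")
```

A word-level comparison is a deliberate simplification; it flags how much text was reused but cannot tell whether the reused text is still true.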

Dictation with transcription has been com- 
mon historically for entering narrative informa- 
tion into EHRs. Transcriptionists are often able 
to maintain a degree of structure in the tran- 
scribed document via section headers, and the 
structure can also be delivered as an HL7 CDA 
document (Ferranti et al. 2006). Speech recognition
software offers an approach to “dictating”
without the cost or delay of transcription by
translating clinician speech to text automatically.
Historically, these systems made errors, and
finding and correcting the misunderstood words
took significant time. Speech recognition
algorithms have since improved, and companies
can reach accuracies better than 99% without
training. Skeptical readers can try it themselves.¹⁵

In addition, some dictation services use 
speech recognition to generate a draft transcrip- 
tion, which the transcriptionist corrects while 
listening to the audio dictation, thus saving
transcriptionist time. Natural-language processing
(NLP) (see Chap. 9) offers hope for automatic
encoding of narrative text (Nadkarni et al.
2011). Some companies are exploring the use of
NLP to auto-encode transcribed text, and
employ the transcriptionist to correct any NLP
coding errors.

Some practices have scribes (a variant on 
the stenographers of old) to do much of the 
physicians’ data entry work (Koshy et al. 2010; 
Misra-Hebert et al. 2016). Scribes typically 
work alongside the care provider in the
examination room, or remotely through an
audiovisual connection or recording. The Joint
Commission is agnostic about the use of
scribes, but provides guidelines for their use
(The Joint Commission 2018).¹⁶


15 https://cloud.google.com/speech-to-text/

16 The Joint Commission: Documentation assistance
provided by scribes. https://www.jointcommission.org/en/standards/standard-faqs/nursing-care-center/record-of-care-treatment-and-services-rc/000002210/
(Accessed 6/4/2020).

Another data-entry method is to have clinicians
record information on a structured form,
from which data and associated documentation
are created (Downs et al. 2006; Hagen
et al. 1998). One system, called CHICA
(Anand et al. 2018), originally generated a
patient-specific, scannable paper document
and used optical character and mark recognition
to capture the recorded data in a two-step
process. Today, data can be captured from
a handheld electronic tablet or via a webpage
(Anand et al. 2017). In addition to producing a
child-specific data-capture form completed by the
child’s family in the waiting room
(Fig. 14.3a), the CHICA computer uses the
entered data to generate a physician encounter
form with a tailored agenda for the encounter
(Fig. 14.3b). The CHICA system generates
a prose version of the associated form responses,
which can be incorporated into, and used to
help generate, a clinical note.

The third alternative is the structured, 
coded entry of data by clinicians. A major 
issue associated with direct physician entry is 


Fig. 14.3 a The family completes the first form with
questions tailored to the patient’s age and other factors.
The form can be displayed on a tablet or printed on a
tailored paper form that is scanned by an OCR system that
passes the content to the EHR. b The computer generates
a physician encounter form based on the contents
of the first form and adds reminders. The form is
displayed as a web form in the EHR or printed on paper
that an OCR system interprets. Coded results are stored
in the computer and a prose version is returned by the
system to be incorporated in the physician’s note.
(Source: Courtesy of Prof Stephen M Downs, Indiana
University, Indianapolis, IN)


the physician time cost. Studies document
that structured data entry consumes more
clinician time than traditional record keeping
(Chaudhry et al. 2006), as much as
20 seconds per SNOMED CT coded diagnosis
(Fung et al. 2011). On the other hand, this
option has the advantage that the computer
can immediately check the entry for consistency
with previously stored information and
can ask for additional detail or dimensions
conditional on the information just entered.
Some of these data will be entered into fields
with menu selection. For ease of entry, such
menus should not be very long, require
scrolling, or impose a rigid hierarchy (Kuhn et al.
1984). Using a process called autocomplete,
clinicians can code items by typing in a few
letters of an item name, then choosing the item
they need from the list of items that match the
string they entered. In some cases this process
can be fast and efficient, but critics have
described it as “death by a thousand clicks”
(Fry and Schulte 2019).
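The autocomplete lookup described above can be sketched as a prefix search over a sorted term list. This is an illustrative sketch only, not how any particular EHR implements it: the class name `Autocomplete`, the term list, and the codes (`DX-001` and so on) are hypothetical placeholders rather than real SNOMED CT content.

```python
from bisect import bisect_left


class Autocomplete:
    """Prefix lookup over a sorted, case-folded list of display names."""

    def __init__(self, terms):
        # terms: iterable of (display_name, code) pairs.
        self._entries = sorted((name.casefold(), name, code) for name, code in terms)
        self._keys = [entry[0] for entry in self._entries]

    def suggest(self, typed, limit=5):
        """Return up to `limit` (display_name, code) pairs matching the prefix."""
        prefix = typed.casefold()
        # Binary search finds the first key that is >= the typed prefix.
        start = bisect_left(self._keys, prefix)
        matches = []
        for key, name, code in self._entries[start:]:
            if not key.startswith(prefix) or len(matches) >= limit:
                break
            matches.append((name, code))
        return matches


# Hypothetical pick list with placeholder codes.
picker = Autocomplete([
    ("Hypertension", "DX-001"),
    ("Hyperlipidemia", "DX-002"),
    ("Hypothyroidism", "DX-003"),
    ("Asthma", "DX-004"),
])
print(picker.suggest("hyper"))
```

Capping the list with `limit` reflects the advice above that menus should stay short; a production picker would also rank matches by how often each term is used.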

The use of templates and menus can speed
note entry, but it can also generate excessive
boilerplate and discourage specificity (i.e., it is
easier to pick an available menu option than to
describe a finding or event in detail). Notes
written via templates may not convey as clear,
or as accurate, a picture of the patient’s state as
a provider note written in narrative. Developers
can build separate data capture forms using
APIs specific to the EHR manufacturer, resulting
in forms that are easier to use and that often
adapt to various form factors.

Among its many other capabilities, FHIR
has developed specifications for web data
capture forms. Questionnaire¹⁷ is FHIR’s data
capture resource. Questionnaire supports skip
logic, nested and repeating groups of questions,
and many simple data validation checks. It
can thus accommodate complicated
forms such as the Surgeon General’s family
history (see Fig. 14.4a). FHIR has developed
an enhanced version of Questionnaire called
Structured Data Capture (SDC),¹⁸ in STU4 trial
use. It adds many capabilities to Questionnaire,
including arithmetic and logical calculations
(see Fig. 14.4b for BMI and Fig. 14.4c for the
Apgar score) and mechanisms for pre-populating
forms with existing patient data from an
EHR. It also includes regular expressions for
validation of data entry along with mechanisms
for storing form content data into designated
FHIR resources. Finally, SDC supports adaptive
(CAT) survey instruments such as
PROMIS¹⁹ patient reported outcomes.
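To make skip logic and calculated fields concrete, here is a small sketch. It assumes a Questionnaire held as a Python dict whose field names (`resourceType`, `item`, `linkId`, `type`, `enableWhen`, `question`, `operator`, `answerBoolean`) follow the FHIR R4 Questionnaire resource; the questions, the helper functions, and the answer dictionary are invented for illustration. The calculations are written in plain Python and do not reproduce the FHIRPath expressions or extension syntax that SDC forms actually use.

```python
questionnaire = {
    "resourceType": "Questionnaire",
    "status": "draft",
    "item": [
        {"linkId": "seizures", "text": "Has the child ever had a seizure?", "type": "boolean"},
        {
            "linkId": "seizure-count",
            "text": "How many seizures in the past year?",
            "type": "integer",
            # Skip logic: ask only when the answer to "seizures" is true.
            "enableWhen": [
                {"question": "seizures", "operator": "=", "answerBoolean": True}
            ],
        },
        {"linkId": "weight-kg", "text": "Weight (kg)", "type": "decimal"},
        {"linkId": "height-m", "text": "Height (m)", "type": "decimal"},
    ],
}


def is_enabled(item, answers):
    """Show an item only if all of its enableWhen conditions hold.
    Only the '=' operator on boolean answers is handled in this sketch."""
    for condition in item.get("enableWhen", []):
        if condition["operator"] != "=":
            raise NotImplementedError(condition["operator"])
        if answers.get(condition["question"]) != condition["answerBoolean"]:
            return False
    return True


def visible_link_ids(form, answers):
    """List the linkIds of the questions that should currently be displayed."""
    return [item["linkId"] for item in form["item"] if is_enabled(item, answers)]


def bmi(weight_kg, height_m):
    """Body mass index in kg/m2, rounded to one decimal place."""
    return round(weight_kg / height_m ** 2, 1)


def apgar_total(component_scores):
    """Sum the five Apgar component scores (each 0, 1, or 2)."""
    return sum(component_scores)


print(visible_link_ids(questionnaire, {"seizures": False}))
print(visible_link_ids(questionnaire, {"seizures": True}))
print(bmi(70.0, 1.75), apgar_total([1, 0, 1, 2, 2]))
```

In a real SDC form the renderer re-evaluates these rules whenever an answer changes, which is what lets a derived value such as BMI appear as soon as height and weight are entered.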


17 https://www.hl7.org/fhir/questionnaire.html
(Accessed 6/4/2020).

18 http://hl7.org/fhir/uv/sdc/2019May/ (Accessed
6/4/2020).

19 http://www.healthmeasures.net/explore-measurement-systems/promis/intro-to-promis
(Accessed 6/4/2020).

The long-term solution for effective data
capture of information generated by clinicians
is still evolving. Semi-structured data
entry combines narrative text fields
amenable to natural language processing
with structured data entry fields
where needed. With time and better input
devices, acquisition of data will become
faster and easier. In addition, direct entry of
some data by patients could reduce the data
entry burden for clinicians (Janamanchi et al.
2009).


14.3.1.3 Data Display 


Once patient data are stored in the computer,
EHRs can present them in different formats for
different purposes, including novel formats.
Clinicians need
more than just integrated access to patient
data; they also need various views of these
data: in chronologic order as flowsheets or
graphs to highlight changes over time, and as
snapshots that show a computer view of the
patient’s current status and most important
observations.


Timeline Graphs 

A graphical presentation can help the physician
assimilate the information and draw
conclusions from it quickly (Fafchamps et al. 1991;
Tang and Patel 1994; Starren and Johnson 2000). An
anesthesia system vendor provides an especially
good example of the use of numbers
and graphics in a timeline to convey the
patient’s state in a form that can be digested at
a glance (Vigoda and Lubarsky 2006).
Sparklines, “small, high resolution graphics
embedded in a context of words, numbers,
images” (Tufte 2006), which today’s
browsers and spreadsheets can easily generate,
provide a way to embed graphic timelines
into any report. One study found that
with sparklines, “physicians were able to
assess laboratory data faster”. Sparklines
enable more information to be presented
more compactly in a single view and thus
reduce the need to scroll or flip between
screens (Bauer et al. 2010).
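A sparkline needs very little machinery. The sketch below renders a series of values as a one-line text sparkline using Unicode block characters; it is an illustration of the idea rather than the rendering used in the cited study, and the function name `sparkline` and the sample values are assumptions.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to one of eight bar heights, scaled to the series range."""
    if not values:
        return ""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # A flat series is drawn at the lowest bar height.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    return "".join(BARS[round((v - low) / span * top)] for v in values)


# Hypothetical serum creatinine results (mg/dL), oldest first.
creatinine = [0.9, 1.0, 1.1, 1.4, 1.9, 1.6, 1.2]
print("Creatinine", sparkline(creatinine), creatinine[-1], "mg/dL")
```

Because the output is ordinary text, the same line can sit inside a table cell or a sentence, which is the property that makes sparklines compact.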
Fig. 14.4 a A FHIR questionnaire with the same content
as the Surgeon General’s Family History, rendered by the NLM
FHIR Questionnaire App. The questionnaire asks questions about
the proband and about each disease the proband experienced and
when it occurred. These two questions repeat for each such
disease the proband has experienced. Then it asks almost the
same set of questions about each of the proband’s relatives as
it asked about the proband, and this whole set of questions
repeats for as many relatives as the user wants to enter.
The same pair of questions about disease and age range also
repeats within each set of questions about relatives. See the
FHIR definition for such forms at http://hl7.org/fhir/uv/sdc/2019May/.
b An SDC form rendered by the NLM FHIR Questionnaire App that
captures height and weight (among other things) and
automatically computes BMI via a FHIRPath expression as soon
as these data elements are available. More information about
FHIRPath can be found at https://github.com/lhncbc/fhirpath.js.
Try it live at the demo URL (https://lhcforms.nlm.nih.gov/sdc).
c A rendered SDC form for the Apgar instrument to rate the
health of a newborn. The answer to each question has an
associated pre-defined score. This SDC form computes the
overall Apgar score by adding the answers’ scores as the
user selects them. The overall score appears at the bottom
of the form. See the demo URL
(https://lhcforms.nlm.nih.gov/48334-7). You can click on the
gears to change the input control


Fig. 14.4 (continued) Panels: a US Surgeon General family health
history form; b vital signs, weight, height, head circumference,
oximetry, BMI, and BSA panel; c Apgar panel, 5 minutes post birth


Timeline Flowsheets 


Figure 14.5 shows an integrated view of a
flowsheet of radiology impressions, with
the rows representing different kinds of
radiology examinations and the columns
representing study dates. Clicking on the radiology
image icon brings up the radiology images.
Figure 14.6 shows the previously highly
popular pocket rounds report, which provides
laboratory and nursing measurements as a
very compact flowsheet that fits in a white
coat pocket (Simonaitis et al. 2006).

Fig. 14.5 A flowsheet of radiology reports. Each
row reports one kind of study and each column reports
one date. Each cell shows the impression part of the
radiology report as a quick summary of the content of
that report. The cells include two icons. Clicking on the
report icon provides the full radiology report. Clicking
on the radiology image icon provides the images.
(Source: Courtesy of Regenstrief Institute, Indianapolis,
IN). The CareWeb program from Regenstrief, which
generated this flowsheet, presents cross-institutional
patient flowsheets from the Indiana HIE to office
practices today

Flowsheets and other formats can be specialized
for management of a particular problem.
For example, a flowsheet used to monitor
patients who have hypertension (high blood
pressure) might contain values for weight,
blood pressure, heart rate, and medications
that control hypertension with doses, as well as
results of laboratory tests that monitor
complications of hypertension or of the
medications used to treat it. Physicians at the
University of Wisconsin and University of
Texas-Southwestern have developed tables of
mappings from problem categories (e.g., renal
failure, ischemic heart disease) to observations
and to medications. The mappings were
developed through a multi-institutional consensus
process and use universal code specifications:
SNOMED CT or ICD for
defining problem classes, LOINC for observations,
and RxNorm for medications
(Fig. 14.7) (Buchanan 2017; Willett et al.
2018). These Problem Concept Maps
will be available for free download²⁰,²¹ under a
LOINC-like agreement. In the example of a
Problem Oriented View in Fig. 14.7, the display shows
medications and lab results relevant to the
problem of acute systolic heart failure. If a
user chose a different problem in the panel on
the left, it would display content relevant to
that problem.
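A problem-oriented flowsheet combines two steps: filter observations to those mapped to the chosen problem, then pivot them into rows (one per observation type) and columns (one per date). The sketch below shows both steps under stated assumptions. The LOINC codes are real identifiers (2160-0 creatinine, 2823-3 potassium, 2951-2 sodium, 4548-4 hemoglobin A1c), but the problem-to-code mapping, the observation values, and the function name `problem_flowsheet` are invented for illustration and are not taken from the published Problem Concept Maps.

```python
from collections import defaultdict

# Illustrative mapping only; a real map would key on SNOMED CT or ICD codes.
PROBLEM_TO_LOINC = {
    "heart failure": {"2160-0", "2823-3", "2951-2"},
    "type 2 diabetes": {"4548-4", "2160-0"},
}


def problem_flowsheet(problem, observations):
    """Return (dates, rows): rows maps each relevant LOINC code to a list of
    values aligned with the sorted dates, with None where no result exists."""
    wanted = PROBLEM_TO_LOINC.get(problem, set())
    cells = defaultdict(dict)
    for obs in observations:
        if obs["code"] in wanted:
            cells[obs["code"]][obs["date"]] = obs["value"]
    dates = sorted({date for by_date in cells.values() for date in by_date})
    rows = {
        code: [by_date.get(date) for date in dates]
        for code, by_date in sorted(cells.items())
    }
    return dates, rows


# Hypothetical results for one patient (ISO dates sort chronologically).
observations = [
    {"code": "2823-3", "date": "2019-10-01", "value": 4.1},
    {"code": "2160-0", "date": "2019-10-01", "value": 1.0},
    {"code": "2823-3", "date": "2019-10-15", "value": 4.6},
    {"code": "4548-4", "date": "2019-10-15", "value": 7.2},
]
dates, rows = problem_flowsheet("heart failure", observations)
print(dates)
for code, values in rows.items():
    print(code, values)
```

Choosing a different problem changes only the set of codes used as the filter, which mirrors how the view in the figure changes when the user selects another problem.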


20 125 SNOMED CT groupers published as online supplement
to the open-access article above in reference #3.
https://www.thieme-connect.de/media/10.1055-s-00035026/201803/supmat/10-1055-s-0038-1668090-s180031ra.pdf
(Accessed 6/4/2020).

21 See Problem List MD at https://problemlist.org
(Accessed 6/4/2020).

Fig. 14.6 The Pocket rounds report, so called
because when folded from top to bottom it fits in the
clinician’s white coat pocket as a booklet. It is a dense
report (12 lines per inch, 36 characters per inch, printed
in landscape mode on one 8 1/2 x 11 in. page) and
includes all active orders (including medications),
recent laboratory results, vital signs, and the summary
impressions of radiology, endoscopy, and cardiology
reports. (Source: Courtesy of L. Simonaitis, Regenstrief
Institute, Indianapolis, IN)

The LHC Flowsheet on FHIR app (named
for the NLM’s Lister Hill Center) is an open
source web app (https://lhcflowsheet.nlm.nih.gov/)
that generates clinical flowsheets
from any FHIR server with LOINC coded
observation identifiers (Fig. 14.8a, b). It
can display very large datasets (the example
presents a patient with more than 20 thousand
observations), can scroll quickly along the X
and Y axes, and can show or hide selected groups
of variables. This app can also provide a
problem-focused flowsheet as well as time axis
(column) compression, variable (row) axis
compression, and unit conversion according
to user preferences. It can also convert
observations with mixed units of measure into a
preferred unit of measure using LHC’s unit
validation/converter (UCUM).²²
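Converting between molar and mass concentrations is plain arithmetic once the molar mass of the analyte is known: a value in mmol/L multiplied by the molar mass in g/mol gives mg/L, and dividing by 10 gives mg/dL. The sketch below illustrates that step only. It does not use the LHC UCUM converter; the function name `to_mg_per_dl` and the small molar-mass table are assumptions made for this example.

```python
# Molar masses in g/mol for the analytes used in this example.
MOLAR_MASS_G_PER_MOL = {"glucose": 180.156, "calcium": 40.078}


def to_mg_per_dl(analyte, value, unit):
    """Express a concentration in mg/dL, converting from mmol/L if needed."""
    if unit == "mg/dL":
        return value
    if unit == "mmol/L":
        # mmol/L * g/mol = mg/L; 1 L = 10 dL, so divide by 10 for mg/dL.
        return value * MOLAR_MASS_G_PER_MOL[analyte] / 10
    raise ValueError(f"unsupported unit: {unit}")


# Hypothetical mixed-unit glucose results folded into one row.
results = [(101.0, "mg/dL"), (5.5, "mmol/L"), (92.0, "mg/dL")]
row = [round(to_mg_per_dl("glucose", value, unit), 1) for value, unit in results]
print(row)
```

A general converter also has to parse unit strings and check that the two units are commensurable, which is the part a UCUM library handles.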


Summaries and Snapshots 

EHRs can highlight important components
(e.g., active allergies, active problems, active
treatments, and recent observations) in clinical
summaries or snapshots (Tang et al.
1999). Figure 14.9 shows an example from
the Epic EHR where active patient problems,
active medications, allergies, health maintenance
reminders, and other relevant summary
information are displayed. These
views are automatically updated and kept
current as new data arrive. In the future, we
can expect more sophisticated summarizing
and surveillance strategies, such as automated
detection of adverse events (Bates
et al. 2003) or of time-series events
(e.g., cancer chemotherapy cycles). We may
also see patient data views that distinguish
abnormal changes that have been explained,
or treated, from those that have not, and displays
that dynamically organize the supporting
evidence for existing problems (Tang and
Patel 1994; Tang et al. 1994a; Buchanan
2017). Ultimately, computers should be able
to produce concise and flowing summary
reports that are like an experienced physician’s
handcrafted hospital discharge summary.

Researchers are developing increasingly
sophisticated summaries. The HARVEST system
(Hirsch et al. 2015), for example, processes
all of a patient’s notes, extracts unique
concepts, ranks them in importance for the
patient, and displays them in a word cloud.
Clicking a word reveals the notes that support
the concept, and clicking an individual note
reveals the relevant snippet(s) of text. The
user can pick a subset of the patient’s timeline.
The system reveals information that the
user may not know to ask for and has been
found to be especially useful for emergency
department doctors and quality assurance
nurses.

22 https://ucum.nlm.nih.gov/ucum-lhc/demo.html
(Accessed 6/4/2020).

Fig. 14.7 In this example of a Problem Oriented
View, the display shows medications and lab results relevant
to the problem of acute systolic heart failure. If a user
chose a different problem in the panel on the left, it
would display content relevant to that problem. (© 2019
Epic Systems Corporation. Used with permission)
Fig. 14.8 LHC Flowsheet on FHIR app presenting
the content of a large (more than 15,000 observations)
de-identified medical record. The dates and values
within the dataset have been slightly shifted from their
original values and salted with additional data using
random methods. In this display, the user may choose to
collapse columns to one value per quarter to facilitate
interpretation. Out-of-range results are flagged as
high (red up arrow) or low (blue down arrow) using HL7
interpretation codes and normal value ranges. A sparkline
graphing all of the values in a row appears
to the right of each observation name. In a, values for
each observation have separate rows, whereas in b all of
the values within one equivalence class are folded into a
single row after converting all values with any molar or
mass unit into a common, but configurable, mass unit


Dynamic On-Demand Data Views


Anyone who has reviewed a patient’s chart
knows how hard it can be to find a particular
piece of information. From 10% (Fries 1974)
to 81% (Tang et al. 1994) of the time, when
looking in the paper record, physicians did not
find important patient information that was
present in that record. Furthermore, the
questions clinicians routinely ask are often the ones
that are difficult to answer from perusal of a
typical medical record. Common questions
include whether a specific test has ever been
performed, what kinds of medications have
been tried, and how the patient has responded
to particular treatments (e.g., a class of
medications) in the past. Physicians constantly ask
these questions as they flip back and forth in
the chart searching for the facts needed to
support or refute a given hypothesis as their
thinking about the patient evolves. On-demand
search tools help clinicians locate and then
organize relevant patient data into flowsheets,
problem-oriented displays (see section
“Timeline Flowsheets”), or graphs (Fafchamps
et al. 1991; Tang et al. 1994a; Starren and
Johnson 2000) to facilitate provider
assimilation of the relevant facts.

Fig. 14.9 Summary record. The patient’s active
medical problems, current medications, and drug
allergies are among the core data that physicians must keep
in mind when making any decision on patient care. This
one-page screen provides an instant display of core
clinical data elements as well as reminders about required
preventive care. (Source: Courtesy of Epic Systems,
Madison, WI)


14.3.2 Computerized Provider
Order Entry


One of the most important components of an
EHR is computerized provider order entry


(CPOE), where clinicians make and act upon 
therapeutic and diagnostic decisions by enter- 
ing orders. CPOE systems can reduce errors 
and costs compared to paper systems, in 
which orders are transcribed manually from 
one paper form (e.g., the orders section of the 
paper chart) to another (e.g., the nurse’s work 
list, a laboratory request form), or faxed to a 
receiving area for fulfillment (e.g., transporta- 
tion services, pharmacy). CPOE orders pass 
electronically from the decision-maker to the 
order filler with minimal to no manual labor. 
Order entry systems also provide opportuni- 
ties to deliver CDS when providers are enter- 
ing orders and making clinical decisions. Most 
existing CPOEs provide alerts about drug 
interactions, allergies, and dosing adjustments 
for renal insufficiency when new drug orders 
are entered. However, EHR implementers 
should be selective about which alerts to evoke 
and be parsimonious with the use of
interruptive alerts to avoid wasting provider time on
trivial or low-likelihood outcomes (Miller et al.
2005a; Phansalkar et al. 2012a, b). We discuss
this capability more fully in the next
section.
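
The order-time checks described above can be sketched
in a few lines. This is a minimal illustration, not any
vendor's implementation: the knowledge tables, the
renal threshold, and the names `Patient` and
`check_order` are all assumptions made for the example.
It also shows one way to keep interruptive alerts for
allergies and interactions while presenting
lower-stakes dosing advice passively.

```python
from dataclasses import dataclass, field

# Hypothetical knowledge tables; a production system would use a maintained
# drug knowledge base rather than hard-coded dictionaries.
DRUG_CLASS = {"amoxicillin": "penicillins", "captopril": "ace inhibitors"}
INTERACTING_PAIRS = {frozenset({"ciprofloxacin", "warfarin"})}
RENAL_REVIEW_BELOW_CRCL = {"cefepime": 60.0}  # mL/min, illustrative value only


@dataclass
class Patient:
    allergies: set = field(default_factory=set)      # allergen classes
    active_drugs: set = field(default_factory=set)   # generic drug names
    creatinine_clearance: float = 100.0              # mL/min


def check_order(patient, drug):
    """Return (severity, message) alerts for a newly entered drug order."""
    alerts = []
    drug_class = DRUG_CLASS.get(drug)
    if drug_class is not None and drug_class in patient.allergies:
        alerts.append(("interruptive", f"Allergy to {drug_class}: {drug}"))
    for current in sorted(patient.active_drugs):
        if frozenset({drug, current}) in INTERACTING_PAIRS:
            alerts.append(("interruptive", f"Interaction: {drug} with {current}"))
    threshold = RENAL_REVIEW_BELOW_CRCL.get(drug)
    if threshold is not None and patient.creatinine_clearance < threshold:
        # Lower-stakes advice is shown passively rather than blocking the order.
        alerts.append(("passive", f"Review {drug} dose: CrCl below {threshold:g} mL/min"))
    return alerts
```

An order that triggers no rule returns an empty list,
so the provider is not interrupted at all.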
Fig. 14.10 Neonatal Intensive Care Unit (NICU)
Total Parenteral Nutrition (TPN) Advisor provides
complex interactive advice and performs various
calculations in response to the provider’s
prescribed goal for amount of fluid, calories,
nutrition, and special additives. (Source:
Miller et al. (2005b). Elsevier Reprint License No.
2800411402464)


Order entry systems can also remind
providers about important orders that might
otherwise be forgotten. Well-designed order
entry systems can shrink the work of entering
complicated orders such as declining doses of
prednisone, intravenous fluid orders, and
total parenteral nutrition (TPN) orders, the
last of which require entry of many additives
and calculations to avoid dangerous mixtures
and to reach specified targets for calories and
each additive. Figure 14.10 shows an example
of a TPN order entry screen from Vanderbilt
(Miller et al. 2005b). However, with some
order entry systems, entry of intravenous and
declining-dose orders can be more difficult
than with the manual alternative.

When a CPOE system is operational, simply
changing the default drug or dosing based
on the latest scientific evidence can shift
ordering behavior toward the optimum standard of
care, with benefits to quality and costs.
Because of these advantages, health care
organizations have adopted CPOE widely, and
federal regulations for certified health IT
require core CPOE functionality for
medications, laboratories, diagnostic imaging, and
EHRs, which include checking for drug-drug
and drug-allergy interactions,²³ though some
question the wisdom of those interaction
requirements (Khajouei and Jaspers 2010)
and, in one study, the use of these checks with
alerts had no effect on the rate of adverse drug
reactions (Nebeker et al. 2005).


23 Certification of Health IT, Testing Process & Test
Methods, 2015 Edition Test Method. 2015 Edition
Certification Regulations, 170.315(a) (1, 2, 3, 4):
Computerized Provider Order Entry and Drug-drug
and Drug-allergy checks for CPOE.
https://www.healthit.gov/topic/certification-ehrs/2015-edition-test-method
(Accessed 6/4/2020).
14.3.3 Clinical Decision Support


Clinical trials have shown that certain
reminders from CDS can improve care processes
(McDonald 1976; Haynes 2011; Damiani
et al. 2010; Schedlbauer et al. 2009; Ranta
et al. 2015; Clyne et al. 2012; Tajmir et al.
2017), but the evidence for the efficacy of CDS
more broadly is mixed (Delvaux et al. 2017;
Parshuram et al. 2018; Muth et al. 2018; Fried
et al. 2017). EHRs can deliver CDS in batch
mode at intervals across a whole practice
population to identify patients who are not
reaching treatment targets, are past due for
immunizations or cancer screening, or have
missed recent appointments, to cite a few
examples. In batch mode, clinical practices can
use lists of patients generated by CDS to
contact each patient and encourage him or her
to reach a goal or to schedule an appointment
for the delivery of suggested care, and can
reach patients who have not kept scheduled
appointments.
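
As a rough sketch of the batch mode just described, the
following function scans a practice registry and
returns the patients who are past due for a given item.
The record layout, the item name, and the function
`patients_overdue` are assumptions made for
illustration; the registry entries are invented.

```python
from datetime import date, timedelta


def patients_overdue(patients, item, max_interval_days, today):
    """Return ids of patients whose last `item` is missing or too old."""
    cutoff = today - timedelta(days=max_interval_days)
    overdue = []
    for patient in patients:
        last_done = patient["last_done"].get(item)
        if last_done is None or last_done < cutoff:
            overdue.append(patient["id"])
    return overdue


# Illustrative registry with made-up patients and dates.
registry = [
    {"id": "A", "last_done": {"influenza vaccine": date(2019, 10, 1)}},
    {"id": "B", "last_done": {"influenza vaccine": date(2017, 10, 1)}},
    {"id": "C", "last_done": {}},
]
call_list = patients_overdue(registry, "influenza vaccine", 365, date(2020, 6, 4))
```

The resulting list can feed a call list or a letter
generator; the same function can be rerun at whatever
interval the practice chooses.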

Decision support—especially for preven- 
tive care—is most efficiently delivered in the 
course of routine care while the patient and 
provider are together. Suggestions can be 
delivered during the physician order entry 
process, which in some cases can be the best 
point in the workflow at which to discourage 
or countermand an order that might be dan- 
gerous or wasteful. It is also a convenient 
point to offer reminders about needed tests or 
treatments, which can easily be initiated dur- 
ing that order session. 

One of the best ways for CDS systems to 
remind providers about tests or treatments is 
by presenting pre-constructed order(s) to the 
provider who can confirm or reject the order(s) 
with a single keystroke or mouse click. It is best
to annotate such suggestions with their
rationale (e.g., “the patient is due for his pneumonia
vaccine because he has emphysema and is over
65”) so that the provider understands why the
suggestion was made (Mamlin et al. 2007).
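
A pre-constructed order with its rationale might be
represented as follows. This is only a sketch built
around the example in the text: the rule, the field
names, and the function `suggest_pneumococcal_vaccine`
are hypothetical and are not clinical guidance.

```python
def suggest_pneumococcal_vaccine(patient):
    """Return a pre-built order with its rationale, or None if not indicated."""
    if patient.get("pneumococcal_vaccine_given"):
        return None
    reasons = []
    if "emphysema" in patient.get("problems", ()):
        reasons.append("has emphysema")
    if patient.get("age", 0) > 65:
        reasons.append("is over 65")
    if not reasons:
        return None
    return {
        "order": "pneumococcal vaccine",
        # The rationale is assembled from the criteria that actually fired.
        "rationale": "Patient is due for pneumococcal vaccine because he or she "
                     + " and ".join(reasons) + ".",
        "actions": ["accept", "reject"],  # one click or keystroke either way
    }
```

Because the rationale is built from the same conditions
that triggered the suggestion, the explanation shown to
the provider cannot drift out of step with the rule.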

Figure 14.11 shows some suggestions
from a sophisticated inpatient CDS system
developed by Intermountain Health Care.
This system used a wide range of clinical
information to recommend antibiotic choice,
dose, and duration of treatment.
Recommendations from the system improved clinical
outcomes and reduced costs of infections among
patients managed with the assistance of this
system (Evans et al. 1998; Pestotnik 2005).
Vanderbilt’s inpatient “WizOrder” CPOE
system also addressed antibiotic orders, as
shown in Fig. 14.12; it suggests the use of
cefepime rather than ceftazidime, and provides
choices of dosing by indication.

Clinical alerts attached to a laboratory test
result can include suggestions for appropriate
follow-up or treatments for some
abnormalities (Ozdas et al. 2008; Rosenbloom et al.
2005). Also, CPOE functionality can warn the
physician about allergies (Fig. 14.13a) and
drug interactions (Fig. 14.13b) before the
provider completes a medication order, as
exemplified by screenshots from Partners’
outpatient medical record.

Reminders and alerts are employed widely
in outpatient care. Indeed, the outpatient
setting is where the first study of clinical
reminders, which was also the first randomized trial
of medical informatics systems, was performed
(McDonald 1976), and it remains the setting for
the majority of such studies (Garg et al. 2005).
Reminders to physicians in outpatient settings
quadrupled the use of recommended vaccines
in eligible patients compared with patients
whose physicians did not receive reminders
(McDonald et al. 2014; McPhee et al. 1991;
Hunt et al. 1998; Teich et al. 2000). Reminder
systems can also suggest needed tests and
treatments for eligible patients (Overhage
1997). Figure 14.14 shows an Epic system
screen with reminders to consider ordering an
echocardiogram and starting an ACE
inhibitor in an outpatient with a diagnosis of heart
failure but no record of an echocardiogram or
of treatment with one of the most
beneficial drugs for heart failure.

Though the outpatient setting is the pri- 
mary setting for preventive care reminders, 
preventive reminders have also been applied 
effectively in the hospital setting (Dexter et al. 
2001). Furthermore, reminders directed to 
inpatient nurses improve preventive care even 
more than reminders directed to physicians 
(Dexter et al. 2004). 
Fig. 14.11 Example of the main screen (a) from the
Intermountain Health Care Antibiotic Assistant
program. The program displays evidence of an
infection, relevant patient data (e.g., kidney function,
temperature), recommendations for antibiotics based on
the culture results, and (b) disclaimers. (Source:
Courtesy of R. Scott Evans, Robert A. Larsen, Stanley L.
Pestotnik, David C. Classen, Reed M. Gardner, and John
P. Burke, LDS Hospital, Salt Lake City, UT (Larsen
et al. 1989) © Cambridge University Press)


Fig. 14.12 User ordered an antibiotic for which
Vanderbilt’s former inpatient “WizOrder” CPOE
system, based on their Pharmaceuticals and
Therapeutics (P and T) Committee input, recommended
a substitution. This educational advisor guided the
clinician through ordering an alternative antibiotic.
Links to “package inserts” (via buttons) detailed how
to prescribe the recommended drug under various
circumstances. (Source: Miller et al. (2005b). Elsevier
Reprint License No. 2800411402464)


Fig. 14.13 Drug-alert display screens from Partners’
former outpatient medical record application
(Longitudinal Medical Record, LMR). The screens show
(a) a drug-allergy alert for captopril, and (b) a
drug-drug interaction between ciprofloxacin and
warfarin. (Source: Courtesy of Partners Health Care
System, Chestnut Hill, MA)


Fig. 14.14 Example of CDS alerts to order an
echocardiogram and to start an ACE inhibitor in a
patient with diagnosed congestive heart failure.
(Source: Courtesy of Epic Systems, Madison, WI)


14.3.4 Access to Knowledge
Resources


Many clinical questions, whether addressed
to a colleague or answered by searching
through textbooks and published papers, are
asked in the context of a specific patient
(Covell et al. 1985). Thus, one appropriate
time to offer knowledge resources to
clinicians is while they are writing notes or
entering orders for a specific patient. Clinicians
typically have access to a selection of
knowledge sources, which today can be accessed
from a web browser at any time. Some
are from public sources, such as the National
Library of Medicine’s (NLM) PubMed and
MedlinePlus, Centers for Disease Control
and Prevention’s (CDC) vaccines and
international travel information, and Agency for
Healthcare Research and Quality’s (AHRQ)
National Guideline Clearinghouse. Others
come from commercial vendors like
UpToDate, Micromedex, and a variety of
electronic textbooks. Some EHRs are
proactive and routinely present short informational
nuggets adjacent to the order item that the
clinician has chosen. Through an Infobutton
designed to pull context-specific information,
EHRs can also retrieve literature, textbook
content, or other sources of information
relevant to a particular clinical situation, and
present that information to the clinician on
the fly (Del Fiol et al. 2012). The Infobutton
standard is now a core CDS functionality
within certified Health IT requirements²⁴
(see Fig. 14.15). To support this function,
HL7 has produced the Version 3 standard
Context Aware Knowledge Retrieval
Application (“Infobutton” standard).²⁵ HL7 FHIR
also has powerful CDS capabilities; namely,
support for CDS Hooks and two scripting
languages, FHIRPath²⁶ and CQL,²⁷ for algebraic
and logical calculations (see Chap. 8).


Fig. 14.15 This figure shows the use of Columbia
University Medical Center’s Infobuttons during results
review. Clicking on the Infobutton adjacent to the Iron
result generates a window (image) with a menu of
questions. When the user clicks on one of the questions,
the Infobutton delivers the answers. (Source: Courtesy
of Columbia University Medical Center, New York)


24 Certification of Health IT, Testing Process & Test
Methods, 2015 Edition Test Method, Clinical Decision
Support. 2015 Edition Certification Regulations,
170.315(a)(9): Clinical Decision Support.
https://www.healthit.gov/test-method/clinical-decision-support-cds#ccg
(Accessed 6/4/2020).

25 HL7 Version 3 Standard: Context Aware Knowledge
Retrieval Application (“Infobutton”), Knowledge
Request, Release 2.
http://www.hl7.org/implement/standards/product_brief.cfm?product_id=208
(Accessed 6/4/2020).

26 FHIRPath STU1 Release. [Internet]. 2019 [cited
01/29/2019]. Available from: http://hl7.org/fhirpath/
(Accessed 6/4/2020).

27 Health Level 7. Clinical quality language (CQL)
standard. [Internet]. 2018 [cited 01/29/2019].
Available from:
http://www.hl7.org/implement/standards/product_brief.cfm?product_id=400
(Accessed 6/4/2020).


One idea increasingly of interest is to
provide to clinicians at the point of care
optimized treatment plans or customized
information, derived from patients “just like”
the one in front of the clinician. Conceptually,
this “green button” (Longhurst et al. 2014),
like other CDS, would be accessible to
clinicians at the point of care and provide
aggregate patient data (e.g., outcomes of a particular
medication for disease treatment according to
similar patient characteristics) to help support
treatment decisions in the absence of
high-quality evidence.
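
To make the Infobutton mechanism described earlier in
this section more concrete, the sketch below assembles
a URL-based knowledge request for a laboratory result
under review. The parameter names follow the
URL-based form of the HL7 Infobutton standard as we
understand it; the endpoint is hypothetical, and the
LOINC code and task code are illustrative assumptions
that should be checked against the current
implementation guide before use.

```python
from urllib.parse import urlencode

LOINC_OID = "2.16.840.1.113883.6.1"  # HL7 OID that identifies LOINC


def build_infobutton_url(base_url, code, code_system_oid, display_name, task_code):
    """Build a URL-based knowledge request for one coded concept."""
    params = {
        "mainSearchCriteria.v.c": code,              # concept code
        "mainSearchCriteria.v.cs": code_system_oid,  # code system of the concept
        "mainSearchCriteria.v.dn": display_name,     # human-readable name
        "taskContext.c.c": task_code,                # what the user is doing in the EHR
    }
    return base_url + "?" + urlencode(params)


url = build_infobutton_url(
    "https://knowledge.example.org/infobutton",  # hypothetical endpoint
    "2498-4",      # assumed LOINC code for a serum iron result
    LOINC_OID,
    "Iron",
    "LABRREV",     # assumed task code for laboratory results review
)
```

The EHR supplies the context (the concept and the task);
the knowledge resource decides which content best
answers it, which is what lets one button serve many
different resources.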


14.3.5 Care Team and Patient 
Communication 


Communication tools that support timely
and efficient communication between patients
and the health care team, and among team
members, can enhance coordination of care
and disease management. Patients are
provided secure online access to their EHR and
integrated communication tools to ask
medical questions or conveniently perform other
clinical (e.g., renew a prescription) or
administrative tasks (e.g., schedule an appointment)
(Tang 2003). Increasingly, the delivery of
optimal patient care also requires multiple
health care professionals who may span
several organizations; thus, it is important that
communication among team members and
organizations is delivered effectively,
efficiently, and on time. Such communications
usually focus on a single patient and may
require a care provider to assess or
interchange information from several systems and
providers in order to coordinate relevant care.

Direct connectivity to patients is and will
be increasingly important to patient-provider 
communication. It will permit direct-to- 
patient reminders (Sherifali et al. 2011) and 
deliver home health monitoring data (such as 
home blood pressure measurements and glu- 
cose testing results) to the EHR and other 
information systems (Earle 2011; Green et al. 
2008). The patient’s personal health record 
(PHR) will also become an important desti- 
nation for clinical messages and test results 
(see Chap. 13). Relevant information can
be “pushed” to the patient or their PHR via
e-mail, pager services, or other secure texting
or closed-loop communication (Major et al.
2002; Poon et al. 2002; Gulacti and Lok 2017;
Rief et al. 2017; Przybylo et al. 2014) or
“pulled” by users on demand during their
routine interactions with the computer.

EHRs can also provide electronic
functionality to assist in the transfer (or hand-off)
of care responsibility from one clinician to 
another. When the transfer is between settings 
(e.g., from hospital to nursing home), the 
sending clinician usually provides a brief ver- 
bal or written turnover note to the receiving 
clinician(s) summarizing the patient’s prob- 
lems, treatments, and other relevant clinical 
issues. Figure 14.16 shows an example of a
“turn-over report” that includes instructions
from the “sending” physician, as well as
relevant recent laboratory test results and other
data pulled from the patient’s EHR, and a
“to-do” list that ensures that critical tasks are
completed (Stein et al. 2010). Such reports
facilitate communication among team
members and can improve both coordination and
patient safety. However, keeping the contents
of such reports up to date and accurate can be
a challenge (Arsoniadis et al. 2017).

Although most patient encounters are
defined by scheduled face-to-face visits (e.g., 
outpatient visit, home health visit), provider 
decision-making also occurs during non-face- 
to-face and nonscheduled events (e.g., patient 
telephone calls, prescription renewal requests, 
and the arrival of new test results). Recently,
CMS has indicated that it will pay for another
kind of non-face-to-face event, namely virtual
visits.²⁸ EHRs and other Health IT will
support and facilitate these non-face-to-face
events, and video-visit capabilities will become
part of many EHR vendor system offerings.

EHRs are traditionally bounded by the
institution in which they reside. The National
Health Information Infrastructure (NHII)
(NCVHS 2001) has proposed a future in
which providers caring for a patient could
reach beyond their local institution to
automatically obtain patient information
from all relevant sources (see Chap. 13).
Today, examples of such regional “EHRs,”
often referred to as Health Information
Exchanges (HIEs), serve routine and
emergency care, public health, and other functions.
The first HIE was the Indiana Health
Information Exchange (IHIE) (McDonald et al.
2005), which started in 1994 with three
Indianapolis hospitals and now includes hospitals
from most of Indiana. Other early HIEs
include those in Ontario, Canada (electronic
Child Health Network),²⁹ Kentucky (Kentucky
Health Information Exchange),³⁰ and Memphis
(Frisse et al. 2008). Today, scores of HIEs are
in operation
(https://strategichie.com/membership/member-list/).
A study from the Memphis HIE showed that the
extra patient information provided by this HIE
reduces resource use and costs (Frisse et al.
2011). According to a December 2018 ONC
report, half of all US hospitals now share some
data through one or more HIEs.³¹


Fig. 14.16 Patient handoff report: a
user-customizable hard copy report with automatic
inclusion of patient allergies, active medications,
24-h vital signs, recent common laboratory test
results, isolation requirements, code status, and
other EHR data. This system was developed by a
customer within a vendor EHR product (Sunrise
Clinical Manager, Allscripts, Chicago, IL) and was
disseminated among other customers around the
nation. (Source: Courtesy of Columbia University
Medical Center, New York)


28 https://www.cms.gov/outreach-and-education/medicare-learning-network-mln/mlnproducts/downloads/telehealthsrvcsfctsht.pdf
(Accessed 6/4/2020).

29 eCHN electronic Child Health Network.
http://www.echn.ca/ (Accessed 6/4/2020).

30 Kentucky Health Information Exchange Frequently
Asked Questions. http://khie.ky.gov/Pages/faq.aspx?fc=010
(Accessed 6/4/2020).

31 https://www.ruralcenter.org/resource-library/methods-used-to-enable-interoperability-among-us-non-federal-acute-care-hospitals
(Accessed 6/4/2020).

32 eHealth Exchange. https://ehealthexchange.org/
(Accessed 6/4/2020).

33 https://www.healthdatamanagement.com/news/connect-nhin-direct-what-are-they
(Accessed 6/4/2020).

34 http://wiki.directproject.org/Main_Page
(Accessed 6/4/2020).


In 2010, the Office of the National
Coordinator (ONC) proposed the Nationwide
Health Information Network (NWHIN) to
connect regional HIEs and promote health
data exchange (see Chaps. 13 and 31).³² It
included NHIN Connect and NHIN Direct,
the former with a special focus on large
enterprises and government organizations, the
latter with a focus on simpler or local
networks. The Direct protocol³⁴ is an email
(SMTP)-based protocol designed to deliver
encrypted messages and attachments securely
among pre-arranged groups of individuals or
organizations. ONC has nurtured the
development of the Direct Protocol and required
EHR vendors to support it, and they have, but
minimally in some cases.
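
Because Direct rides on ordinary email, the message
itself can be pictured with standard mail tooling. The
sketch below only assembles a message with a clinical
summary attached; the addresses, file name, and function
name are hypothetical. It deliberately does not show
the signing, encryption, trust verification, or SMTP
delivery that a real Direct exchange depends on, so it
should be read as an illustration of the message shape
only.

```python
from email.message import EmailMessage


def build_direct_style_message(sender, recipient, summary_bytes, subject):
    """Assemble an email carrying a clinical summary as an attachment.

    Signing and encryption (required for a real Direct exchange) are
    intentionally omitted from this sketch.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content("A clinical summary is attached.")
    msg.add_attachment(
        summary_bytes,
        maintype="application",
        subtype="xml",
        filename="summary.xml",  # hypothetical file name
    )
    return msg


message = build_direct_style_message(
    "sender@direct.hospital-a.example",     # hypothetical Direct addresses
    "receiver@direct.clinic-b.example",
    b"<ClinicalDocument/>",
    "Transition of care summary",
)
```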

At least seven organizations provide
general tools and trust policies to create new
secure health care networks or connect
existing ones. Some of them are direct descendants
of NHIN Connect. These seven are described
and compared in an excellent 2017 ONC
report.³⁵ Among the seven, DirectTrust and the
National Association for Trusted Exchange
(NATE) use the Direct Protocol exclusively,
and Surescripts and eHealth Exchange use it
optionally. The other three use a mix of
non-Direct protocols. Some of these organizations
are discussing partnerships or mergers.
Unfortunately, support for delivering structured
data is limited in many of these systems, and
EHR vendors have not chosen a common
approach.

With the goal of one national network of 
networks, ONC has proposed an overarching 
set of policies and protocols, called the
“Trusted Exchange Framework and Common
Agreement” (TEFCA).³⁶ It has been met with
both praise and criticism. FHIR offers
additional mechanisms for linking independent
networks and is highlighted in the TEFCA 
proposal. To see how this all evolves, stay 
tuned. 


14.3.6 Billing and Coding 


Billing and coding systems were originally
most often separate from the main EHR,
which served as a clinical system of record,
but this has changed over time. Today, the
majority of vendor EHRs have billing and
coding functionality, as well as other aspects
of the revenue cycle (e.g., prior authorization,
accounts receivable). Tying the various aspects
of revenue cycle functionality into EHRs has
yielded several efficiencies for health care
organizations. For instance, the workflow for

35 https://www.healthit.gov/sites/default/files/analysis_of_existing_trust_arrangements_printable.pdf
(Accessed 6/4/2020).

36 https://www.healthit.gov/sites/default/files/page/2019-04/FINALTEFCAQTF41719508version.pdf
(Accessed 6/4/2020).
coders is more streamlined. The same system
can be used to review records, code, and
provide feedback to clinicians on documentation,
as well as allow clinicians to drop charges
directly into the EHR. In some cases, EHR
coding and billing can also be augmented
with third-party systems, such as those
leveraging computer-assisted coding with clinical
NLP technologies (see Chap. 9).


14.4 EHRs for Secondary 
and Population-Based Uses 


This section considers secondary uses of 
EHRs, which are increasing greatly. Some of 
this functionality can be done by or in concert 
with other systems and platforms. We expect 
that EHRs will continue to evolve and trans- 
form as information systems over time with 
greater analysis and secondary use functions. 
Medical personnel, quality and patient safety 
professionals, and administrators use these 
capabilities to find particular patterns and 
events that predict patient outcomes. Public 
health professionals can use reporting func- 
tions of computer-stored records for surveil- 
lance, including looking for emergence of new 
diseases or other health threats that warrant 
medical attention. 


14.4.1 Population-Based
Clinical Care


Although the functions of CDS for a single
patient on the one hand and across a larger
patient population on the other are different,
their internal logic is similar. In both, the
central procedure is to determine whether the
patient at hand, or which patients across the
full set or a subset, satisfy pre-specified criteria
and to act appropriately when a patient
meets those criteria. Surveillance queries
generally address a large subset, or all, of a
patient population; the output is often a
tabular report of selected raw data on all the
patient records retrieved or a statistical
summary of the values contained in the records.
Decision support systems usually address
patients who are under active care and
generate an alert or reminder message
(McDonald 1976) about that patient.
Organizations can use these systems at a
population level for care coordination, patient
empanelment for primary care providers, and
other tasks.

For example, a cross-population query can 
be used to identify patients who are due for 
periodic screening examinations such as immu- 
nizations, mammograms, and cervical Pap tests 
and then can generate letters to patients or call 
lists for office staff to encourage the preventive 
care. This can also be especially useful for con- 
ducting ad hoc searches such as those required 
to identify and notify patients who have been 
receiving a recalled drug. Such systems can also 
facilitate quality management and patient 
safety activities, identify candidate patients for 
concurrent review and gather many of the data 
required to complete such audits. 
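
The point that single-patient CDS and cross-population
queries share their internal logic can be shown with one
criterion used two ways. In this sketch the record
layout, the drug name, and all function names are
invented for illustration.

```python
def is_on_drug(patient, drug):
    """Shared criterion: is the drug on the patient's active medication list?"""
    return drug in patient["active_medications"]


def point_of_care_alert(patient, recalled_drug):
    """Single-patient use: return an alert message, or None if not applicable."""
    if is_on_drug(patient, recalled_drug):
        return f"{patient['name']} is taking recalled drug {recalled_drug}"
    return None


def recall_report(patients, recalled_drug):
    """Population use: list every patient who satisfies the same criterion."""
    return [p["name"] for p in patients if is_on_drug(p, recalled_drug)]


# Made-up panel; "drug-x" stands in for a recalled product.
panel = [
    {"name": "Patient 1", "active_medications": {"drug-x", "metformin"}},
    {"name": "Patient 2", "active_medications": {"simvastatin"}},
    {"name": "Patient 3", "active_medications": {"drug-x"}},
]
to_notify = recall_report(panel, "drug-x")
```

Only the action differs: the first function produces an
alert during an encounter, while the second produces a
list that office staff can use for letters or calls.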


14.4.2 Clinical Research 


Researchers can use EHRs, particularly
their cross-patient query and alerting
capabilities, to identify patients who meet or have a
high chance of meeting eligibility
requirements for a prospective clinical trial. For
example, an investigator could identify all 
patients seen in a medical clinic who have a 
particular diagnosis and satisfy the eligibility 
requirements specified in a given study proto- 
col (Kho et al. 2007). These approaches can 
sometimes be applied in real time. At one 
institution, the physician’s workstation was 
programmed to ask permission to invite the 
patient into a study, when that physician 
entered a problem that suggested the patient 
might be a candidate for a local clinical trial. 
If the physician gave permission, an auto- 
mated electronic page could then be triggered 
and sent to the nurse recruiter who would 
then invite the patient to participate in the 
study. One early such study was for patients 
with back pain (Damush et al. 2002). 
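
The real-time recruitment workflow described above
(problem entered, physician asked for permission,
recruiter paged) can be sketched as a small event
handler. The trial table, the callback names, and the
return strings are assumptions for illustration and do
not describe the system used in the cited study.

```python
# Hypothetical mapping from an entered problem to an open local trial.
OPEN_TRIALS = {"low back pain": "Back Pain Study"}


def on_problem_entered(problem, ask_physician, page_recruiter):
    """Run when a physician records a problem at the workstation.

    ask_physician(trial) -> bool   asks permission to invite the patient
    page_recruiter(trial)          notifies the nurse recruiter
    """
    trial = OPEN_TRIALS.get(problem.strip().lower())
    if trial is None:
        return "no matching trial"
    if not ask_physician(trial):
        return "physician declined"
    page_recruiter(trial)
    return "recruiter paged"


pages = []
outcome = on_problem_entered("Low back pain", lambda trial: True, pages.append)
```

Keeping the physician's permission as an explicit step
mirrors the workflow in the text: the recruiter is never
contacted unless the treating physician agrees.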
Randomized prospective studies are the 
gold standard for clinical investigations, but 


retrospective studies of existing data have con- 
tributed much to medical progress (see 
> Chap. 29). Retrospective studies can also 
obtain answers at a small fraction of the time 
and cost of comparable prospective studies. 
EHRs can often provide much of the data 
required for a retrospective study. They can, 
for example, identify study cases and compa- 
rable control cases, and provide data needed 
for statistical analysis of the comparison cases 
(Brownstein et al. 2007). Combined with 
access to discarded specimens, they also offer
powerful approaches to retrospective genome
association studies that researchers can do
much faster and at a fraction of the cost of
comparable prospective studies (Kohane 2011;
Roden et al. 2008).
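Identifying study cases and comparable control cases, as described above, is often done by matching on a few attributes. The sketch below is one simple possibility under stated assumptions, not the approach of the cited papers: it performs greedy 1:1 matching on sex and age, and the record fields (`id`, `sex`, `age`) and the five-year age window are hypothetical.

```python
def match_controls(cases, pool, max_age_gap=5):
    """Greedy 1:1 matching on sex and age; each control is used at most once."""
    available = list(pool)
    matches = {}
    for case in cases:
        eligible = [
            c for c in available
            if c["sex"] == case["sex"] and abs(c["age"] - case["age"]) <= max_age_gap
        ]
        if not eligible:
            continue  # unmatched cases are left out of the result
        # Prefer the control whose age is closest to the case's age.
        best = min(eligible, key=lambda c: abs(c["age"] - case["age"]))
        matches[case["id"]] = best["id"]
        available.remove(best)
    return matches


cases = [
    {"id": "case1", "sex": "F", "age": 60},
    {"id": "case2", "sex": "M", "age": 45},
]
pool = [
    {"id": "ctl1", "sex": "F", "age": 58},
    {"id": "ctl2", "sex": "F", "age": 61},
    {"id": "ctl3", "sex": "M", "age": 70},
]
matched = match_controls(cases, pool)
print(matched)
```

Greedy matching is easy to follow but depends on the order of the cases; a formal study would normally use an established matching procedure.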

Computer-stored records do not eliminate 
all the work required to complete an epide- 
miologic study; chart reviews and patient 
interviews may still be necessary. Computer-stored
records are likely to be most complete
and accurate with respect to visit diagnoses
that are carefully coded for administrative
purposes, as well as to prescribed drugs and
laboratory tests, because the latter two usually
come directly from automated pharmacy and
laboratory systems, respectively. Consequently,
computer-stored records are likely to contrib- 
ute to research on a physician’s practice pat- 
terns, on the efficacy of tests and treatments, 
and on the toxicity of drugs. The research
opportunities will only improve with the FHIR
API and coding standards required by proposed
CMS and ONC rules. NIH and AHRQ
have both encouraged research interest in
FHIR.37,38 Also, improvements in NLP techniques
may make the content of narrative text
more accessible to automatic searches (see 
> Chap. 9). 


37 » https://grants.nih.gov/grants/guide/notice-files/NOT-HS-19-020.html (Accessed 6/4/2020).

38 » https://grants.nih.gov/grants/guide/notice-files/NOT-OD-19-122.html (Accessed 6/4/2020).
14.4.3 Quality Reporting


EHRs are also increasingly important in the 
production or automation of quality
reports that are used for both internal quality 
improvement activities and for external regu- 
latory or public reporting. Although it is dif- 
ficult for paper-based records to incorporate 
patient-generated input and it requires careful 
tagging of data sources, EHRs can increas- 
ingly include data contributed by patients 
(e.g., patient-reported outcomes such as
functional status, pain scores, and symptoms;
review of systems). Patient-reported data are
also being incorporated in future quality
measures, particularly for disease-specific or
episode-specific conditions, e.g., use of the
Seattle Angina Questionnaire Short Form
(SAQ-7) and Rose Dyspnea Scale (RDS)
following Non-Emergent Percutaneous
Coronary Intervention (PCI).39

With changing reimbursement payment 
models focusing more on outcomes measures 
instead of volume of transactions, generating 
efficient and timely reports of clinical quality 
measures will play an increasingly important 
role in management and payment. FHIR will 
likely play a role here, as well. FHIR supports 
the Clinical Quality Language (CQL)40 and
FHIRPath,41 a subset of CQL. CQL was
developed for CMS quality reporting but has wider
applicability to other initiatives. 
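Many clinical quality measures reduce to a proportion: a denominator population and the subset of it that meets a numerator criterion. The following is a minimal sketch in plain Python rather than CQL; the patient fields (`has_diabetes`, `latest_hba1c`) and the 8.0 threshold are hypothetical and are not taken from any published measure specification.

```python
def proportion_measure(patients, in_denominator, in_numerator):
    """Return (numerator count, denominator count, rate) for a proportion measure."""
    denominator = [p for p in patients if in_denominator(p)]
    numerator = [p for p in denominator if in_numerator(p)]
    # The rate is undefined when no patient qualifies for the denominator.
    rate = len(numerator) / len(denominator) if denominator else None
    return len(numerator), len(denominator), rate


patients = [
    {"id": 1, "has_diabetes": True, "latest_hba1c": 7.2},
    {"id": 2, "has_diabetes": True, "latest_hba1c": 9.1},
    {"id": 3, "has_diabetes": False, "latest_hba1c": None},
    {"id": 4, "has_diabetes": True, "latest_hba1c": None},
]

result = proportion_measure(
    patients,
    in_denominator=lambda p: p["has_diabetes"],
    in_numerator=lambda p: p["latest_hba1c"] is not None and p["latest_hba1c"] < 8.0,
)
print(result)
```

Note that the patient with no recorded result counts against the rate here; how missing data are handled is a design decision that a real measure specification would state explicitly.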


14.4.4 Administration 


In the past, administrators had to rely on data 
from billing systems to understand practice 
patterns and resource utilization. However, 
claims data have their limits, including their 
delayed and retrospective nature. From direct 
comparisons between medical record content 


39 » https://cmit.cms.gov/CMIT_public/ViewMeasure?MeasureId=3516 (Accessed 6/4/2020).

40 » https://ecqi.healthit.gov/cql-clinical-quality-language (Accessed 6/4/2020).

41 » https://www.hl7.org/fhir/fhirpath.html (Accessed 6/4/2020).
and diagnoses coded from that record, we
know that the accuracy of coding varies by
kind of diagnosis, setting, and hospital size,
and that coding granularity also varies. Claims data can
also provide structured and coded records of 
ambulatory prescriptions, but generally pro- 
vide no test results or clinical measurement 
values. So, considering only claims-based 
diagnostic codes can lead to inappropriate 
policymaking and conclusions (Tang et al. 
2007). 

EHRs complement claim- and adminis- 
trative-based data and can provide informa- 
tion about the relationships among diagnoses, 
severity of illness indication, and resource 
consumption. Thus, these systems are impor- 
tant tools for administrators who wish to 
make informed decisions in the increasingly 
value-based world of health care. On the 
other hand, the use of EHR data for billing 
and administrative purposes can incentivize
clinicians to bias their documentation
for maximal payment, possibly reducing
the clinical accuracy of the diagnoses. It may
therefore be best to base financial decisions 
on variables that are not open to interpreta- 
tion. 

Despite Reiser’s (1991) clinically oriented 
goals for EHRs, much of what these systems 
are currently is driven by complex and pre- 
scriptive medical-legal, reimbursement, and 
regulatory requirements (Cusack et al. 2013). 
These requirements may lead to redundant 
data capture, cumbersome documentation 
processes, and information that is biased 
towards optimized billing. One potential solution
would be policy changes that loosen the tie
between payment and documentation elements,
such as the proposed CMS rule
“Patients Over Paperwork”,42
which features decreased documentation burden
for office visits.


42 » CMS.gov. Patients Over Paperwork. » https://www.cms.gov/About-CMS/story-page/patients-over-paperwork.html (Accessed 6/4/2020).
14.5 Challenges Ahead


Although many commercial products are 
labeled as EHRs, some do not satisfy all the 
criteria that we defined at the beginning of 
this chapter. Even beyond matters of defini- 
tion, however, it is important to recognize that 
the concept of an EHR is neither unified nor 
static. As the capability of technology evolves, 
the function of the EHR will expand. Certified 
health IT specifications appear also to have 
pushed forward certain types of core EHR 
functionality. 

Greater involvement of patients in their 
own care, for example, means that personal 
health records (PHRs) will increasingly incor- 
porate data captured at home and also sup- 
port two-way communication between 
patients and their health care team (see also 
> Chap. 13). The potential for patient-entered 
data includes history, symptoms, and out- 
comes entered by patients as well as data 
uploaded automatically by home monitoring 
devices such as scales, blood pressure moni- 
tors, glucose meters, pulmonary function 
devices, smart phones and Fitbits. By inte- 
grating these patient-generated data into the 
EHR, either by uploading the data into the 
EHR or by linking the EHR and the PHR, a 
number of long-term objectives can be 
achieved: patient-generated data may in some 
circumstances be more accurate or complete, 
the time spent entering data during an office 
visit by both the provider and the patient may 
be reduced, and the information may allow 
the production of outcomes measures that are 
better attuned to patients’ goals. Patient- 
delivered data will be welcome when this 
information has been requested by the prac- 
tice (e.g. initial visit history check list), and a 
mutual understanding exists about the types 
and volumes of data that can be accepted and 
delays between receipt and review.43

The future of EHRs depends on both 
technical and nontechnical considerations. 


43 We have included examples from various systems in 
this chapter, both developed by users and commer- 
cially available, to illustrate a portion of the func- 
tionality of EHRs currently in use. 


Computing technology will continue to 
advance, with processing power doubling 
every 1.5 years according to Moore’s law (see 
> Chap. 1). Software will improve with more 
powerful applications, better user interfaces, 
and more integrated CDS, including CDS 
using third-party solutions and CDS integration
specifications (e.g., FHIR®, CDS
Hooks44). New kinds of software that support
collaboration will continue to improve; social 
media are growing rapidly both inside and 
outside of health care. For example, as both 
providers and patients engage increasingly in 
social media, new ways to capture data, share 
data, collaborate, and share expertise may 
emerge. Perhaps the greater need for leader- 
ship and action will be in the social and orga- 
nizational foundations that must be laid if 
EHRs are to serve as the information infra- 
structure for health care. We touch briefly on 
some of these challenges in this final section. 


14.5.1 Usability 


An intuitive and efficient user interface is an 
important desired characteristic of an 
EHR. Designers must understand the cognitive
aspects of human-computer interaction
(HCI) and each of the various workflows
if they are to build user interfaces that are
easy to learn and easy to use (see > Chap. 5).
Improving these systems using best practices 
and principles of HCI will require changes 
not only in how the system behaves but also in 
how humans interact with the system. 

User interface requirements of a nurse 
entering patient data in the inpatient setting 
are different from the requirements of a clerk 
entering patient charges. Usability for clini- 
cians means fast computer response times, 
and the fewest possible data input fields. A 
system that is slow or requires too much input is 
not usable by clinicians, particularly in the time- 
constrained setting of clinical care. The menus 
and vocabularies that constrain input must 


44 CDS hooks HL7 group. » http://wiki.hl7.org/index.php?title=201809_CDS_Hooks (Accessed 6/4/2020).
include synonyms for all the ways health professionals
name the items, and the system
must have keyboard options for all inputs and 
actions because switching from mouse to key- 
board steals user time. 

What information the provider needs and 
what tasks the provider performs should influ- 
ence what information the EHR presents and 
how the system presents it. Development of 
technology that matches the data-processing 
power of computers with the cognitive capa- 
bility of human beings to formulate insightful 
questions and to interpret data is still a rate- 
limiting step (Tang and Patel 1994). For 
example, one can imagine an interface in 
which speech input, typed narrative, and 
mouse-based structured data entry are 
accepted and seamlessly stored into a single 
data structure within the EHR, with a hybrid 
user display that shows both a narrative ver- 
sion of the information and a structured ver- 
sion of the same information that highlights 
missing fields or inconsistent values. Along 
these lines, Fig. 14.17 shows a historical
example of order generation by the Gopher 3
system in operation at Eskenazi Hospital
(a.k.a. Wishard Hospital) clinics. Physicians
would write their problem in narrative text.
Using NLP methods in parallel, the computer
would generate a list of coded orders that it
inferred from these notes. Users could then
confirm order(s) with simple click(s) and add
any further details required to complete each
order (Duke et al. 2014). This same functionality
is now provided through the Regenstrief
Clinical Learning system, a realistic medical
record system with rich sample patient data
available for teaching medical students about
EHR functionality.45
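The idea of suggesting orders from narrative text can be illustrated with a much simpler rule-based stand-in. This sketch is not how the Gopher system worked and is not real NLP: it only looks up whole-word keywords in a hypothetical table (`ORDER_KEYWORDS`), and the order names are illustrative.

```python
import re

# Hypothetical keyword-to-order table, for illustration only.
ORDER_KEYWORDS = {
    "lisinopril": "Lisinopril",
    "ekg": "Electrocardiogram",
    "colonoscopy": "Colonoscopy",
    "psa": "Prostate Specific Ag (PSA)",
}


def suggest_orders(note_text):
    """Return candidate orders whose keyword appears as a whole word in the note."""
    words = set(re.findall(r"[a-z0-9]+", note_text.lower()))
    return [order for keyword, order in ORDER_KEYWORDS.items() if keyword in words]


note = "2) HTN increase lisinopril, get EKG, consider ECHO"
suggestions = suggest_orders(note)
print(suggestions)
```

The clinician would still confirm each suggestion; a keyword table of this kind cannot handle negation, misspellings, or context, which is why production systems rely on more capable NLP methods.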


14.5.2 Standards 


We alluded to the importance of standards 
earlier in this chapter, when we discussed the 
architectural requirements of integrating data 
from multiple sources. Standards are the focus 


45 » https://www.regenstrief.org/resources/clinical-learning/ (Accessed 6/4/2020).
of > Chap. 8. Here, we stress the importance
of national standards to the development, 
implementation, and use of EHRs (Miller 
and Gardner 1997). Standards are especially 
important for integrating clinical data from 
different organizations. Health information 
exchanges (HIEs) continue to expand in size
and numbers, but the healthcare systems that
feed them will have to adopt meaningful use
coding and message/API structure standards
more fully than before. HIEs will be able to
efficiently import and integrate structured
data about one patient from many organizations.
Messaging and API standards are
increasingly well developed and in widespread
use for laboratory data46 (HL7's LRI),
prescriptions sent to pharmacies47 (NCPDP's
SCRIPT standard), and many kinds of diagnostic
images (DICOM). FHIR® is now supported
by major federal agencies (ONC,
CMS, NIH, CDC, AHRQ and increasingly
by the FDA) as well as by the high-tech industry
(Apple, Amazon, Google and Microsoft)
and health care software developers. It is now
mainstream in many healthcare applications
and communications. The incomplete
adoption of standard coding systems for 
observation identifiers, however, remains a 
major obstacle to the integration of patient 
data from independent care providers and 
large care delivery systems alike. 
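The obstacle posed by non-standard observation identifiers can be made concrete with a small sketch. It assumes each source organization uses its own local codes and that a mapping table to shared identifiers exists; the organization names, local codes, and `STD-` identifiers below are all made up for illustration and do not correspond to any real coding system.

```python
# Hypothetical local-to-standard code tables; every identifier here is made up.
LOCAL_TO_STANDARD = {
    "clinic_a": {"GLU": "STD-GLUCOSE", "K": "STD-POTASSIUM"},
    "hospital_b": {"1001": "STD-GLUCOSE"},
}


def integrate(observations):
    """Group values by standard code; collect observations that cannot be mapped."""
    merged, unmapped = {}, []
    for obs in observations:
        standard = LOCAL_TO_STANDARD.get(obs["source"], {}).get(obs["code"])
        if standard is None:
            unmapped.append(obs)  # no shared identifier, so it cannot be integrated
        else:
            merged.setdefault(standard, []).append(obs["value"])
    return merged, unmapped


observations = [
    {"source": "clinic_a", "code": "GLU", "value": 98},
    {"source": "hospital_b", "code": "1001", "value": 105},
    {"source": "hospital_b", "code": "2002", "value": 4.1},
]
merged, unmapped = integrate(observations)
print(merged)
print(len(unmapped))
```

Only the two glucose results can be combined into one series; the third observation is stranded because its local code has no agreed standard equivalent.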

The HIPAA legislation49 includes mandated
standards for administrative messages
(X12), privacy, security, and clinical data.
Federal agencies have already promulgated
regulations based on this legislation for the
first three of these categories.50 A series of
legislative measures, notably with the 2009 


46 » https://www.lri.fr/presentation_en.php (Accessed 
6/4/2020). 

47 » https://www.ncpdp.org/NCPDP/media/pdf/NCPDPEprescribingBasics.pdf (Accessed 6/4/2020).

48 » https://searchhealthit.techtarget.com/definition/DICOM-Digital-Imaging-and-Communications-in-Medicine (Accessed 6/4/2020).

49 » https://www.edibasics.com/edi-resources/document-standards/hipaa (Accessed 6/4/2020).

50 HIPAA for professionals. » https://www.hhs.gov/hipaa/for-professionals/index.html (Accessed 6/4/2020).
[Figure 14.17: screenshot of a note entry screen. One panel shows a zoomed-in view of the patient's first four problem notes as written by the provider: "1) DM - due for A1c, optho consult, lytes."; "2) HTN increase lisinopril, get EKG, consider ECHO"; "3) Hyperlipidemia - overdue for lipids, get today"; "4) Screening- need colonoscopy, PSA". A second panel, "Possible Order Matches", shows ordered items extracted from the problem notes automatically via NLP, including Lisinopril, Prostate Specific Ag (PSA), and Consults.]

Fig. 14.17 An example of NLP and rule-based
conversion of provider notes to order items from Blain
Takasu. Notes written in free text in the A/P section are
inferred by NLP analysis to suggest possible matches in
the order items box


51 » https://www.healthit.gov/sites/default/files/hitech_act_excerpt_from_arra_with_index.pdf (Accessed 6/4/2020).

52 » CMS.gov: MACRA. » https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/Value-Based-Programs/MACRA-MIPS-and-APMs/MACRA-MIPS-and-APMs.html (Accessed 6/4/2020).


HITECH Act51 (see > Chaps. 8 and 31) and
subsequently with the Medicare Access and
CHIP Reauthorization Act of 2015
(MACRA),52 have stimulated significant
efforts to increase the adoption and functionality
of EHRs, as well as leverage these systems
for quality reporting. More recently, the
21st Century Cures Act has significant
provisions around delivery of Health IT
usability (Sect. 4001), Conditions of
Certification (Sect. 4002(a)), Trusted
Exchange Framework and Common Agreement
(Sect. 4003(b)), and guidelines around
reasonable and necessary activities that do
not constitute information blocking (Sect.


4004),53 and these provisions are present in
proposed ONC and CMS rules.


14.5.3 Costs and Benefits 


The National Academy of Medicine (for- 
merly Institute of Medicine) declared the 
EHR an essential infrastructure for the deliv- 
ery of health care, and the protection of 
patient safety (IOM Committee on Improving 
the Patient Record 2001). As with any infrastructure
project, the benefits specifically attributable
to infrastructure are not immediate and are
sometimes difficult to establish; an infrastructure
plays an enabling role in all projects that
take advantage of it. Early randomized con- 
trolled clinical studies showed that computer- 
based decision-support systems reduce costs 
and improve quality compared with usual 
care supported with a paper medical record 


53 H.R.34 - 114th Congress. » https://www.congress.gov/bill/114th-congress/house-bill/34 (Accessed 6/4/2020).
(Tierney et al. 1993; Bates et al. 1997, 2003;
Classen et al. 1997), and meta-analyses of 
Health IT have demonstrated quality benefits 
(Buntin et al. 2011; Lau et al. 2010; Clyne 
et al. 2012). However, others have not found 
consistent associations between EHRs and 
CDS and better quality (Romano and Stafford 
2011; Delvaux et al. 2017; Parshuram et al. 
2018; Muth et al. 2018). 

Because of the significant resources needed 
and the significant broad-based potential ben- 
efits of these systems, the decision to imple- 
ment an EHR is a strategic one for most 
healthcare organizations. Hence, the evalua- 
tion of the costs and benefits must consider 
the effects on the organization’s strategic 
goals, as well as the objectives for individual 
health care (Samantaray et al. 2011). Today, 
there are a number of open-source options
for EHR software with a range of capabilities 
(Syzdykkova et al. 2017). 

The cost of installing an EHR in a large 
health system can exceed $100 million and 
even $1 billion for the largest implementations.54
The cost of the system itself in
license fees and related items is usually only a 
portion of that number. Other costs include 
configuration, training, and lost revenue as 
care providers learn to use the system. The 
benefits of such an investment are often 
related to the integration of a health system’s 
diverse components into a single, coordinated 
enterprise. 


14.5.4 Leadership 


Leaders from all segments of the health care 
industry must work together to articulate 
the needs, to continue to define and expand 
upon the standards, to fund the develop- 
ment, to implement the social change, and 
to write the laws to accelerate the develop- 
ment and routine use of EHRs in health 
care. Because of the prominent role of the 


54 EHR Intelligence. Top 5 Most Expensive Implementations of 2017. » https://ehrintelligence.com/news/top-5-most-expensive-ehr-implementations-of-2017 (Accessed 6/5/2020).
federal government in health care (as a
payer, provider, policymaker, and regulator),
federal leadership to create incentives
for developing and adopting standards and 
for promoting the implementation and use 
of EHRs remains crucial. Technological 
change will continue to occur at a rapid 
pace, driven by consumer demand for enter- 
tainment, retail, games, and business tools. 
Nurturing the use of IT in health care
requires leaders, including informatician
leaders, who promote the use of EHRs and
work to overcome the obstacles that impede 
widespread use of computers for the benefit 
of health care. 
This text focuses on the design, evaluation,
and application of Clinical Decision Support 
systems, and examines the impact of com- 
puter-based diagnostic tools both from the 
practitioner’s and the patient’s perspectives. It 
is designed for informatics specialists, teachers 
or students in health informatics, and clini- 
cians. 

Collen, M. F. (1995). A history of medical informatics
in the United States, 1950-1990.
Indianapolis: American Medical Informatics
Association, Hartman Publishing. This rich
history of medical informatics from the late
1960s to the late 1980s includes an extremely
detailed set of references.

Collen, M. F., & Ball, M. J. (Eds.). (2015). The history
of medical informatics in the United States
(2nd ed.). London: Springer-Verlag, Springer 
Nature. This rich history of medical informat- 
ics from the 1990s to mid 2010s (25 years) pro- 
vides an updated medical informatics 
perspective. It includes an extremely detailed 
set of references. 


Hartley, C. P., & Jones, E. D. (2011). EHR implementation:
A step-by-step guide for the medical
practice (2nd ed.). Chicago: American Medical 
Association. This book provides rich details 
for implementing an EHR. It is a great 
resource for anyone trying to learn about 
EHR deployments, covering topics related to 
preparation, support, and implementation. 


Institute of Medicine (IOM) Roundtable on Value
and Science-Driven Health Care. (2011).
Digital infrastructure for the learning health 
system: The foundation for continuous improve- 
ment in health and health care — workshop 
series summary. Washington, DC: National 
Academy Press. This report summarizes three 
workshops that presented new approaches to 
the construction of advanced medical record
systems that would gather the crucial data
needed to improve the health care system. 


Kuperman, G. J., Gardner, R. M., & Pryor, T. A.
(1991). The HELP system. Berlin/Heidelberg:
Springer-Verlag GmbH and Co. K. The HELP 
(Health Evaluation through Logical 
Processing) system was a computerized hospi- 
tal information system developed by the 
authors at the LDS Hospital at the University 
of Utah, USA. It provided clinical, hospital 
administration and financial services through 
the use of a modular, integrated design. This 
book thoroughly documents the HELP sys- 
tem. Chapters discuss the use of the HELP 
system in intensive care units, the use of 
APACHE and APACHE II on the HELP sys- 
tem, various clinical applications and inactive 
or experimental HELP system modules. 
Although the HELP system has now been 
retired from routine use, it remains an impor- 
tant example of several key issues in EHR 
implementation and use that continue in the 
commercial systems of today.


Osheroff, J., Teich, J., Levick, D., et al. (2012).
Improving outcomes with clinical decision support:
An implementer's guide (2nd ed.).
Scottsdale: Scottsdale Institute, AMIA,
AMDIS and SHM. This text provides guid- 
ance on using clinical decision support inter- 
ventions to improve care delivery and 
outcomes in a hospital, health system or phy- 
sician practice. The book also presents consid- 
erations for health IT software suppliers to 
effectively support their CDS implementer cli- 
ents. 


Sittig, D. F., & Ash, J. S. (2011). Clinical information
systems: Overcoming adverse consequences
(Jones and Bartlett series in
biomedical informatics) (1st ed.). Burlington: 
Jones and Bartlett Learning. This book 
explores the challenges and obstacles with 
implementation of clinical information sys- 
tems including the nine categories of unin- 
tended adverse consequences with 
implementation and optimization of these 
systems as well as best practices. 


Weed, L. L. (1969). Medical records, medical evaluation
and patient care: The problem-oriented
record as a basic tool. Chicago: Year Book 
Medical Publishers. In this classic book, Weed 
presents his plan for collecting and structuring 
patient data to produce a problem-oriented 
medical record. 


Questions for Discussion


1. What is the definition of an EHR? 
What, then, is an EHR? What are five 
advantages of an EHR over a 
paper-based record? Name three 
limitations of an EHR. 

2. What are the five functional compo- 
nents of an EHR? Think of the infor- 
mation systems used in health care 
institutions in which you work or that 
you have seen. Which of the compo- 
nents that you named do those systems 
have? Which are missing? How do the 
missing elements limit the value to the 
clinicians or patients? 

3. Discuss three ways in which a computer 
system can facilitate information trans- 
fer between hospitals and ambulatory 
care facilities, thus enhancing continu- 
ity of care for previously hospitalized 
patients who have been discharged and 
are now being followed up by their pri- 
mary physicians. 

4. Much of medical care today is practiced
in teams, and coordinating the
care delivered by teams is a major chal- 
lenge. Thinking in terms of the EHR 
functional components, describe four 
ways that EHRs can facilitate care 
coordination. Describe two ways in 
which EHRs are likely to create addi- 
tional challenges in care coordination. 

5. How does the health care financing environment
affect the use, costs, and bene-
fits of an EHR? How has the financing 
environment affected the functionality 
of information systems? How has it 
affected the user population? 

6. Would a computer scan of a paper-based
record be an EHR? What are two
advantages and two limitations of this 
approach? 

7. Among the key issues for designing an
EHR are what information should be 
captured and how can it be entered into 
the system. Physicians may enter data 
directly or may record data on a paper 
worksheet (encounter form) for later 
transcription by a data-entry worker. 
What are two advantages and two dis- 
advantages of each method? Discuss 
the relative advantages and disadvan- 
tages of entry of free text instead of 
entry of fully coded information. 
Describe an intermediate or compro- 
mise method. 

8. EHR data may be used in clinical
research, quality improvement, and 
monitoring the health of populations. 
Describe three ways that the design of 
the EHR may affect how the data may 
be used for other purposes. 

9. Identify four locations where clinicians
need access to the information con- 
tained in an EHR. What are the major 
costs or risks of providing access from 
each of these locations? 

10. What are three important reasons to
have physicians enter orders directly 
into an EHR? What are three chal- 
lenges in implementing such a system? 

11. Consider the task of creating a summary
report for clinical data collected
over time and stored in an EHR. Clinical 
laboratories traditionally provide summary
test results in flowsheet format,
thus highlighting clinically important 
changes over time. A medical record 
system that contains information for 
patients who have chronic diseases 
must present serial clinical observa- 
tions, history information, and medica- 
tions, as well as laboratory test results. 
Suggest a suitable format for presenting 
the information collected during a 
series of ambulatory-care patient visits. 

12. The public demands that the 
confidentiality of patient data must 
be maintained in any patient record 
system. Describe three protections 
and auditing methods that can be 
applied to paper-based systems. 
Describe three technical and three 
nontechnical measures you would 
like to see applied to ensure the 
confidentiality of patient data in 
an EHR. How do the risks of 
privacy breaches differ for the two 
systems? 

Learning Objectives

After reading this chapter you should know
the answers to these questions:

- What is the vision and purpose of Health
  Information Infrastructure (HII)?
- What kinds of impacts will HII have,
  and over what time periods?
- How do HII requirements lead to
  effective architectural specifications?
- What are the political and technical
  barriers to HII implementation?
- How can HII progress be effectively
  evaluated?


15.1 Introduction 


This chapter addresses health information 
infrastructure (HII), community-level informatics
systems designed to make comprehensive
electronic patient records available when
and where needed for the entire population. 
There are numerous difficult and highly inter- 
dependent challenges that HII systems must 
overcome, including privacy, stakeholder 
cooperation, assuring all-digital information, 
and providing financial sustainability. As a 
result, while HII has been pursued for years 
with myriad approaches in many countries, 
progress has been slow and no proven formula 
for success has yet been identified. 

While the discussion here is focused on the 
development of the HII in the United States, 
many other countries are involved in similar 
activities and in fact have progressed further 
along this road. Canada, Australia, and a 
number of European nations have devoted 
considerable time and resources to their own 
national HIIs. A few countries, such as 
Finland, Estonia, and Brazil have actually 
succeeded in developing effective HII systems 
that have been working nationwide for a num- 
ber of years. It should be noted, however, that 
all of these nations have centralized, 
government-controlled healthcare systems. 
This organizational difference from the multi- 
faceted, mainly private healthcare system in 
the U.S. results in a somewhat different set of 
issues and problems. One can hope that the 
lessons learned from HII development activi- 


ties across the globe can be effectively shared 
to ease the difficulties of everyone who is 
working toward these important goals. 

HII at first seems like a vague term — what 
does it really mean? This is not a trivial ques- 
tion — as with all information systems, if we 
don’t understand clearly what we are trying to 
accomplish, as well as how we will measure 
whether we’ve achieved our goals, success will 
be elusive. The overall goal may be stated as 
“comprehensive electronic patient informa- 
tion when and where needed.” This includes 
both immediate access to comprehensive 
records for individuals (for care) and the abil- 
ity to search and aggregate information across 
the population (for public health, medical 
research, quality improvement, and policy). 

We know that patients in hospitals always 
have a unified chart (be it paper or electronic) 
that contains all their hospital records from 
all sources. However, there is no equivalent 
“outpatient chart” with comprehensive 
records from all providers in a single place. 
The lack of this information is a serious prob- 
lem: a survey of doctor visits in 2015 found 
that 55% of patients reported that their medi- 
cal history was missing or incomplete, while 
49% indicated that their physician was not 
aware of which prescription medications they 
were taking.1 Naturally, the result of this lack
of information is undertreatment, overtreat- 
ment, and medical errors. 


15.2 Vision & Benefits of HII


The vision of HII is comprehensive electronic 
patient information when and where needed, 
allowing providers to have complete and cur- 
rent information upon which to base clinical 
decisions. In addition, clinical decision support
(see > Chap. 22) would be integrated


1 Surescripts Survey Finds Patients Prefer Digitally
Savvy Doctors and Demand a Connected
Healthcare Experience. Released 28 Sept 2015.
Retrieval 28 Aug 2018: > https://surescripts.com/news-center/press-releases/!content/surescripts-survey-finds-patients-prefer-digitally-savvy-doctors-and-demand-a-connected-healthcare-experience

with information delivery. In this way, both 
clinicians and patients could receive remind- 
ers of the most recent clinical guidelines and 
research results. This would avoid the need for 
clinicians to have superhuman memory capa- 
bilities to assure the effective practice of med- 
icine, and enable patients more easily to 
adhere to complex treatment protocols and to 
be better informed. Patients could also review 
and add information to their record and 
thereby become more active participants in 
their care. In addition, the availability of com- 
prehensive records for each patient would 
enable value-added services, such as immedi- 
ate electronic notifications to patients’ family 
members about emergency care, as well as 
authorized queries in support of medical 
research, public health, and public policy 
decisions. 


15.2.1 Value Versus Completeness of Information 


In considering HII, it is extremely important 
to appreciate that medical information for a 
given patient must, in general, be relatively 
complete before it is truly valuable for clinical 
use (see Fig. 15.1). For example, if a physician 
had access to an electronic information 
system that could retrieve half of each 
patient’s list of medications, it is unlikely such 
a system would be actively used. Knowing 
that the information was incomplete, the phy- 
sician would still need to rely on other tradi- 
tional sources of information to fill in the 
missing data (including questioning the 
patient). So there would be little added benefit 
for investing the time to obtain the partial 
information from the new system. Similarly, 
applying clinical decision support to incom- 
plete patient data may produce erroneous, 
misleading, or even potentially dangerous 
results. Thus, HII systems must reliably 
provide reasonably complete information to 
be valuable to clinicians for patient care, and 
therefore to make their use worthwhile. 
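
The threshold argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample medication lists are hypothetical, and the 0.85 cutoff is borrowed from the "over 85% complete" caption of Fig. 15.1. In practice the true list is not known in advance, so completeness would have to be estimated rather than computed directly.

```python
def completeness(retrieved, actual):
    """Fraction of the patient's actual items that the system retrieved."""
    actual = set(actual)
    if not actual:
        return 1.0  # nothing to retrieve, so nothing is missing
    return len(actual & set(retrieved)) / len(actual)


def usable_for_care(retrieved, actual, threshold=0.85):
    """True when the record exceeds the (assumed) completeness cutoff."""
    return completeness(retrieved, actual) > threshold


# Hypothetical example: the system returns half of the medication list.
actual_meds = ["metformin", "lisinopril", "atorvastatin", "warfarin"]
half_list = ["metformin", "lisinopril"]

print(completeness(half_list, actual_meds))
print(usable_for_care(half_list, actual_meds))
```

With half of the list missing, the record falls well below the cutoff, which mirrors the point that a physician would still have to consult other sources.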

Besides their limited value, incomplete 
records are potentially dangerous. The con- 
clusions that providers draw from a partial 
picture of a patient’s history may often prove 
to be incorrect. For example, missing contra- 
indications could result in the prescription of 
a medication with serious adverse effects. 
Incomplete records also are a source of unnec- 
essary costs. When test and/or procedure 
results are not available, but are needed for 
care, they are likely to be repeated. 

Because of the above factors, the cost of 
obtaining incomplete records is not typically 
accompanied by substantial benefits. As a 
result, organizations compiling such records 
find themselves under great financial stress. 
Many such “health information exchange” 
(HIE) organizations have failed in the past 
few years, e.g., Washington (DC), Kansas, 
Tennessee, CalRHIO, and CareSpark 
(Kingsport, TN). 

[Fig. 15.1 Value vs. completeness of health information. Medical information of any given type for a patient typically needs to be over 85% complete] 

Importantly, today no patients in the U.S. 
can be assured that wherever they seek care, 
their comprehensive records from all sources 
will be available to their provider. 

The U.S. Congress has recognized the 
importance of comprehensive records and has 
mandated that the problem be solved in the 
21st Century Cures Act, enacted in late 2016: 
• “The Secretary shall use existing authorities 
to encourage partnerships ... with the goal 
of offering patients access to their electronic 
health information in a single, longitudinal 
format that is easy to understand, secure, 
and may be updated automatically.” [... and ...] 

• “... promote policies that ensure that a 
patient’s electronic health information is 
accessible to that patient and the patient’s 
designees, in a manner that facilitates 
communication with the patient’s health 
care providers and other individuals, 
including researchers, consistent with such 
patient’s consent.”² 


The Final Rule implementing the 21st Century 
Cures Act³ reinforces these goals by requiring 
application programming interfaces (APIs) 
that allow all data elements of a patient’s 
electronic health record to be accessed, exchanged, 
and used without special effort, and by requiring a more 
granular approach to consent management. It 
also requires that patients be allowed to access 
all of their electronic health record information 
at no cost and prohibits “information 
blocking” in response to legitimate requests 
for patient records. 


2 21st Century Cures Act, H.R.34, 114th Congress 
(2015-2016), Section 4006. Retrieval 29 Oct 2018: 
https://www.congress.gov/bill/114th-congress/house-bill/34 

3 21st Century Cures Act: Interoperability, 
Information Blocking and the ONC Health IT 
Certification Program. Released May 1, 2020. 
Retrieval 23 May 2020: https://www.federalregister.gov/documents/2020/05/01/2020-07419/21st-century-cures-act-interoperability-information-blocking-and-the-onc-health-it-certification 


15.2.2 Value in Patient Care 


The potential benefits of HII are both numer- 
ous and substantial. Perhaps most important 
are error reduction and improved quality of 
care. Many studies have shown that the com- 
plexity of present-day medical care results in 
very frequent errors of both omission and 
commission (IOM 1999). The source of this 
problem was clearly articulated by Masys, 
who observed that current medical practice 
depends upon the “clinical decision-making 
capacity and reliability of autonomous prac- 
titioners for classes of problems that rou- 
tinely exceed the bounds of unaided human 
cognition” (Masys 2002). Electronic health 
information systems can contribute signifi- 
cantly to alleviating this problem by remind- 
ing practitioners about recommended actions 
at the point of care. This can include both 
notifications of actions that may have been 
missed and warnings about planned treat- 
ments or procedures that may be harmful or 
unnecessary. Literally dozens of research 
studies have shown that such reminders 
improve safety and reduce costs (Bates 2000). 
In one such study, medication errors were 
reduced by 55% (Bates et al. 1998). Another 
study by the RAND Corporation showed that 
only 55% of U.S. adults were receiving rec- 
ommended care (McGlynn et al. 2003). The 
same techniques used to reduce medical 
errors with electronic health information sys- 
tems also contribute substantially to ensuring 
that recommended care is provided. This is 
becoming increasingly important as the pop- 
ulation ages and the prevalence of chronic 
disease increases. 

Guidelines and reminders also can improve 
the effectiveness of dissemination of new 
research results. Widespread dissemination of 
new research to the clinical setting is very 
slow; one study showed an average of 17 years 
(Balas and Boren 2000). Patient-specific 
reminders delivered at the point of care, highlighting 
important new research results, could 
substantially accelerate this adoption rate. 

Another important contribution of HII to 
the research domain is improving the effi- 
ciency of clinical trials. At present, most such 
trials require the creation of a unique infor- 
mation infrastructure to ensure protocol com- 
pliance and to collect essential research data. 
With an effective HII, every practitioner 
would have access to a fully functional and 
comprehensive electronic health record (EHR) 
for each patient, so clinical trials could rou- 
tinely be implemented through the dissemina- 
tion of guidelines that specify the research 
protocol. Data collection could occur auto- 
matically in the course of administering the 
protocol, reducing time and costs. In addi- 
tion, there would be substantial value in ana- 
lyzing de-identified aggregate data from 
routine patient care to assess the outcomes of 
various treatments and monitor the health of 
the population. 

Another critical function for HII is early 
detection of patterns of disease, particularly 
early detection of outbreaks from newly- 
virulent microorganisms or possible bioter- 
rorism. Our current system of disease 
surveillance, which primarily depends on alert 
clinicians diagnosing and reporting unusual 
conditions, is both slow and potentially unre- 
liable. These problems are illustrated by 
delayed detection of the anthrax attacks in 
the Fall of 2001, when seven cases of cutane- 
ous anthrax in the New York City area 2 
weeks before the so-called “index” case in 
Florida went unreported (Lipton and Johnson 
2001). Since all the patients were seen by dif- 
ferent clinicians, the pattern could not have 
been evident to any of them even if the correct 
diagnosis had immediately been made in every 
case. Wagner et al. described nine categories 
of requirements for surveillance systems for 
potential bioterrorism outbreaks—several 
categories must have immediate electronic 
reporting to ensure early detection (Wagner 
et al. 2003). 
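
The anthrax example turns on a simple point: a pattern spread across many clinicians is invisible to each of them but obvious once reports are pooled. The Python sketch below illustrates that idea only. The report format, the region label, and the threshold of three cases are assumptions made for illustration, not features of any actual surveillance system.

```python
from collections import defaultdict


def flag_clusters(reports, threshold=3):
    """Return {(region, diagnosis): case_count} for pairs at or above threshold.

    Each report is a hypothetical (clinician_id, region, diagnosis) tuple.
    """
    counts = defaultdict(int)
    for _clinician, region, diagnosis in reports:
        counts[(region, diagnosis)] += 1
    return {key: n for key, n in counts.items() if n >= threshold}


# Seven cases, each seen by a different clinician, plus one unrelated report.
reports = [(f"clinician-{i}", "NYC", "cutaneous anthrax") for i in range(7)]
reports.append(("clinician-99", "NYC", "influenza"))

print(flag_clusters(reports))
```

No individual clinician in this example sees more than one case, yet the pooled count crosses the threshold immediately.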

HII would allow immediate electronic 
reporting of both relevant clinical events and 
laboratory results to public health (see 
Chap. 18). Not only would this be an 
invaluable aid in early detection of bioter- 
rorism, it would also serve to improve the 
detection of the much more common natu- 
rally occurring disease outbreaks. In fact, 
early results from a number of electronic 
reporting demonstration projects show that 
disease outbreaks can routinely be detected 
sooner than was ever possible using the cur- 
rent system (Overhage et al. 2001). While 
early detection has been shown to be a key 
factor in reducing morbidity and mortality 
from bioterrorism (Kaufmann et al. 1997), it 
will also be extremely helpful in reducing the 
negative consequences from other disease 
outbreaks. 

Although the U.S. Congress mandated the 
creation of a national public health situational 
awareness network in the Pandemic and All-Hazards 
Preparedness Acts of both 2006 and 
2013, the Government Accountability Office has 
documented in two separate reports⁴,⁵ that such a 
system has yet to be deployed and has 
issued letters in both 2019 and 2020⁶ highlighting 
the failure of DHHS to implement 
the required capabilities. HII can in fact function 
as an effective public health situational 
awareness network by reporting and/or providing 
access to relevant disease events in near 
real time to public health authorities without 
the necessity of creating a separate, duplicative, 
and expensive system solely for public 
health (Sittig and Singh 2020). For example, 
imagine how valuable it would be in tracking 
the Covid-19 pandemic if timely information 
on patients reporting relevant symptoms to 
their physicians were available nationwide on 
a daily basis. 


4 Government Accountability Office Report 11-99 (2010) 
Public Health Information Technology: Additional 
Strategic Planning Needed to Guide HHS’s Efforts 
to Establish Situational Awareness Capabilities. 
Retrieval 23 May 2020: https://www.gao.gov/products/GAO-11-99 

5 Government Accountability Office Report 17-377 (2017) 
Public Health Information Technology: HHS Has 
Made Little Progress toward Implementing 
Enhanced Situational Awareness Network 
Capabilities. Retrieval 23 May 2020: https://www.gao.gov/products/GAO-17-377 

6 Government Accountability Office (2020) Priority Open 
Recommendations: Department of Health and 
Human Services (April 23, 2020). Retrieval 23 May 
2020: https://www.gao.gov/assets/710/706568.pdf 

Finally, HII can substantially reduce 
healthcare costs. The inefficiencies and dupli- 
cation in our present paper-based healthcare 
system are enormous. One study showed that 
the anticipated nationwide savings from 
implementing advanced computerized physi- 
cian order entry (CPOE) systems in the out- 
patient environment would be $44 billion per 
year (Johnston et al. 2003), while a related 
study (Walker et al. 2004) estimated $78 billion 
more in savings from health information 
exchange (HIE) (for a total of $122 billion 
per year). Substantial additional savings are 
possible in the inpatient setting; numerous 
hospitals have reported large net savings 
from implementation of EHRs. Another 
analysis concluded that the total efficiency 
and patient safety savings from HII would be 
in the range of $142-371 billion each year 
(Hillestad et al. 2005), and a survey of the 
recent literature found predominantly positive 
effects from HII (Menachemi et al. 
2018). It is important to note that much of 
the savings depends not just on the widespread 
implementation of EHRs, but also on the 
effective interchange of this information to 
ensure that the complete medical record for 
every patient is immediately available in 
every care setting. 

Inasmuch as the current cost trend of 
healthcare is unsustainable, particularly in the 
face of our aging population, this issue is both 
important and urgent. Without comprehen- 
sive electronic patient information, any 
healthcare reform is largely guesswork in our 
current “black box” healthcare environment 
where the results of interventions often take 
years to understand. We do not currently have 
mechanisms for timely monitoring of health- 
care outcomes to inform needed course cor- 
rections in any proposed reform. In essence, 
healthcare must be “informed” before it can 
be effectively “reformed.” 


15.3 History 


In the U.S., the first major report to address 
HII was issued in 1991 by the Institute of 
Medicine (now known, since 2015, as the 
National Academy of Medicine, a part of the 
National Academies of Sciences, Engineering, 
and Medicine or NASEM). This report, “The 
Computer-Based Patient Record” (IOM 
1991), was the first in a series of national 
expert panel reports recommending transfor- 
mation of the healthcare system from reliance 
on paper to electronic information management 
(see Chap. 14). In response to the 
IOM report, the Computer-based Patient 
Record Institute (CPRI), a private not-for- 
profit corporation, was formed for the pur- 
pose of facilitating the transition to 
computer-based records. A number of com- 
munity health information networks (CHINs) 
were established around the country in an 
effort to coalesce the multiple community 
stakeholders in common efforts towards elec- 
tronic information exchange. The Institute of 
Medicine updated its original report in 1997 
(IOM 1997), again emphasizing the urgency 
to apply information technology to the 
information-intensive field of health care. 

However, most of the CHINs were not 
successful. Perhaps the primary reason for 
this was that the standards and technology 
were not yet ready for cost-effective 
community-based electronic HIE. Another 
problem was the focus on availability of 
aggregated health information for secondary 
uses (e.g., policy development), rather than 
individual information for the direct provision 
of patient care. Also, there was neither a sense 
of extreme urgency nor were there substantial 
funds available to pursue these endeavors. 
However, at least one community 
(Indianapolis, Indiana) continued to move 
forward throughout this period and has now 
emerged as a national example of the applica- 
tion of information technology to health care 
both in individual healthcare settings and 
throughout the community (McDonald et al. 
2005). 

Widespread attention was focused on this 
issue with the IOM report “To Err is Human” 
(IOM 1999). This landmark study docu- 
mented the accumulated evidence of the high 
error rate in the medical care system, includ- 
ing an estimated 44,000-98,000 preventable 
deaths each year in hospitals alone. It has 
proven to be a milestone in terms of public 
awareness of the negative consequences of 
paper-based information management in 
healthcare. Along with the follow-up report, 
“Crossing the Quality Chasm” (IOM 2001), 
the systematic inability of the healthcare sys- 
tem to operate at a high degree of reliability 
has been thoroughly elucidated. A more 
recent analysis estimated a much larger num- 
ber of preventable deaths due to medical 
errors — over 400,000 (Makary and Daniel 
2016). These reports clearly place the blame 
on the system, not on the dedicated healthcare 
professionals who work in an environment 
without effective tools to promote quality and 
to minimize errors. 

Several additional national expert panel 
reports have emphasized the IOM findings. 
In 2001, the President’s Information 
Technology Advisory Committee (PITAC) 
issued a report entitled “Transforming Health 
Care Through Information Technology” 
(PITAC 2001). That same year, the Computer 
Science and Telecommunications Board of 
the National Research Council (NRC) 
released “Networking Health: Prescriptions 
for the Internet” (NRC 2001), which empha- 
sized the potential for using the Internet to 
improve electronic exchange of healthcare 
information. That same year, the National 
Committee on Vital and Health Statistics 
(NCVHS) outlined a vision for building a 
National HII in its report, “Information for 
Health” (NCVHS 2001). NCVHS, a statu- 
tory advisory body to the U.S. Department 
of Health and Human Services (DHHS), 
indicated that Federal government leadership 
was needed to facilitate further development 
of HII. In response, DHHS began an HII initiative, 
organizing a large national conference 
in 2003 to develop a consensus agenda to 
guide progress⁷ (Yasnoff et al. 2004). 

In April, 2004, a Presidential Executive 
Order created the Office of the National 
Coordinator for Health Information 
Technology (ONC) in DHHS (see also 
Chap. 29). The initial efforts of ONC 
focused on promoting standards and certifica- 
tion to support adoption of EHRs by physi- 
cians and hospitals. It also promoted 
implementation of an “institution centric” 
model for HIE by Regional Health Information 
Organizations (RHIOs), wherein electronic 
records for a given patient stored at sites of 
past care episodes are located, assembled, and 
delivered in real time when needed for patient 
care. Four demonstration projects implement- 
ing this model were funded, but did not lead 
to sustainable systems. 

In 2009, ONC was codified in law by the 
Health Information Technology for Economic 
and Clinical Health (HITECH) portion of the 
ARRA statute (Chap. 29). In addition, 
$20+ billion was appropriated including $2 
billion for ONC and the remainder for pay- 
ment of EHR incentives through Medicare 
and Medicaid to providers who achieved 
“Meaningful Use” of these systems. The ONC 
used its resources to establish regional exten- 
sion centers (RECs) to subsidize assistance to 
providers adopting and using EHRs ($677 
million), fund states to establish HIEs ($564 
million) and initiate several research pro- 
grams. 

In December, 2010, the President’s Council 
of Advisors on Science and Technology 
(PCAST) issued a report expressing concern 
about ONC strategy, specifically indicating 
that its HIE efforts through the states “will not 
solve the fundamental need for data to be uni- 
versally accessed, integrated, and understood 
while also being protected” (PCAST 2010). 


Findings of a 2011 survey of HIEs “call into 
question whether RHIOs in their current form 
can be self-sustaining and effective” (Adler-Milstein 
et al. 2011). 


7 Department of Health and Human Services. (2003) 
The National Health Information Infrastructure. 
Retrieval 29 Oct 2018: http://aspe.hhs.gov/sp/nhii/ 

Over the past few years, these concerns 
have been proven correct; HIE efforts have 
largely floundered. In the 2013 “reboot” 
report, a group of U.S. Senators criticized 
the lack of a clear path to interoperability and 
sustainability, inadequate privacy and security 
protections, and failure to achieve health 
care cost reductions.⁸ That same year, DHHS 
itself admitted that “[current policy] alone 
will not be enough to achieve the widespread 
interoperability and electronic exchange of 
information necessary for delivery reform 
where information will routinely follow the 
patient regardless of where they receive care” 
in an RFI requesting ideas for “accelerating 
HIE progress.”⁹ At least one experienced former 
DHHS official bemoaned the lack of an 
HII architecture to guide efforts.¹⁰ Even the 
mainstream media noticed the continuing 
lack of progress (Creswell 2014). 

A systematic review found that HIE use 
through 2015 in the U.S. was “growing but 
still limited” (Devine et al. 2017). The official 
evaluation of ONC’s state HIE program tried 
to put a positive spin on these activities with 
the statistically meaningless conclusion that 
“about half the states [were] performing at or 
above the national average,” a result that by 
definition would always be the case. The 
report also observed that “sustainability was a 
persistent concern among grantees.”¹¹ The 
latter challenge has since resulted in the shutdown 
of multiple projects. 


8 Reboot: Re-examining the strategies needed to 
successfully adopt health IT. Retrieval 29 Oct 2018: 
https://www.thune.senate.gov/public/_cache/files/0cf0490e-76af-4934-b534-83f5613c7370/C60F25439BE1CEC36DF9E3834942D908.ehr-white-paper-april-15.pdf 

9 CMS. Advancing Interoperability and Health 
Information Exchange (Request for Information). 
Retrieval 29 Oct 2018: https://www.federalregister.gov/documents/2013/03/07/2013-05266/advancing-interoperability-and-health-information-exchange 

10 Loonsk JW. Where’s the plan for interoperability? 
Healthcare IT News (22 Sept 2014). Retrieval 29 
Oct 2018: https://www.healthcareitnews.com/news/wheres-plan-interoperability 

Although those few surviving HIEs have 
nearly all abandoned their original institution- 
centric architectures in favor of patient-centric 
repositories (more on HII architecture in 
Sect. 15.5 below), the predicted reductions 
in health care costs from the widespread use 
of EHRs have not materialized, primarily due 
to the incomplete nature of patient records 
available through current HIE systems. A 
2017 study observed “To date, only a small 
number of HIE studies have demonstrated 
benefits to patients, providers, public health, 
or payers” (Yeager et al. 2017). 

Overall, it is clear that three decades after 
the 1991 IOM report urging universal adop- 
tion of EHRs, the U.S. still lacks a clear and 
feasible roadmap leading to the ultimate goal 
of widespread availability of comprehensive 
electronic patient information when and 
where needed. Despite much progress, no one 
in the U.S. as yet receives their medical care 
with the assured, immediate availability of all 
their records across multiple providers and 
provider organizations. 


15.4 Requirements for HII 


As with any informatics system development 
project, it is critical at the outset to understand 
the desired end result. In the case of a large, 
extremely complex system such as HII, this is 
especially important because there are many 
stakeholders with conflicting incentives and 
agendas, as well as challenging policy and 
operational issues. The ultimate goal is the 
“universal availability of comprehensive elec- 
tronic patient records when and where needed.” 
In transforming this goal into a design specification, 
it is critical to understand the issues 
and constraints that must be addressed. Then 
any proposed system design must first demonstrate 
on paper how the objectives will be 
achieved within those limitations. 


11 Dullabh PM, Parashuram S, Hovey L, Ubri P, 
Fischer K. (2016) Evaluation of the State HIE 
Cooperative Agreement Program: Final Report. 
Retrieval 29 Oct 2018: https://www.healthit.gov/sites/default/files/reports/finalsummativereportmarch_2016.pdf 


15.4.1 Privacy and Trust 


The most important and overriding require- 
ment of HII is privacy. Clearly, health records 
are very sensitive — perhaps the most sensitive 
personal information that exists. In addition 
to our natural desire to keep our medical 
information private, improper disclosure can 
lead to employment and other types of dis- 
crimination. Furthermore, failure to assure 
the privacy of records will naturally result in 
patients being unwilling to disclose important 
personal details to their providers — or even to 
avoid seeking care at all. In addition to the 
contents of the records, the very existence of 
certain records (e.g., a visit to a psychiatric 
hospital) is sensitive even if no details are 
available. Therefore, extraordinary care must be 
taken to ensure that information is protected 
from unauthorized disclosure and use. 

In general, U.S. Federal law (the HIPAA 
Privacy Rule as introduced in Chap. 12) 
requires patient consent for disclosure and use 
of medical records. However, consent is not 
required for record release for treatment, pay- 
ment, and healthcare operations. These “TPO” 
exceptions have, as a practical matter, allowed 
healthcare organizations to utilize medical 
records extensively while bypassing patient 
consent. The organization that holds medical 
information has sole discretion to make the 
decision whether a proposed disclosure is or is 
not a TPO exception. Until recently, TPO dis- 
closures did not even need to be recorded, 
effectively preventing discovery of improper 
disclosures. Even under the HITECH legisla- 
tion that requires records of TPO disclosures, 
such records are not automatically available to 
the subjects of the disclosures. The net effect is 
that individuals not only lack control over the 
dissemination of their medical records, but are 
not even informed when they are disclosed 
beyond where they were created. 
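
To make the record-keeping gap concrete, the following Python sketch shows what a disclosure log that patients can query might look like. It is a hypothetical illustration, not a description of any HIPAA or HITECH compliance mechanism: the class names, fields, and sample recipients are all assumptions, and treating consent as required for every non-TPO purpose is a deliberate simplification of the rule described above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# The three "TPO" purposes named in the text: treatment, payment, operations.
TPO_PURPOSES = {"treatment", "payment", "operations"}


@dataclass
class Disclosure:
    patient_id: str
    recipient: str
    purpose: str
    timestamp: datetime


@dataclass
class DisclosureLog:
    """Hypothetical log that records every disclosure, TPO or not."""
    entries: list = field(default_factory=list)

    def record(self, patient_id, recipient, purpose):
        entry = Disclosure(patient_id, recipient, purpose,
                           datetime.now(timezone.utc))
        self.entries.append(entry)
        return entry

    def for_patient(self, patient_id):
        """Everything disclosed about one patient, visible to that patient."""
        return [e for e in self.entries if e.patient_id == patient_id]


def requires_consent(purpose):
    """Simplified rule: consent is needed unless the purpose is TPO."""
    return purpose not in TPO_PURPOSES


log = DisclosureLog()
log.record("patient-1", "Clinic A", "treatment")
log.record("patient-1", "Marketing Partner", "marketing")
log.record("patient-2", "Insurer B", "payment")

for entry in log.for_patient("patient-1"):
    print(entry.recipient, entry.purpose, requires_consent(entry.purpose))
```

The design choice worth noting is that the log is keyed by patient, so the subject of a disclosure can retrieve the full list without relying on the disclosing organization to volunteer it.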

It seems appropriate to question whether 
this disclosure regime is adequate for electronic 
health records. The general public 
understands that making electronic patient 
records available for good and laudable pur- 
poses simultaneously makes them more avail- 
able for evil and nefarious purposes, thereby 
necessitating higher levels of protection to 
avoid abuses. Assigning decision-making for 
disclosure of personal medical records to any- 
one other than the patient or the patient’s rep- 
resentative inherently erodes trust. In essence, 
the patient is being told, “we are going to 
decide for you where your medical records 
should go because we know what’s in your 
interest better than you do.” A patient may 
wonder why, if a given disclosure is in their 
interest, their consent would not be sought. 
Furthermore, failing to seek such consent 
inevitably leads to suspicion that the disclo- 
sure is in fact not in the patient’s interest, but 
rather in the interest of the organization 
deciding that the records should be released. 

The concern about privacy of medical 
records is not at all theoretical or insignificant. 
In at least two consumer surveys, 
13-17% of consumers indicated that they 
already employ “information hiding” behaviors 
with respect to their medical records¹² 
(Shortliffe 2011). This includes activities such 
as obtaining laboratory tests under an 
assumed name or seeking out-of-state treat- 
ment to conceal an illness from their primary 
care provider. Even assuming that everyone 
engaged in such behaviors was willing to 
admit to them in such a survey, this represents 
a substantial proportion of consumers who 
would, at a minimum, refuse to participate in 
an electronic medical information system that 
did not provide them with control over their 
own records. Of even greater concern, such a 
large percentage of consumers would likely 
organize and use their political power to halt 
the deployment and operation of such a sys- 
tem. Indeed, it was a much smaller percentage 
of concerned citizens that, citing the threat to 
privacy, convinced Congress to repeal the provisions 
in the original HIPAA legislation calling 
for a unique medical identifier for all U.S. 
residents (see Chap. 12). 


12 California Health Care Foundation. (2005) 
National Consumer Health Privacy Survey. 
Retrieval 29 Oct 2018: http://www.chcf.org/publications/2005/11/national-consumer-health-privacy-survey-2005 

In view of this, there are those who argue 
that all decisions about release of patient 
records need to be entrusted to the patient 
(with rare exceptions, such as mental incom- 
petence). They also suggest that attention to 
these concerns may be especially important 
for enabling HII, because patients must trust 
that their records are not being misused in 
such a system. Some argue that patients are 
not sufficiently informed to make such deci- 
sions and may make mistakes that are harm- 
ful to them, whereas others believe that the 
negative consequences of delegating this 
decision-making to others than the patient 
could be much greater. Advocates of patient 
control of medical information argue, by 
analogy, that society has accepted that 
individuals retain the right to make decisions 
about how their own money is spent, even 
though this can lead to adverse consequences 
when those decisions prove to be unwise. In 
considering these issues, it should be noted 
that prior to the 2002 HIPAA Privacy Rule 
that established the TPO exceptions, both law 
and practice had always required patient con- 
sent for all access to medical records. While 
acknowledging the need for consumer educa- 
tion about decisions relating to release of 
medical records, patient-control advocates 
believe that the same freedom and personal 
responsibility that applies to an individual’s 
financial decisions should be applied to the 
medical records domain. These medical infor- 
mation privacy policy issues may be even 
more urgent in the context of the enhanced 
trust necessary when seeking to implement an 
effective and widely accepted HII. 


15.4.2 Stakeholder Cooperation 


To ensure the availability of comprehensive 
patient records, all healthcare stakeholders 
that generate such records must consistently 
make them available. While it would be ideal if 
such cooperation were voluntary and univer- 
sal, assuring long-term collaboration of com- 
peting healthcare stakeholders is problematic. 


Indeed, only a handful of communities have 
succeeded in developing and maintaining an 
organization that includes the active participa- 
tion of the majority of healthcare providers. 
Even in these communities, the system could 
be disrupted at any time by the arbitrary with- 
drawal of one or more participants. The unfor- 
tunate reality is that healthcare stakeholders 
are often quite reluctant to share patient 
records, fearing loss of competitive advantage. 
Therefore, some would argue that mandat- 
ing healthcare stakeholder participation in a 
system for sharing electronic patient records is 
highly desirable, since it would result in con- 
sistently more comprehensive individual 
records. Since imposing a new requirement on 
healthcare stakeholders would be a daunting 
political challenge, such an approach would 
be most easily accomplished as part of an 
existing mandate. Proponents of this approach 
have noted that one such mandate that could 
be utilized is the HIPAA Privacy Rule itself, 
which requires all providers to respond to 
patient requests for their own records (U.S. 45 
CFR 164.524(a)). Furthermore, if patients 
request their records in electronic form, and 
they are available in electronic form, this regu- 
lation also requires that they be delivered in 
electronic form. Although not well known, 
this latter provision is included in the original 
HIPAA Privacy Rule (U.S. 45 CFR 164.524(c) 
(2)), and has been reinforced by HITECH. It 
is also being promoted by the “blue button” 
initiative that seeks to allow patients to 
retrieve their own records electronically! and 
the growing movement by patients advocating 
guaranteed access to their own data.!4 


13 Chopra A, Park T, Levin PL. (2010) ‘Blue Button’
Provides Access to Downloadable Personal Health
Data. Retrieval 29 Oct 2018:
http://www.whitehouse.gov/blog/2010/10/07/blue-button-provides-access-downloadable-personal-health-data

14 Miliard M. (2014) Patients want online access to
records. Healthcare IT News (5 May 2014).
Retrieval 29 Oct 2018:
https://www.healthcareitnews.com/news/patients-want-online-access-records


Advocates argue that patient control, in
addition to being an effective approach to privacy,
could also serve to ensure ongoing, consistent
healthcare stakeholder participation.
Of course, in order for this approach to be 
practical, the rights of patients to electronic 
copies of their records under HIPAA would 
need to be universally enforced. Such enforcement
has to date been inconsistent and, until
recently, depended exclusively on the Office
for Civil Rights at DHHS (since patients do
not have a private right of action). Under
HITECH, state attorneys general may also
bring legal action, which provides another
avenue for improving compliance.
Another option to ensure that providers 
make their electronic records available to 
patients for use in compiling comprehensive 
records would be to create a linkage to reim- 
bursement for care. In this scenario, each pro- 
vider would be required to offer to deposit the 
new information generated from an encounter 
in a place of the patient’s choice. In cases 
where the patients designated a destination 
for their information, payment for the care 
received would be contingent on the deposit 
of the required data. While this may seem 
somewhat coercive and even radical, it is con- 
sistent with practices in other service indus- 
tries. Whether it be car repair, plumbing, or 
legal services, payment is nearly always con- 
tingent on the client receiving detailed justifi- 
cations and descriptions of the services 
provided. While a growing number of healthcare
providers now offer such
“visit summaries,” at least in paper form, this
remains the exception in the medical domain.


15.4.3 Ensuring Information 
in Standard Electronic Form 


It is self-evident that the electronic exchange 
of health information cannot occur if the 
information itself is not in electronic form. 
Over the past decade, nearly all U.S. hospitals 
and most U.S. physicians have adopted EHRs 
as a result of the Federal government’s 
“Meaningful Use” financial incentives. 
However, the major obstacle for physician 
adoption of EHRs has not been merely cost, 
as is often cited, but the very unfavorable 
ongoing cost/benefit ratio. Most of the benefits
of EHRs in physician offices accrue not to
the physician, but to other stakeholders. In 
one study, 89% of the economic benefit was 
attributed to other stakeholders (Hersh 2004). 
It is unreasonable to expect physicians to 
shoulder 100% of the cost of systems while 
accruing only 11% of the benefits. Even as 
physician EHR adoption levels have increased, 
there have been increasing complaints about 
the burden that EHRs impose on physicians 
(Sinsky et al. 2016). 

It is important to note that EHRs alone, 
even if adopted by all healthcare providers, 
are a necessary but not sufficient condition for 
achieving HII. Indeed, each EHR simply con- 
verts an existing paper “silo” of information 
to electronic form. These provider-based systems
each manage one provider’s information on a
patient, but none holds all the
information for that patient. To achieve the
goal of availability of comprehensive patient 
information, there must also be an efficient 
and cost-effective mechanism to aggregate the 
scattered records of each patient from all their 
various providers. Such aggregation also 
requires effective standards for encoding and 
communicating EHR information. Major 
gains in quality and efficiency of care will be 
attainable only through HII that ensures the 
availability of every patient’s comprehensive 
record when and where needed. 


15.4.4 Financial Sustainability 


There are three fundamental approaches that 
can be used individually or in combination to 
provide long-term financial sustainability for 
HII: (1) public subsidy; (2) leveraging antici- 
pated future healthcare cost savings; and/or 
(3) leveraging new value created. The first 
approach has been advocated by those who 
assert, with some justification, that HII repre- 
sents a public good that benefits everyone. 
They compare HII to other publicly available
infrastructure, such as roads, and suggest that 
taxation is an appropriate funding mecha- 
nism. Of course, new taxes are consistently 
unpopular and politically undesirable, and 
other key infrastructures such as public utilities
and the Internet, although regulated, are
funded through user fees rather than taxation. 
Note, however, that at least two U.S. states 
(Maryland and Vermont) are using this mech- 
anism to help fund their HII. 

The most common approach suggested for 
long-term HII sustainability is leveraging 
anticipated healthcare cost savings. This is 
based on the substantial and growing body of 
evidence that the availability of more compre- 
hensive electronic patient records to providers 
results in higher quality and lower cost care 
(AHRQ 2006; Menachemi et al. 2018). Some 
of the best examples include large, mostly 
closed healthcare systems such as Kaiser, 
Group Health and the Veterans Administra- 
tion, where the availability of more complete 
patient records in electronic form over time 
has been consistently associated with both cost 
savings and better care. While the case for HII 
reducing healthcare costs is compelling, the 
distribution and timing of those savings are difficult
to predict. In addition, cost savings to
the healthcare system mean revenue losses to
one or more stakeholders, clearly an undesirable
result from their perspective. Finally, the
allocation of savings for a given population of 
patients is unknown, with the result that orga- 
nizations are reluctant to make specific finan- 
cial commitments that could be larger than 
their own expected benefits. 

The final but least frequently mentioned 
path to financial sustainability of HII is uti- 
lizing the new value created by the availabil- 
ity of comprehensive electronic information. 
While it is widely recognized that this infor- 
mation will be extremely valuable for a wide 
variety of purposes, this option has 
remained largely unexplored. One example 
of such new value is the potential reduction 
in cost for delivering laboratory results to 
ordering physicians. The expenses borne by 
individual laboratories for their own infra- 
structure providing this essential service can 
be greatly reduced by a single uniform com- 
munity infrastructure providing electronic 
delivery to physicians through one mechanism.
Another example is the availability of
medical information for research, both to
find eligible subjects for clinical trials and to
utilize the data itself for research queries.
While this latter application has the poten- 
tial to defray a substantial portion of the 
costs of HII, it requires efficient mecha- 
nisms for both searching data and recording 
and maintaining patient consent that have 
not generally been incorporated into HII 
systems. 

Perhaps the most lucrative HII revenue 
source lies in the development of innovative 
applications that rely on the underlying infor- 
mation to deliver compelling value to con- 
sumers and other healthcare stakeholders. For 
example, HII allows the delivery of timely and 
accurate reminders and alerts to patients for 
recommended preventive services, needed 
medication refills, and other medically related 
events of immediate interest to patients and 
their families. It also would allow deployment 
of applications that assist consumers auto- 
matically with management of their chronic 
diseases. Utilizing new value to finance HII 
avoids the prediction and allocation problems 
inherent in attempts to leverage expected 
healthcare cost savings, with the added incen- 
tive that any such savings would fully accrue 
to whoever achieves them. 


15.4.5 Community Focus 


Most observers believe that successful HII 
must be focused on the community. An essen- 
tial element in HII is trust, which is inherently 
local. Furthermore, health care itself is pre- 
dominantly local, since the vast majority of 
medical care for residents of a given commu- 
nity is provided in that community. Indeed, 
people traveling away from home who are 
injured or become ill inevitably will return 
home at their earliest opportunity if their con- 
dition permits (and does not resolve quickly). 
Since medical care is predominantly local, 
creating a system that delivers comprehensive 
electronic patient information in a community 
solves the overwhelming majority of information
needs in that community. While movement
of health information over long
distances has some value and ultimately must 
be addressed to assure completeness of 
records, its contribution to a total solution is 
marginal. 

The lack of any examples of working HII 
in communities larger than about 10 million 
people provides additional evidence of the 
need for local focus. Keeping the scope of 
such projects relatively small also increases 
their likelihood of success by reducing com- 
plexity, thereby avoiding the huge increases in 
failure rates of extremely large-scale IT proj- 
ects. This rule of thumb is reinforced by the 
relatively small populations of countries that 
have successfully implemented effective HII 
such as Finland (5.5 million) and Estonia (1.3 
million). 

In thinking about HII, analogies are often 
made to the international financial system 
that efficiently transfers and makes funds 
available to individuals anywhere in the 
world. However, it is often forgotten that 
these financial institutions, which are also
heavily dependent on trust, began as “building
and loan funds” in very small communities
designed to share financial resources
among close neighbors. It took many decades 
of building trust before large-scale national 
and international financial institutions 
emerged. 


15.4.6 Governance 
and Organizational Issues 


Trust is arguably the most important element 
in considering the appropriate governance for 
HII. Even in a system where patients exert full 
control over their own records, the organiza- 
tion that operates the HII must earn the full 
faith and confidence of consumers for the 
security, integrity, and protection of the 
records, as well as ensuring that records are 
appropriately available only for purposes that 
consumers specify. Furthermore, the organization
ideally must be devoid of any biases or
hidden agendas that would favor one category 
of healthcare stakeholders over another, or 
favor specific stakeholders within a given cat- 
egory. 

None of the existing healthcare stakehold- 
ers seem well suited to meet the trust require- 
ment. Many argue that government cannot 
operate an HII because it is inherently not 
trusted with sensitive personal records, and 
furthermore needs to assume the role of pro- 
viding regulatory oversight for whatever orga- 
nization does take the HII responsibility. 
Similarly, it seems problematic for employers 
to be responsible for the HII since one of the 
primary concerns of consumers is to avoid dis- 
closing sensitive medical information to their 
employers. Health plans and insurers are typi- 
cally not trusted by consumers because their 
incentives are not aligned — they have a finan- 
cial incentive to deny care, which is a natural 
concern to consumers. Hospitals are in com- 
petition with each other and therefore are not 
in a good position to cooperate in a long-term 
HII effort. Physicians are the most trusted 
healthcare stakeholders, from a consumer per- 
spective, but are not organized in a way to 
facilitate the creation of HII. Furthermore, 
they are also in competition with each other 
and, most importantly, do not generally have 
the informatics capabilities necessary for such 
a complex endeavor. 

Therefore, many believe that an indepen- 
dent (perhaps entirely new) organization is 
needed to operate HII in communities. This 
organization would have no direct connec- 
tions to existing healthcare stakeholders and 
therefore would be unbiased. Its sole function 
would be to protect and make available com- 
prehensive electronic patient records on 
behalf of consumers. Such an independent 
organization would also ideally facilitate 
cooperation among all existing stakeholders, 
who would know that the HII activity was 
completely neutral and designed primarily to 
serve consumers. 


15.5 Architecture for HII


15.5.1 Institution-Centric
Architecture


Initially, most developing HII systems chose 
an institution-centric approach to data storage,
leaving patient records wherever they are
created (Fig. 15.2). Although records are
not stored centrally in this model, there is a 
need to maintain at least a central index of 
where information can be found for a particu- 
lar patient; without such an index, finding 
information about each patient would require 
queries to every possible source of medical 
information worldwide, clearly an impractical
approach. When a given patient’s record is
requested, the index is used to generate que- 
ries to the locations where information is 
stored. The responses to those queries are 
then aggregated (in real time) to produce the 
patient’s complete record. After the patient 
encounter, the new data is entered into the cli- 
nician’s EHR system and another pointer (to 
that system) is added to the index so it will be 
queried (in addition to all the other prior 
locations) next time that patient’s record is 
requested. 
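
The index-and-query cycle described above can be sketched in a few lines. This is a minimal illustration only: the class names `SourceEHR` and `CentralIndex`, the function names, and the sample entries are all hypothetical, and a real HIE would add patient matching, security, and network communication.

```python
from collections import defaultdict


class SourceEHR:
    """Hypothetical stand-in for one provider's record system."""

    def __init__(self, name):
        self.name = name
        self.records = defaultdict(list)  # patient_id -> list of entries

    def query(self, patient_id):
        return list(self.records[patient_id])

    def store(self, patient_id, entry):
        self.records[patient_id].append(entry)


class CentralIndex:
    """Holds only pointers to systems that have data, never the records."""

    def __init__(self):
        self.locations = defaultdict(set)  # patient_id -> set of system names

    def add_pointer(self, patient_id, system_name):
        self.locations[patient_id].add(system_name)

    def lookup(self, patient_id):
        return sorted(self.locations[patient_id])


def assemble_record(index, systems, patient_id):
    """Query every indexed location and aggregate the responses."""
    record = []
    for name in index.lookup(patient_id):
        record.extend(systems[name].query(patient_id))
    return record


def record_encounter(index, systems, system_name, patient_id, entry):
    """New data stays in the clinician's EHR; only a pointer goes to the index."""
    systems[system_name].store(patient_id, entry)
    index.add_pointer(patient_id, system_name)


systems = {n: SourceEHR(n) for n in ("clinic_a", "hospital_b")}
index = CentralIndex()
record_encounter(index, systems, "clinic_a", "p1", "2018-01-05 lipid panel")
record_encounter(index, systems, "hospital_b", "p1", "2018-03-10 ER visit")
print(assemble_record(index, systems, "p1"))
# ['2018-01-05 lipid panel', '2018-03-10 ER visit']
```

Note that every call to `assemble_record` touches every source system listed for the patient, which is the property the following paragraphs examine.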

While this architecture appeals to health- 
care stakeholders because they continue to 
“control” the records they generate, one can 
argue that it fails to meet several key requirements,
does not scale effectively, and is complex
and expensive to operate. The most
critical requirement that is not addressed by 
this architecture is searching the data, e.g., to 
find all patients with a cholesterol level above 
300. To do such a search, the records of every 
patient must be assembled from their various 
locations and examined one at a time. Known 
as a sequential search, it has a very long com- 
pletion time that increases linearly with the 
size of the population. For example, in a 
modest-sized HIE with 500,000 patients, 
assuming retrieval and processing of each 
patient’s records requires just 2 seconds (a 
very low estimate), each such search would 
take nearly 12 days (1 million seconds).
Furthermore, every such search would require 
that each provider record system connected to 
the HIE retrieve and transmit all its informa- 
tion — a very substantial computing and com- 
munications burden (that also increases the 
risk of interception of information). In stan- 
dard database systems, impractical sequential 
search times are reduced by pre-indexing the 
contents of the records. However, such pre- 
indexing would in essence create a central 
repository of indices that could be used to 
reconstruct most of the original data, and 
therefore is inconsistent with this architec- 
tural approach. 
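
The sequential-search estimate above is simple arithmetic, reproduced here as a small sketch. The function name is hypothetical; the inputs (500,000 patients, 2 seconds per patient) are the figures given in the text.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sequential_search_days(num_patients, seconds_per_patient):
    """Completion time grows linearly with the size of the population."""
    return num_patients * seconds_per_patient / SECONDS_PER_DAY


print(f"{sequential_search_days(500_000, 2):.1f} days")    # 11.6 days
print(f"{sequential_search_days(1_000_000, 2):.1f} days")  # 23.1 days
```

Doubling the population doubles the completion time, which is why the approach does not scale.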

Fig. 15.2 Institution-centric HII architecture. (1)
The clinician EHR requests prior patient records from
the HIE; this clinician’s EHR is added to the index for
future queries for this patient (if not already present).
(2) Queries are sent to EHRs at all sites of prior care
recorded in the HIE Index. (3) EHRs at each prior site
of care return records for that patient to the HIE; the
HIE must wait for all responses. (4) The returned
records are assembled and sent to the clinician EHR;
any inconsistencies or incompatibilities between records
must be resolved in real time. (5) After the care episode,
the new information is stored in the clinician EHR only.
(Used with permission of Health Record Banking Alliance
(HRBA))

It may be argued that the searches could
themselves be distributed to the provider systems,
and then the results aggregated into a
coherent result. However, this approach also
fails because individual patient records are 
incomplete in each system. Therefore, searches 
that require multiple items of patient data 
(e.g., patients with chest pain who have taken 
a certain medication in the past year), will 
produce anomalous results unless all the 
instances of the relevant data for a given 
patient are in a single provider system (i.e., if 
one system finds a patient with chest pain, but 
without any indication of the medication of 
interest [which is in another provider’s sys- 
tem], that patient will not be reported as satis- 
fying the conditions) (Weber 2013). It is 
possible to launch multiple searches each lim- 
ited to a single criterion and then combine the 
results from each to generate a correct result. 
However, this would require multiples of the 
completion time for a single criterion (e.g., 
12 days × 2 = 24 days for the two-criteria
example), making the retrieval times and pro- 
cessing burdens even more untenable. 
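
The anomaly described above can be reproduced with a toy data set. In this sketch the system names, patient identifiers, and the labels `chest_pain` and `drug_x` are all invented for illustration; each provider system holds only its own fragment of each patient's data.

```python
# Hypothetical fragments: patient p1's chest pain and medication are
# recorded in two different provider systems.
systems = {
    "clinic_a": {"p1": {"chest_pain"}, "p2": {"chest_pain", "drug_x"}},
    "clinic_b": {"p1": {"drug_x"}, "p3": {"drug_x"}},
}


def distributed_conjunctive(systems, criteria):
    """Each system applies ALL criteria to its own fragment only."""
    hits = set()
    for records in systems.values():
        for pid, facts in records.items():
            if criteria <= facts:
                hits.add(pid)
    return hits


def per_criterion_then_combine(systems, criteria):
    """One distributed search per criterion; intersect the patient sets."""
    result = None
    for criterion in criteria:
        matched = {
            pid
            for records in systems.values()
            for pid, facts in records.items()
            if criterion in facts
        }
        result = matched if result is None else result & matched
    return result or set()


criteria = {"chest_pain", "drug_x"}
print(sorted(distributed_conjunctive(systems, criteria)))     # ['p2']
print(sorted(per_criterion_then_combine(systems, criteria)))  # ['p1', 'p2']
```

The first search misses p1 because no single system holds both facts; the second finds p1, but only by running one full distributed search per criterion.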

Fig. 15.3 Example of a Network Operations Center
(NOC). (Image used by permission of NTT Ltd
(available at https://www.gin.ntt.net/support-center/noc/))

In addition to the scaling issues for this
architecture related to searching, there is also
a problem with response time for assembling a
patient record. When a given patient record is
requested, the locations where the patient has
available records are found using the central
index. Then, a query-response cycle is required
for each location where patient records are
available. Following completion of the query-response
cycles, all the information obtained
must be integrated into a comprehensive
record and made available to the requestor.
While the query-response cycles can all be 
done in parallel, the final integration of results 
must wait for the slowest response. As the 
number of connected systems increases, so 
does the probability of a slow (or absent) 
response from one of them when queried for 
patient records. In addition, more systems 
mean more processing time to integrate mul- 
tiple sources of information into one coherent 
record. Thus, the response time will become 
slower as the number of connected systems 
increases (Lapsia et al. 2012). 
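
A rough sketch of why response time degrades: with parallel queries the wait is set by the slowest responder, and the chance that at least one system is slow rises with the number of systems queried. The 1% per-system probability and the assumption that systems behave independently are illustrative assumptions, not figures from the cited study.

```python
def assembly_wait(response_times):
    """Parallel queries finish only when the slowest response arrives."""
    return max(response_times)


def prob_any_slow(n_systems, p_slow):
    """Chance that at least one of n independent systems is slow or absent."""
    return 1 - (1 - p_slow) ** n_systems


print(assembly_wait([0.2, 0.5, 3.0]))  # 3.0
for n in (1, 10, 100):
    print(n, f"{prob_any_slow(n, 0.01):.1%}")
# 1 1.0%
# 10 9.6%
# 100 63.4%
```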

The institution-centric architecture also 
introduces high levels of operational com- 
plexity. Since the completeness of retrieval of 
a given patient’s records is dependent on the 
availability of all the systems that contain 
information about that patient, ongoing real- 
time monitoring of all connected information 
sources is essential. This translates into a
requirement for a 24 × 7 network operations
center (NOC) that constantly monitors the
operational status of every medical information
system and is staffed with senior IT personnel
who can immediately troubleshoot and
correct any problems detected (Fig. 15.3).
Even with modest system failure rates (e.g.,
one per thousand), a community with thousands
of EHRs will typically have a handful
of systems that are unresponsive to queries
for patient records and require immediate
expert attention to restore to full operation.
The cost of this around-the-clock monitoring
is very substantial, since a staff of at least five 
full-time network engineers is required to 
assure that at least one person is always avail- 
able for every shift 7 days a week. 
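
Both figures in this paragraph follow from simple arithmetic, sketched below. The community size of 5,000 EHR systems and the 40-hour work week are assumptions chosen for illustration; the one-per-thousand failure rate is the example given in the text.

```python
import math

HOURS_PER_WEEK = 24 * 7  # 168 hours of coverage needed


def expected_unresponsive(n_systems, failure_rate):
    """Average number of systems down at any given moment."""
    return n_systems * failure_rate


def min_staff_for_continuous_coverage(hours_per_person_per_week=40):
    """Fewest full-time staff that can keep one person on duty at all times."""
    return math.ceil(HOURS_PER_WEEK / hours_per_person_per_week)


print(expected_unresponsive(5_000, 0.001))   # 5.0
print(min_staff_for_continuous_coverage())   # 5
```

The staffing figure is a floor: it ignores vacations, sick leave, and shift overlap, all of which push the real number higher.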

Adding to the cost of the NOC, every 
EHR system in an institution-centric model 
must incur additional expenses to always be 
able to respond to queries in real-time. This 
will be extremely problematic for physician 
offices, since their EHR systems will need to 
operate 24 × 7 and include additional hardware,
software, and telecommunications
capabilities to simultaneously support such
queries while also serving their local users.
Clearly, the transaction volumes generated 
will be substantial, since each patient’s records 
will be queried whenever they receive care at 
any location. Contrast this to a central repos- 
itory model where the information from a 
care episode is transmitted once to the reposi- 
tory and no further queries to the source sys- 
tems are ever needed. This analysis has been 
confirmed by a simulation study of the 
institution-centric architecture demonstrat- 
ing that both the transaction volume and 
probability of incomplete records (from miss- 
ing data due to a malfunctioning network 
node) increase exponentially with the average 
number of sites where each patient’s data is 
located (Lapsia et al. 2012). 


15.5.2 Patient-Centric Architecture 
(Health Record Banking) 


Health record banking is a patient-centric
approach to developing community HII that
both addresses the key requirements and can
overcome the challenges that have stymied
current efforts¹⁵ (Yasnoff et al. 2013). A health
record bank (HRB) is defined as “an independent
organization that provides a secure electronic
repository for storing and maintaining
an individual’s lifetime health and medical
records from multiple sources and assuring
that the individual always has complete control
over who accesses their information”
(Detmer et al. 2008).


15 Yasnoff WA. (2006) Health Record Banking: A
Practical Approach to the National Health
Information Infrastructure. Retrieval 29 Oct 2018:
http://nhiiadvisors.com/slides/Health%20Rec%20Banking.html

Using a community HRB to provide 
patient information for medical care is 
straightforward (Fig. 15.4). Prior to seeking
care (or at the time of care in an emergency),
the patient gives permission for the
caregiver to access his/her HRB account 
records (either all or part) through a secure 
Internet portal. The provider then accesses 
(and optionally, downloads) the records 
through a similar secure web site. When the 
care episode is completed, the caregiver then 
transmits any new information generated to 
the HRB to be added to the account-holder’s 
lifetime health record. The updated record is 
then immediately available for subsequent 
care. 
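
The permission, access, and deposit steps just described can be sketched as a toy in-memory repository. The class `HealthRecordBank`, its method names, and the identifiers are hypothetical; partial-record permissions, authentication, auditing, and encryption are deliberately omitted, and letting any caregiver deposit without read permission is an assumption of this sketch.

```python
class HealthRecordBank:
    """Toy in-memory HRB illustrating patient-controlled access."""

    def __init__(self):
        self._records = {}    # patient_id -> list of record entries
        self._permitted = {}  # patient_id -> set of provider ids

    def open_account(self, patient_id):
        self._records.setdefault(patient_id, [])
        self._permitted.setdefault(patient_id, set())

    def grant_access(self, patient_id, provider_id):
        self._permitted[patient_id].add(provider_id)

    def revoke_access(self, patient_id, provider_id):
        self._permitted[patient_id].discard(provider_id)

    def fetch(self, patient_id, provider_id):
        """Return the full record, but only to a provider the patient allowed."""
        if provider_id not in self._permitted[patient_id]:
            raise PermissionError(f"{provider_id} has no access to {patient_id}")
        return list(self._records[patient_id])

    def deposit(self, patient_id, entry):
        """Add new information from a care episode to the lifetime record."""
        self._records[patient_id].append(entry)


bank = HealthRecordBank()
bank.open_account("patient-1")
bank.deposit("patient-1", "2017 lipid panel")
bank.grant_access("patient-1", "dr-lee")          # patient gives permission
print(bank.fetch("patient-1", "dr-lee"))          # ['2017 lipid panel']
bank.deposit("patient-1", "2018 office visit")    # caregiver sends new data
print(bank.fetch("patient-1", "dr-lee"))
# ['2017 lipid panel', '2018 office visit']
```

Unlike the institution-centric model, a single `fetch` call returns the whole record without contacting any source system.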


Fig. 15.4 Patient-centric HII architecture. (1) The
clinician EHR requests prior patient records from the
HRB. (2) The prior patient records are immediately sent
to the clinician EHR. (3) After the care episode, the new
information is stored in the clinician EHR and sent to
the HRB; any inconsistencies or incompatibilities with
prior records in the HRB need to be resolved before that
patient’s records are requested again (but not in real
time). (Used with permission of Health Record Banking
Alliance (HRBA))


History of HRBs

The health record banking concept has been
evolving for over two decades since it was initially
proposed (Szolovits et al. 1994). The
term “health information bank” was introduced
in 1997 in the U.K. (Dodd 1997), and
was subsequently described as the “bank of 
health” (Ramsaroop and Ball 2000). A legal 
analysis of the implications of a “health 
record trust” was published in 2002 (Kostyack 
2002), an Italian system known as the “health 
current account” was described in 2004 
(Saccavini and Greco 2004), and the “health 
record bank” concept was described by Dyson 
in 2005 (Dyson 2005). In 2006, a Heritage 
Foundation policy paper endorsed health 
record banking,¹⁶ additional papers described
HRBs in more detail (Ball and Gold 2006;
Shabo 2006), the non-profit Health Record
Banking Alliance was formed,¹⁷ the State of
Washington endorsed the concept after a
16-month study,¹⁸ and the non-profit Dossia
consortium was formed by several large
employers to implement and operate an HRB
for their employees. In 2007, the Information
Technology and Innovation Foundation recommended
that the health record banking
approach be used to build the U.S. HII,¹⁹
while Gold and Ball described the “health
record banking imperative” (Gold and Ball
2007). That same year, both Microsoft and
Google introduced patient-controlled medical
record repositories. In 2009, three pilot HRBs
were funded by the State of Washington and
the role of HRBs in protecting privacy was
described (Kendall 2009). The HRB concept, 
although not always named as such, started 
appearing with greater frequency in articles 
discussing the need for comprehensive EHRs 
(Steinbrook 2008; Mandl and Kohane 2008; 
Kidd 2008; Miller et al. 2009; Krist and Woolf 
2011). 

More recently, discussion and activities
related to HRBs have accelerated and
expanded further. Multiple articles have
been published advocating for patient control
of their own records and considering the
issues involved²⁰ (Kish and Topol 2015; Haun
and Topol 2017). An entire supplement to the
Journal of General Internal Medicine was
devoted exclusively to this topic (JGIM 2015).
Even a White House Senior Advisor and the
Administrator of CMS have joined the chorus
touting the advantages of patient control of
their own records.²¹ In addition, at least two
publications have discussed using the value of
the information to facilitate sustainability²²
(Porter 2018).

There is also a continuing stream of articles
describing HRBs and advocating for their
establishment and use. These include the
rationale for HRBs (Yasnoff et al. 2013),
potential HRB use in public health (Yasnoff
et al. 2014), and a description of lessons
learned from an early HRB startup (Yasnoff
and Shortliffe 2014). Endorsement of the
HRB concept (albeit referenced under various
different terms) has come from a variety of
sources including a prominent observer of the
digital transformation of health care (Mikk
et al. 2017), a policy think tank,²³ a former
ONC Director,²⁴ and a leading market
research firm.²⁵ Even the current CMS
Administrator has been openly advocating for
lifetime, longitudinal patient records accessible
to and controlled by patients.²⁶

Several countries have established successful
HRBs, including Finland,²⁷ Estonia,²⁸
and Brazil.²⁹ In addition, a number of startup
companies such as Project Hugo,³⁰
CareDox,³¹ Patients Know Best,³²
Betterpath,³³ and Ciitizen³⁴ (as well as a number
of others) are pursuing the challenge of
developing HRBs. Even the tech giant Apple
appears to be moving towards HRB implementation
with its Apple HealthKit serving
as the infrastructure for a patient record
repository controlled by individuals on their
own smartphones.³⁵


How Requirements Lead to HRB
Architecture

Figure 15.5 shows how the HII requirements
for complete records, low cost, and
high benefits lead directly to the HRB architecture.
In order to ensure complete records,
information from each patient encounter
must be available. The only currently available
mandate for this is for the patient to request
records (in digital form) from their provider,
invoking their rights under the HIPAA Privacy
Rule. In order for patients to feel comfortable
doing that, they must trust the system.
The first element of trust is the architectural
end point of security; the patients must know
that the records being sent from their provider
will be protected from improper use. This also
leads to the architectural end point of patient
control of all access to the records; how can it
be justified to patients for some other entity to
decide which records to release to whom and
when? In order to assure feasible patient
access control, the records must be stored in a
repository (the third and final architectural
end point) so that patients have a single point
for indicating and revising their record access
permissions.


Fig. 15.5 How requirements lead to HRB
architecture. This diagram shows how the key
requirements of complete records, low cost, and high
benefits lead directly to the architectural specifications
of patient access control, a repository for data storage,
and trustworthy security that are characteristic of
health record banks

The requirement for low cost means that 
the architecture must be simple. The most 
straightforward architecture for compiling 
and accessing complete records for each per- 
son is a repository — it is easy (and therefore 
inexpensive) to implement and operate. 

Finally, the requirement for high value 
leads to the intermediate requirement to be 
able to search records across the population. 
It is those searching operations that provide 
critical information of great value for public 
health, medical research, quality improve- 
ment, and policy. Such searchability is much 
more easily achieved when the records are in a 
repository. 

Therefore, the architecture needed for a 
low cost, high benefit HII with complete 
records is a secure repository with record 
access controlled by patients. These are the 
exact characteristics of health record banks, 
which were conceived and designed to specifi- 
cally address these key HII requirements. 


Patient Control Ensures Privacy and
Stakeholder Cooperation

In an HRB, everything is done with consumer 
consent, with account-holders controlling 
their copy of all their records and deciding 
who gets to see any or all of it. This protects 
privacy (since each consumer sets their own 
customized privacy policy), promotes trust, 
and ensures stakeholder cooperation since all 
holders of medical information must provide 
it when requested by the patient (Kendall 
2009). Of course, the operations of an HRB 
must be open and transparent with indepen- 
dent auditing of privacy practices. World- 
class state-of-the-art computer security is 
needed to protect the HRB, which will be a 
natural target for hackers. At least one new 
security approach is now available that abso- 
lutely prevents large-scale data loss from a 
repository of medical records (Yasnoff 2016), 
the key security requirement for an HRB (see 
subsection on HRB Security below). 

Natural concerns arise from the ability of 
the patient to suppress any or all of their 
HRB account information, which could lead 
to misdiagnosis and dangerous treatment. 
This capability could be abused by patients 
who, for example, may seek multiple prescriptions
for controlled substances for the purpose
of diversion for illegal sale. With respect
to the possibility of medical errors resulting 
from incomplete information, the patient 
would be clearly and unmistakably warned 
about this when choosing not to disclose any 
specific information (e.g., “Failure to disclose 
any of your medical information may lead to 
serious medical problems, including your 
death”). The expectation is that few people 
will choose to do this, particularly after such a 
warning. However, as noted earlier, 13-17% 
of patients already engage in this practice, 
leading many observers to conclude that the 
general public may not be comfortable with a 
system that provides easy access to their 
records unless they are in control of such 
access. This issue ultimately becomes one of 
public policy and may also be a subject of dis- 
cussion between the doctor and the patient 
(i.e., the doctor will want to be assured by the
patient that all information is being provided). 
Clearly, physicians should not be liable for the 
consequences of the patient’s choice to with- 
hold information. 

With respect to patients who use their 
power to hide information as a way to facili- 
tate improper or illegal activity, there is clearly 
an overriding public policy concern. For 
example, in the case of controlled substances, 
it may be necessary to report to the physician 
(or, if legislatively mandated, to regulatory 
authorities) whenever a patient suppresses 
any information about controlled substance 
prescriptions. The information itself would 
still be under the patient’s control, but the 
physician would be alerted with a notice such 
as “some controlled substance prescription 
information has been withheld at the patient’s 
request.” There may be other situations where 
such warnings are needed. 
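
The withholding notice described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Record` type, the `controlled_substance` flag, and the `build_disclosure` function are hypothetical names chosen for this example and are not part of any HRB specification.

```python
from dataclasses import dataclass

# Wording of the notice quoted in the text above.
WITHHELD_NOTICE = ("some controlled substance prescription "
                   "information has been withheld at the patient's request")


@dataclass(frozen=True)
class Record:
    """One entry in a patient's HRB account (hypothetical structure)."""
    record_id: str
    description: str
    controlled_substance: bool = False


def build_disclosure(records, withheld_ids):
    """Return (visible_records, notices) for one clinician request.

    Withheld records are never returned. The clinician only learns
    that some controlled-substance information is missing.
    """
    visible = [r for r in records if r.record_id not in withheld_ids]
    hidden = [r for r in records if r.record_id in withheld_ids]
    notices = []
    if any(r.controlled_substance for r in hidden):
        notices.append(WITHHELD_NOTICE)
    return visible, notices


# Illustrative account contents (invented for this sketch).
account = [
    Record("r1", "Lisinopril 10 mg daily"),
    Record("r2", "Oxycodone 5 mg as needed", controlled_substance=True),
    Record("r3", "Influenza vaccination"),
]
visible, notices = build_disclosure(account, withheld_ids={"r2"})
```

In this sketch the content of the withheld record stays under the patient's control, while the notice alone reaches the physician. Withholding a record that is not flagged as a controlled substance produces no notice.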


= Assuring that Standardized, Encoded 
Electronic Patient Information Is 
Consistently Deposited 

HRBs can provide ongoing incentives for
deposits of EHR records by clinicians, which
would help to offset the unfavorable cost/
benefit of EHRs for office-based providers. 
These incentives could be paid on a per- 
encounter or per-month basis. In addition, 
those few physicians who do not currently 
have EHRs could receive no-cost Internet- 
accessible EHR systems (at HRB expense) 
with the understanding that information 
from patient encounters would be automati- 
cally transferred to the HRB. Another option 
is to link reimbursement for medical services 
to HRB deposits — i.e., providers would not 
be paid unless the medical record informa- 
tion generated from those services is trans- 
mitted to an HRB. This makes sense 
economically, as the value of medical ser- 
vices is greatly limited if the information 
about patients is not readily available for 
their ongoing care. 

Incentives for HRB deposits also serve to 
ensure compliance with data standards, both 
initially and on an ongoing basis. Clearly, any 
EHR provided through the HRB would, by 
definition, transmit information back to the 
HRB in a standard format (since the HRB 
would only provide systems that can do so). 
For physicians who already have EHRs, HRB 
reimbursements for deposits from those sys- 
tems would naturally require complete and 
fully encoded encounter data using estab- 
lished standards to be sent to the HRB. Over 
time, higher levels of encoding of medical 
information can be promoted through the 
gradual introduction of more stringent stan- 
dards requirements (with plenty of lead time 
to allow for system upgrades). Compliance 
with such changes in standards could also be 
further assured through a direct relationship 
to reimbursement. 


= HRB Business Model 

Health record banking has advantages on 
both the cost and revenue sides of the busi- 
ness model; the cost is lower and the revenue 
opportunities greater. Because of the lower 
operating costs and additional functionality
for searching records, one can envision a variety
of business models for HRBs that do not
depend on public subsidies or attempt to capture
any healthcare savings, but are solely
funded through new value created for consumers
and other stakeholders.³⁶

Due to the simplicity of HRB operations, 
the cost is substantially less than an equiva- 
lent institution-centric architecture. For an 
HRB, providing access at the point of care 
only involves a single retrieval from the bank’s 
repository of records. In an institution-centric 
model, the records for a given patient are 
located at an arbitrary number of dispersed 
sites, and must be assembled in real-time and 
integrated into a comprehensive record before 
they can be used for patient care. Not only is 
this process of assembly complex, time- 
consuming, and prone to error, it necessitates, 
as noted above, the creation of a fully staffed 
24 x 7 NOC to monitor the availability of all 
information sources as well as troubleshoot 
and correct those that are malfunctioning. 

The estimated cost for the NOC in an 
institution-centric model is substantial. For 
example, given a population of 1,000,000, at 
least 1,000 systems would need monitoring 
(one for every 1,000 patients). Assuming a 
reasonable failure rate for fully functional 
query connectivity to each system of once/ 
year (representing a mean time between fail- 
ures [MTBF] of over 8,700 hours), there 
would be an average of about 2.74 failures/day or
0.91 failures per 8-hour shift that would need 
troubleshooting attention. A minimum staff 
for the NOC would be one person 24 x 7; 
given 21 shifts/week plus leeway for vacations 
and sick leave, this would require at least 5 
full-time equivalent staff costing about 
$200,000 each including equipment, overhead,
and fringe benefits. Assuming an additional
$500,000/year for hardware and software to
operate the institution-centric system (over
and above the data repository needed for an
HRB) yields an annual cost of $1.5 million or
$1.50/person/year. This would add nearly 20%
for the institution-centric model to the estimated
$8/person/year needed to operate an
HRB (Kaelber et al. 2008).

36 Health Record Banking Alliance. (2012) Health
Record Banking: A Foundation for Myriad Health
Information Sharing Business Models. White Paper
(12 Dec 2012). Retrieval 29 Oct 2018:
http://www.healthbanking.org/uploads/9/6/9/4/9694117/hrba_white_paper_business_model_dec_2012.pdf

Beyond this, the additional costs imposed 
in the institution-centric model for each con- 
nected EHR for additional hardware, soft- 
ware, telecommunications capability, and 
additional operational expenses to maintain 
24 x 7 system availability must also be 
included. Even if such costs were only a very 
modest $1,000/year/system (less than $100/ 
month), this would result in an additional 
$1,000,000 or $1/person/year. Adding this to 
the $1.50/person/year for the NOC gives a 
total estimated cost of $2.50/person/year, 
resulting in over 30% higher costs for the 
institution-centric model than a basic 
HRB. Added to this would be the costs and 
complexity of establishing and maintaining 
data sharing agreements among all the enti- 
ties, which would be substantial. 
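
The cost estimate above can be reproduced with a short calculation. The sketch below uses only the figures stated in the text; the variable names are illustrative, and the division into three shifts per day is taken from the 8-hour shift mentioned above.

```python
# Inputs stated in the text.
population = 1_000_000
patients_per_system = 1_000
systems = population // patients_per_system            # 1,000 systems to monitor

failures_per_system_per_year = 1                       # MTBF of roughly one year
failures_per_day = systems * failures_per_system_per_year / 365
failures_per_shift = failures_per_day / 3              # three 8-hour shifts per day

# Network operations center (NOC): staff plus hardware and software.
noc_staff_fte = 5
cost_per_fte = 200_000                                 # dollars/year, fully loaded
noc_hardware_software = 500_000                        # dollars/year
noc_cost = noc_staff_fte * cost_per_fte + noc_hardware_software
noc_cost_per_person = noc_cost / population

# Extra cost at each connected EHR to stay available 24 x 7.
per_system_overhead = 1_000                            # dollars/year/system
ehr_cost_per_person = systems * per_system_overhead / population

# Comparison with the estimated cost of operating a basic HRB.
hrb_cost_per_person = 8.00                             # dollars/person/year
extra_per_person = noc_cost_per_person + ehr_cost_per_person
extra_fraction = extra_per_person / hrb_cost_per_person

print(f"Failures: {failures_per_day:.2f}/day, {failures_per_shift:.2f}/shift")
print(f"NOC cost: ${noc_cost:,} per year (${noc_cost_per_person:.2f}/person)")
print(f"Added cost: ${extra_per_person:.2f}/person/year "
      f"({extra_fraction:.0%} above the HRB estimate)")
```

Changing any single input, such as the assumed failure rate or the per-system overhead, shows how sensitive the comparison is to that assumption.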

An HRB with comprehensive electronic 
medical records for individuals in a commu- 
nity can generate substantial value (Yasnoff 
and Shortliffe 2014). In addition to empower- 
ing physicians to provide safer, more effective, 
and more efficient care, the availability of 
such records readily enables many types of 
heretofore infeasible yet very desirable ser- 
vices for patients, providers, and other stake- 
holders. 

Perhaps the most compelling example of a 
service for patients is the “peace of mind” 
alert that immediately notifies the patient’s 
loved ones about a critical event, such as an 
emergency room visit. As soon as the emergency
provider accesses the patient’s HRB
record, the alert is sent electronically in a
manner chosen by the recipient. Such a service
is particularly valuable for children of seniors,
who might not otherwise be immediately notified
that their parent is undergoing emergency care.

Another example of a patient service is the 
“preventive care reminder” that indicates to 
the patient recommended preventive interven- 
tions, e.g. a flu shot, customized based on 
demographics and history. Since the reminder 
is based on comprehensive records, it is highly 
likely to be correct, and therefore relevant, to 
the patient. Once the service is obtained, fur- 
ther reminders would cease. In this way, 
patients would not be distracted by irrelevant 
or redundant reminders. 

A third example of a patient service is 
medication refill reminders. These could be 
sent via text message, allowing the patient to 
respond briefly to indicate approval of the 
refill. While a number of pharmacies currently 
provide such services, the HRB reminders 
would be independent of pharmacy, and the 
record of the patient’s permission for such 
communications would only need to be main- 
tained in one place. 

An example of a service for providers is 
the “normal or unchanged lab result” message 
to patients. Every day, providers review results 
for previously ordered tests and communicate 
those results with their recommendations to 
their patients. With an HRB, deposits of new 
lab results that are either normal, or within a 
specified range of a previous result for that 
patient, could be automatically sent to the 
patient over the provider’s signature indicat- 
ing that all is well. It would no longer be nec- 
essary for the provider to take the time to 
review and comment manually on such results, 
which would be a significant productivity 
enhancement. 
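
The routing rule for such messages might look like the following sketch. It is not a clinical algorithm: the `LabResult` type, the function names, and especially the 5% "unchanged" threshold are assumptions made for illustration, since the text only speaks of "a specified range of a previous result."

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LabResult:
    test_name: str
    value: float
    ref_low: float    # lower bound of the reference ("normal") range
    ref_high: float   # upper bound of the reference range


def is_routine(new: LabResult, previous: Optional[LabResult],
               max_relative_change: float = 0.05) -> bool:
    """True if the result is normal, or close to the patient's prior result.

    max_relative_change is a hypothetical threshold; a real system would
    let the provider set it per test.
    """
    if new.ref_low <= new.value <= new.ref_high:
        return True
    if previous is None or previous.value == 0:
        return False
    change = abs(new.value - previous.value) / abs(previous.value)
    return change <= max_relative_change


def route_result(new: LabResult, previous: Optional[LabResult],
                 provider_name: str):
    """Return (destination, message) for a newly deposited lab result."""
    if is_routine(new, previous):
        message = (f"Your {new.test_name} result is normal or unchanged. "
                   f"- {provider_name}")
        return "patient", message
    return "provider_review", f"{new.test_name} result needs manual review."
```

For example, an abnormal value that differs by about 1% from the previous result would be sent directly to the patient, while the same value with no prior result on file would be held for provider review.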

There are many other potentially valuable 
benefits of comprehensive HRB patient 
records, such as the ability to comprehensively 
monitor clinical care for public health, query 
clinical records for research and policy pur- 
poses (with patient permission), and eliminate 
the need for separate registries of clinical 
information (e.g. diabetes, asthma, cancer, 
immunization) since all the data normally 
stored in such registries would already be 
available in the HRB. 


= HRB Security 
It has long been known that centralized data 
storage is the best way to ensure security (Turn 
et al. 1976). The reason for this is clear: dis- 
tributed data is inherently less secure since it 
must be transmitted multiple times for each 
use. However, the single source of all data pro- 
vided by a central database is also an inherent 
weakness; if all the data is accessible all-at- 
once for good purposes, it must necessarily 
also be available for misappropriation and 
misuse. Multiple recent large-scale healthcare 
security breaches (Associated Press 2015; 
Abelson and Goldstein 2015; Reuters 2015) 
validate the risk of centralizing substantial 
amounts of sensitive medical information. 
How can data for each person in a community 
be readily available while ensuring its security? 
The personal grid architecture (Yasnoff 
2016) addresses this difficult problem by stor- 
ing each patient’s information in a separate 
file, separately encrypted with its own strong 
password. The clear advantage of this 
approach is that there is no longer a single 
access point for multiple patient records. 
Indeed, if an adversary somehow obtained a 
complete copy of all the data, it would be of 
very limited value since access to each indi- 
vidual record would require breaking a strong 
encryption key, a very costly and time- 
consuming process. Thus, not only is the data 
protected, but the incentive for hackers to 
obtain the data is largely eliminated. 
However, this approach also has a serious 
drawback with respect to searching records 
across a population. To do this, each record 
must be retrieved, decrypted, and searched in 
sequence, which is prohibitively slow. Indeed, 
the reason relational and other databases are 
used for storage is that these systems “pre- 
index” the data to allow rapid retrieval across 
records. Such “pre-indexing” is not compati- 
ble with the personal grid architecture as it 
would create the single access point to all the 
data that the architecture has deliberately 
eliminated to ensure security, thereby defeat- 
ing the core purpose of the approach. 
Fortunately, the development of cloud 
computing has provided a convenient, acceptable
solution to searching across population
data in the personal grid. Although sequential
searching, which is inherently slow, is still 
required, the searching can be distributed over 
hundreds or even thousands of servers from a 
cloud computing provider. While the amount 
of computing needed remains the same, this 
parallel approach reduces the search time by 
orders of magnitude. It is estimated, for exam- 
ple, that the search time for a population of 5 
million using 500 parallel search servers would 
be just 7 minutes. In the medical environment, 
there are no use cases where response times 
for population searches need to be less than 
an hour (in fact, overnight is usually ade- 
quate), so this searching methodology can 
effectively overcome the requirement for 
sequential search that the personal grid 
imposes to ensure security. 
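
The arithmetic behind this estimate can be made explicit. The sketch below assumes records are divided evenly among servers and that every record takes the same time to retrieve, decrypt, and search; the per-record time is derived here from the 7-minute figure and is not stated in the source. The function names are illustrative.

```python
import math


def parallel_search_minutes(population, servers, seconds_per_record):
    """Wall-clock minutes when each server scans its share sequentially."""
    records_per_server = math.ceil(population / servers)
    return records_per_server * seconds_per_record / 60


def shard(record_ids, servers):
    """Round-robin split of record identifiers into one list per server."""
    return [record_ids[i::servers] for i in range(servers)]


# Figures from the text: 5 million people, 500 servers, about 7 minutes.
population, servers, minutes = 5_000_000, 500, 7

# Per-record time implied by those figures (an inference, not a source value).
implied_seconds_per_record = minutes * 60 / (population / servers)

# The same workload on a single server, for comparison.
single_server_minutes = parallel_search_minutes(
    population, 1, implied_seconds_per_record)

print(f"Implied time per record: {implied_seconds_per_record * 1000:.0f} ms")
print(f"Single server: about {single_server_minutes / 60:.0f} hours")
```

The total computing work is identical in both cases; only the elapsed time changes, which is the point made in the text.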

Some observers have suggested that blockchain,
the increasingly popular secure distributed
ledger methodology,³⁷,³⁸ could be a
useful alternative for securing medical 
records. However, blockchain has a number 
of serious drawbacks for this application: (1) 
Medical records are massively larger than 
financial ledger entries, so the storage require- 
ments for the required copies of the entire 
dataset at each of the multiple blockchain 
nodes would be prohibitive; (2) Every block- 
chain node would receive and have a copy of 
everyone’s medical records for that dataset, 
which unnecessarily increases the risk of a 
security breach even if the records are 
encrypted both in transit and at rest; (3) 
Adding records to the blockchain requires 
extensive computing resources for which 
there is no clear source of compensation; and 
(4) Security for medical records must be 
trusted by the general public, which requires 
that they understand why the records are
secure. The complexity of blockchain and its
dependence on advanced encryption and
game theory methods make it extremely difficult
and time consuming to fully explain to
non-technical audiences, precluding an
informed basis for widespread trust. In view
of these issues, the effective, inexpensive, and
easy-to-understand personal grid architecture
is a better choice to meet the security
requirements for medical record storage.

37 Brakeville S and Perepa B. (2018) Blockchain
basics: Introduction to distributed ledgers. IBM
Developer Tutorial (31 Oct 2018). Retrieval at:
https://developer.ibm.com/tutorials/cl-blockchain-basics-intro-bluemix-trs/

38 Marr B. (2017) A Complete Beginner’s Guide to
Blockchain. Forbes (24 Jan 2017). Retrieval at:
https://www.forbes.com/sites/bernardmarr/2017/01/24/a-complete-beginners-guide-to-blockchain/?sh=187ac78a6e60


= Summary Comparison of Institution- 
Centric and Patient-Centric Architectures 

Table 15.1 summarizes the characteristics
of the institution-centric approach to HII 
compared to the patient-centric (HRB) model. 
The patient-centric model is simpler and more 
straightforward, and deals directly with the 
issue of privacy by putting patients in control 
of their own information. Interoperability is 
much more easily accomplished in the patient- 
centric model since standards compliance can 
be reinforced with financial incentives, and 
reconciliation of inconsistencies between 
records need not be real-time. The patient- 
centric approach is financially sustainable 
with a variety of business models, and can 
provide powerful incentives to clinicians to 
deposit EHR data for their patients. Finally, 
the patient-centric model avoids the substan- 
tial processing burden on clinician EHRs 
from queries each time any patient whose 
record is stored is seen anywhere. In each of 
the categories of requirements, organizational 
issues, cost, operations, and incentives, the 
patient-centric approach has substantial 
advantages. 


15.6 Progress Towards HRBs 


15.6.1 HRB Opposition 


If, as has been clearly elucidated in this chap- 
ter, HRBs are the most effective and efficient 
solution for HII architecture, why hasn’t this 
approach been widely adopted? The short 
answer is that key healthcare stakeholders, 
such as insurers, health plans, and hospitals 
often oppose it (although typically not
openly). However, this is an oversimplification
of the complex incentives involved.

Table 15.1 Comparison of the institution-centric and patient-centric approaches to HII

| Category | Issue | Institution-centric | Patient-centric |
|---|---|---|---|
| Requirements | Privacy | Patient consent difficult to implement; many complex data sharing agreements needed | Simple; patients in control of all access to their own records; consent easy to implement |
| Requirements | Security | Inherently weak because records transmitted multiple times for each use | Very strong using new security techniques that eliminate large-scale data loss from repositories |
| Requirements | Searchability | Impractical to search population data | Search feasible using parallel processing in the cloud |
| Requirements | Completeness | Requires queries to all data sources each time a patient’s records are requested; all must respond for completeness | Comprehensive data available at all times for each patient |
| Organizational issues | Cooperation needed | Extensive; community-wide | Unifying; HIPAA mandates records on patient request |
| Organizational issues | Organizational complexity | High; ongoing collaboration of multiple competing stakeholders necessary | Low; HRB is neutral and independent of all stakeholders |
| Cost | Startup cost | Substantial (due to high complexity) | Moderate |
| Cost | Operational cost | Inefficient/expensive | Efficient/inexpensive |
| Cost | Business model | Complex; no clear approach has emerged; typically requires ongoing subsidies from health care stakeholders | Sustainable using new value of complete information; many options possible funded by patients/payers/purchasers |
| Operations | IT design | Complex, based on “fetch and show”; requires queries to multiple entities, real-time reconciliation of inconsistencies, and NOC | Simple, based on “deposit once to account”; no secondary queries or real-time reconciliation needed; NOC unnecessary |
| Operations | Reliability | Prone to error (record sources unavailable) | Reliable; one operation to retrieve individual patient data |
| Operations | Interoperability | Compliance voluntary | Compliance can be assured with financial incentives |
| Incentives | Clinician incentives | Not included | Easy to include |
| Incentives | Clinician burden | Extensive; incoming query each time current patients seen anywhere increases EHR costs | Minimal; information deposited once in HRB; no incoming queries |

For insurers, their claims data provides the 
most comprehensive picture of individual 
health care service usage. While this informa- 
tion is not sufficiently detailed to facilitate 
clinical management of patients, it gives the 
insurers the “informational advantage” in
negotiations with their customers (employers
that purchase insurance for their employees)
and their payees (physicians, hospitals, and
health plans). HRBs would make much more
detailed information about each patient 
potentially available to all healthcare stake- 
holders, eliminating this informational advan- 
tage. In addition, HRBs have the potential to 
reduce health care costs. While this would 
seem to be positive for insurers (making their 
offerings less expensive to employers), it also 
would reduce their profitability, which is typi- 
cally a percentage of those health care costs. 
For these reasons (and perhaps others), insur- 
ers have worked actively against HRB devel- 
opment, for example, by legislatively derailing 
promising efforts to jump start HRB develop- 
ment in Washington State. Of course, since 
comprehensive patient-controlled electronic 
records are so appealing to the public, the 
opposition of insurers has been behind the 
scenes. The latter is a theme that is common to 
all the stakeholder opposition, as it is difficult 
to openly justify opposing the safer, more 
effective, more efficient care likely to result 
from HRBs. 

Health plans and hospitals have somewhat 
different reasons for opposing HRBs. Mostly, 
this involves concerns about empowering their 
competitors. Typically, the largest hospitals 
and health plans in each region have more 
complete information than their smaller com- 
petitors, and therefore see HRBs as weaken- 
ing or eliminating that perceived market 
advantage. Sadly, this is true despite the fact 
that HRBs would allow all providers to deliver 
better care. Another issue, particularly for 
hospitals, is that they are concerned about the 
idea of sharing their electronic patient infor- 
mation after spending hundreds of millions 
of dollars (or more) to install and operate
EHR systems. The financial interests of health
plans and hospitals also appear to conflict 
with the deployment of HRBs, since their 
adoption would likely reduce their fee-for- 
service incomes by eliminating unnecessary 
and duplicative care. Again, this opposition is 
behind the scenes to avoid having to openly 
advocate for higher institutional income over 
patient safety and effective, efficient care. 

In a situation such as this, where progress 
that would benefit everyone (i.e., HRBs) is 
opposed by specific groups, one would hope 
that the government, representing the inter- 
ests of all the people, would help ensure that 
the good of everyone prevails. However, in the 
U.S., the ability of the government to impose 
major changes is quite limited (due to the sys- 
tem of “checks and balances”), and is espe- 
cially so when the proposed changes are 
opposed by many key stakeholders. It has 
therefore been relatively easy for major health- 
care stakeholders to delay and/or redirect 
HRB efforts by, for example, advocating for 
other solutions, raising seemingly valid objec- 
tions to reasonable efforts to move forward, 
or creating organizations that appear to be 
working towards a solution while in reality 
being dedicated to maintaining the status quo. 


15.6.2 Factors Accelerating HRB 
Progress 


Perhaps the most important factor accelerat- 
ing HRB progress is the recent and ongoing 
change in the reimbursement system for care. 
The move from “fee for service” to “pay for 
value” is changing the incentives for all health 
care stakeholders. Under “fee for service,” the 
more efficient care enabled by HRBs would 
directly translate to lower reimbursement, a 
clearly undesirable outcome for providers. But 
a “pay for value” system financially rewards 
efficiency, thereby creating a strong incentive 
for the more complete and timely patient 
information that HRBs can make available. 
While the transition from “fee for service” to 
“pay for value” is still in its relatively early 
stages, it is reasonable to expect that ongoing
progress in this direction will be accompanied
by growing support for the HRB architecture. 

In addition, the growing recognition of an 
“individual right” to personal data ownership 
and control is also an enabling force for HRB 
adoption. The adoption of the EU General
Data Protection Regulation³⁹ as well as the widespread
backlash against use of individual data
for profit without permission of the data
subjects⁴⁰,⁴¹,⁴² is creating a higher level of
awareness and support for personal data con- 
trol. This is likely to accelerate the demand for 
organizations such as HRBs that can provide 
individuals with the ability to capture and 
control their sensitive medical record infor- 
mation. 

There are at least four additional factors 
that are currently facilitating and even accel- 
erating the development of health record 
banks (HRBs): 
= Patient records are now largely electronic. 

— This is a necessary prerequisite for any 
HII approach. Thanks largely to the 
subsidies to providers and hospitals to 
acquire EHR systems, most patient 
records are now digital. 

= Effective standards are available. 

— Standards for transmission of EHR 
data have evolved. While not perfect, 
the HL7 CCDA standard is an effective 
methodology for transmitting and 
receiving patient data. Other potentially
even more effective standards, such as
FHIR (Fast Healthcare Interoperability
Resources), are rapidly evolving.
= Smart phones are nearly ubiquitous.
— Smart phones provide a convenient and
readily accessible mechanism for individuals
to access and control their digital
health records.
= New computer security methods can
prevent large-scale breaches.
— Community-based repositories of digital
health records must be able to reliably
prevent large-scale breaches. The
new “personal health grid” architecture
does this in an easy-to-understand way.

39 Retrieval 29 Oct 2018: https://gdpr-info.eu

40 Breland A. (2017) Tech faces public anger over
internet privacy repeal. The Hill (2 Apr 2017).
Retrieval 29 Oct 2018:
https://thehill.com/policy/technology/326816-tech-faces-public-anger-over-internet-privacy-repeal

41 King C. (2018) Tech Industry Pursues a Federal
Privacy Law, on Its Own Terms. New York Times
(26 Aug 2018). Retrieval 29 Oct 2018:
https://www.nytimes.com/2018/08/26/technology/tech-industry-federal-privacy-law.html

42 Wakabayashi D. (2018) California Passes Sweeping
Law to Protect Online Privacy. New York Times
(28 Jun 2018). Retrieval 29 Oct 2018: https://
www.nytimes.com/2018/06/28/technology/


Finally, as mentioned earlier, there are now 
several successful examples of HRB imple- 
mentations outside the U.S. As the benefits of 
these systems are documented and become 
more widely known, this should also increase 
the demand for HRBs throughout the world. 


15.7 Evaluation 


The last element in the strategy for promoting 
a complex and lengthy project such as the HII 
is evaluation to both gauge progress and define 
a complete system. Evaluation measures 
should have several key features. First, they 
should be sufficiently sensitive so that their 
values change at a reasonable rate (a measure 
that only changes value after 5 years will not 
be particularly helpful). Second, the measures 
must be comprehensive enough to reflect 
activities that affect most of the stakeholders 
and activities needing change. This ensures 
that efforts in every area will be reflected in 
improved measures. Third, the measures must 
be meaningful to policymakers. Fourth, peri- 
odic determinations of the current values of 
the measures should be easy so that the mea- 
surement process does not detract from the 
actual work. Finally, the totality of the mea- 
sures must reflect the desired end state so that 
when the goals for all the measures are 
attained, the project is complete. 

A number of different types or dimensions
of measures for HII progress are possible.
Aggregate measures assess HII progress
over the entire nation. Examples include the 
percentage of the population covered by an 
HII and the percentage of healthcare person- 
nel who utilize EHRs. Another type of mea- 
sure is based on the setting of care. Progress 
in implementation of EHR systems in the 
inpatient, outpatient, long-term care, home, 
and community environments could clearly 
be part of an HII measurement program. Yet 
another dimension is healthcare functions 
performed using information systems sup- 
port, including, for example, registration sys- 
tems, decision support, and CPOE. Finally, it 
is also important to assess progress with 
respect to the semantic encoding of EHRs. 
Clearly, there is a progression from the elec- 
tronic exchange of images of documents, 
where the content is only readable by the end 
user viewing the image, to fully standardized 
and encoded EHRs where all the information 
is indexed and accessible in machine-readable 
form. 

Sadly, the evidence is now overwhelming 
that U.S. HIEs in their current form are, with 
rare exceptions, not succeeding. Labkoff and 
Yasnoff described four criteria for the quanti- 
tative evaluation of HII progress in communi- 
ties: (1) completeness of information, (2) 
degree of usage, (3) types of usage, and (4) 
financial sustainability (Labkoff and Yasnoff 
2007). Using these criteria, four of the most 
advanced community HII projects in the U.S. 
achieved scores of 60-78% (on a 0-100 scale), 
indicating substantial additional work was 
required before the HII could be viewed as 
complete. 

The 2010 PCAST report stated, “HIEs 
have drawbacks that make them ill-suited as the 
basis for a national health information architecture”
(PCAST 2010). Among those drawbacks,
PCAST cited administrative burdens
(data sharing agreements to ensure stake- 
holder cooperation), financial sustainability, 
interoperability, and an architecture that cannot
be scaled effectively. A 2011 survey of
HIEs (Adler-Milstein et al. 2011) found only
13 HIEs in the U.S. (covering 3% of hospitals
and 0.9% of physician practices) capable of
meeting Stage 1 Meaningful Use criteria, and 
even those metrics by no means ensure the 
availability of comprehensive electronic 
patient information when and where needed. 
Of those, only 6 were reported to be finan- 
cially viable. More importantly, none of the 
HIEs surveyed had the capabilities of a com- 
prehensive system as specified by an expert 
panel. 

Overall, the current approaches to build- 
ing HII consistently fail to meet one or more 
of the requirements described above: privacy, 
stakeholder cooperation, ensuring fully elec- 
tronic information, financial sustainability, 
and independent governance. While these 
problems are highly interdependent, it is use- 
ful to consider them in the context of the deci- 
sions that communities have made about HII 
architecture, privacy, and business model that, 
while appearing attractive to stakeholders in 
the short term, have so far been largely unsuc- 
cessful. Exploration and large-scale testing of 
alternative approaches that directly address 
the requirements, such as health record bank- 
ing, seem both necessary and increasingly 
urgent. 


15.8 Conclusions 


While progress has been made and efforts are 
continuing, successful development and oper- 
ation of comprehensive HII systems remains 
a largely unsolved problem in the U.S. Happily, 
we are now seeing successful HII implementa- 
tions in other countries that can provide 
important examples of feasible and effective 
systems. The extensive focus on building HII 
systems has greatly improved our understand- 
ing of the requirements, barriers, and chal- 
lenges, as well as potential solutions. Despite 
the daunting obstacles, the benefits of HII are 
sufficiently urgent and compelling to ensure 
major ongoing work in this domain. Through 
these activities, the HII path to comprehen- 
sive electronic patient records when and where 
needed is becoming clearer, and substantial
additional progress is highly likely over the
next few years.

www.itif.org/publications/improving-health-care-why-dose-it-may-be-just-what-doctor-ordered.
This is the first independent report
that endorsed patient-centric architecture 
(HRBs) as an effective approach to HII. It 
describes clearly the problems and challenges 
of HIEs. 

Kendall, D., & Quill, E. (2015, May 28). A life- 
time electronic health record for every 
American. Third Way. Retrieval 29 Oct 2018: 
https://www.thirdway.org/ 
report/a-lifetime-electronic-health-record-for- 
every-american. This paper is a more recent 
endorsement of the HRB concept by an inde- 
pendent think tank. 

Krist, A. H., & Woolf, S. H. (2011) A vision for 
patient-centered health information systems. 
Journal of the American Medical Association, 
305(3):300-301. A vision of how fully func- 
tional patient-centric electronic medical 
record systems could be the basis for an effec- 
tive HII. 

Lapsia, V., Lamb, K., & Yasnoff, W. A. (2012). 
Where should electronic records for patients 
be stored? International Journal of Medical 
Informatics, 81(12), 821-827. This paper eluci- 
dates clearly the advantages of patient-centric 
architecture by comparing it via simulation to 
an institution-centric approach. 

Miller, R. H., & Miller, B. S. (2007). The Santa 
Barbara County Care Data Exchange: What 
happened? Health Affairs, 26(5), w568-w580.
This paper describes the history of one of the 
earliest HIEs, including details about the fac- 
tors leading to its failure. 

Mikk, K. A., Sleeper, H. A., & Topol, E. J. (2017). 
The pathway to patient data ownership and 
better health. JAMA, 318(15), 1433-1434. 
This paper describes how patient data owner- 
ship and records stored in repositories can 
result in an effective HII. 

National Committee on Vital and Health Statistics. 
(2001). Information for health: A strategy for 


building the National Health Information 
Infrastructure. Report and recommendations 
from the National Committee on Vital and 
Health Statistics. Retrieval 31 Oct 2018: https:// 
aspe.hhs.gov/report/information-health-strat- 
egy-building-national-health-information- 
infrastructure. This seminal work was the first 
to call for a national HII, coining the term. It 
comprehensively describes the need for HII, the 
problems it would solve, and the necessity for 
government investment to incentivize its devel- 
opment. 

Yasnoff, W. A. (2016). A secure and efficiently 
searchable health information architecture. 
Journal of Biomedical Informatics, 61, 237- 
246. This paper describes a new and innova- 
tive personal grid architecture for digital 
health records that absolutely prevents large- 
scale data loss, an essential capability to 
ensure user trust in large central repositories 
of health records. 

Yasnoff, W. A., & Shortliffe, E. H. (2014). Lessons 
learned from a health record bank start-up. 
Methods of Information in Medicine, 53, 
66-72. This paper gives a detailed post-mor- 
tem of a health record bank startup and 
includes specific recommendations for future 
success. 

Yasnoff, W. A., Humphreys, B. L., Overhage, 
J. M., Detmer, D. E., Brennan, P. F., Morris, 
R. W., Middleton, B., Bates, D. W., & Fanning,
J. P. (2004). A consensus action agenda for 
achieving the national health information 
infrastructure. Journal of the American 
Medical Informatics Association, 11(4), 332- 
338. This paper describes the results of the 
first national consensus conference on HII 
held in Washington, DC, in 2003. This was the 
meeting that led to the creation of ONC in 
2004. 

Yasnoff, W. A., Sweeney, L., & Shortliffe, E. H. 
(2013). Putting health IT on the path to suc- 
cess. Journal of the American Medical 
Association, 309(10), 989-990. This paper pro- 
vides a concise overview of the problems with 
HIE and the rationale for HRBs. 


Questions for Discussion
1. Make the case for and against investing 
$billions in the HII. How successful 
have the HITECH Meaningful Use 


Health Information Infrastructure 


incentives been in promoting HII 
development? What could have been 
done differently to make them more 
effective? 

2. What organizational options would you 
consider if you were beginning the 
development of HII? What are the pros 
and cons of each? How would you pro- 
ceed with making a decision about 
which one to use? 

3. Estimate the required bandwidth and 
transaction rate for patient-centric (HRB) 
vs. institution-centric HII architecture. 

4. Consider the policy implications of uni- 
versal availability of comprehensive 
electronic patient records. What are the 
risks and how could they be mitigated? 

5. Given the architectural and other advan- 
tages of HRBs, why have most commu- 
nities adopted institution-centric 
architectures up to now? What are some 
steps that might be helpful in encourag- 
ing communities to evaluate alternative 
architectures such as HRBs? 

6. Show specifically the potential locations 
where patient consent functionality 
could be added to the institution-centric 
and patient-centric HII architectures in 
Figs. 15.2 and 15.4 and describe the
granularity of consent that would be 
possible at each proposed location. 
After eliminating any redundant 
functionality, compare and contrast the 
consent implementation issues for the 
two alternative architectures, describing 
the advantages and disadvantages of 
each. Which architecture more 
efficiently addresses the issue of patient 
consent? Why? 
Learning Objectives
After reading this chapter, you should know 
the answers to these questions: 
- What are the primary information
  requirements of health care organizations
  (HCOs)?
- What are the clinical, financial, and
  administrative functions provided
  by health care information systems
  (HCISs), and what are the potential
  benefits of implementing such systems?
- How have changes in health care delivery
  models changed the scope and
  requirements of HCISs over time?
- How do differences among business
  strategies and organizational structures
  influence information systems choices?
- What are the major challenges to implementing
  and managing HCISs?
- How are ongoing health care reforms,
  technological advances, and changing
  social norms likely to affect HCIS
  requirements in the future?


16.1 Overview 

Health care organizations (HCOs) and
integrated delivery networks (IDNs), like
many other business entities, are information-
intensive enterprises. Health care person- 
nel require sufficient data and information 
management tools to make appropriate deci- 
sions. At the same time, they need to care for 
patients and manage and run the enterprise; 
they also need to document and communicate 
plans and activities, and to meet the require- 
ments of numerous regulatory and accredit- 
ing organizations. Clinicians assess patient 
status, plan patient care, administer appropri- 
ate treatments, and educate patients and fami- 
lies regarding clinical management of various 
conditions. They are also concerned about 
evaluating the clinical outcomes, quality, 
and increasingly, the cost of health services 
provided. Administrators determine appro- 
priate staffing levels, manage inventories of 
drugs and supplies, and negotiate payment 
contracts for services. Governing boards make 
decisions about whether to invest in new business
lines, how to partner with other organizations,
and how to eliminate underutilized
services. Collectively, health care professionals 
comprise a heterogeneous group with diverse 
objectives and information requirements, and 
in the end, all are expected to be focused on 
the patients who are, after all, the reason for 
all of this. 

The purpose of health care information 
systems (HCISs) is to support the access, pro- 
cessing, and management of the information 
that health care professionals need to perform 
their jobs effectively and efficiently. HCISs 
facilitate communication, integrate informa- 
tion, and coordinate action among multiple 
health care professionals—and increasingly, 
patients. In addition, HCISs organize and 
store substantial amounts of data and they 
support record-keeping and reporting functions.
Many of the clinical information
functions of an HCIS were detailed in our
discussion of the computer-based patient
record (CPR) in Chap. 14; systems to support
nurses and other care providers are discussed
in Chap. 17. Furthermore, HCISs
are key elements that interface with the
health information infrastructure (HII), as
discussed in Chap. 15. An HCIS also supports
the financial and administrative functions
of a health organization and associated
operating units, including the operations of 
ancillary and other clinical-support depart- 
ments. The evolving complexities of HCOs 
place great demands on an HCIS. Many 
HCOs are broadening their scope of activi- 
ties to cover the care continuum, partially in 
response to Accountable Care Organization 
(ACO), Value-based Purchasing and Bundled 
Payment initiatives from the federal govern- 
ment. HCISs must organize, manage, and 
integrate large amounts of clinical and finan- 
cial data collected by diverse users in a vari- 
ety of organizational settings (from patient 
homes to physicians’ offices to hospitals to 
health care systems) and must provide health 
care workers (and, increasingly, patients) with 
timely access to complete, accurate, and up- 
to-date information presented in a useful format.
The diversity and extent of the modern
IDN are illustrated in Fig. 16.1.


L. H. Vogel and W. C. Reed


[Figure 16.1: diagram of an integrated delivery network. Labels recoverable
from the figure: Academic Institution, Community Hospital, Physician Group
Practice, Reference Lab, Clinics, Patients, Suppliers, Payers, Home Care,
Firewall, Health Care Organization Information Infrastructure]

Fig. 16.1 Major organizational components of an
integrated delivery network (IDN). A typical IDN
might include several components of the same type
(e.g., clinics, community hospitals, physician group
practices). Components within the same geographic
area may have direct data connections, but increasingly
the Internet is the preferred way to connect
organizational components


16.2 Historical Evolution
of the Technology of Health
Care Information Systems
(HCISs)


Technological advances, as well as changes in
the information and organizational requirements
of HCOs, have driven many of the
changes in system architecture, hardware,
software, and functionality of HCISs over
time. The tradeoff between functionality and
ease of integration is an important factor that
influences the choices that vendors have made
in systems design.


16.2.1 Central and Mainframe-Based Systems


The earliest HCISs (typically found in hos- 
pitals) were designed according to the phi- 
losophy that a single comprehensive or 
central computer system could best meet an 
HCO’s information processing requirements. 
Advocates of the centralized approach empha- 
sized the importance of first identifying all the 
hospital’s information needs and then design- 
ing a single, unified framework to meet these 
needs. Patient management and billing func- 
tions were the initial focus of such efforts. One 


Management of Information in Health Care Organizations 


result of this design goal was the development 
of systems in which a single large computer 
performed all information processing and 
managed all the data files using application- 
independent file-management programs—ini- 
tially focusing almost exclusively on financial 
and billing data. Users accessed these systems 
via general-purpose video-display termi- 
nals (VDTs) affectionately known as “green 
screens” because the displayed numbers and 
text were often green on a dark background. 

Central systems integrated and communi- 
cated information well because they provided 
users with a centralized data store and a sin- 
gle, standardized method to access informa- 
tion simply and rapidly. On the other hand, 
the biggest limitation of central systems was 
their inability to accommodate the diverse 
needs of individual departments. There is a 
tradeoff between the uniformity (and rela- 
tive simplicity) of a generalizable system and 
the nonuniformity and greater responsive- 
ness of custom-designed systems that solve 
specific problems for specialized depart- 
ments. Generality—a characteristic that 
enhances communication and data integra- 
tion in a homogeneous environment—can be 
a drawback in an HCO because of the com- 
plexity and heterogeneity of the information- 
management tasks. As a result, central systems
have proved too unwieldy and inflexible to 
support evolving HCO requirements, except 
in smaller facilities. 


16.2.2 Departmental Systems 


By the late 1970s, departmental systems had 
begun to emerge. Advances in technology 
resulted in decreases in the price of hard- 
ware and improvements in software, making 
it feasible for individual departments within 
a hospital to acquire and operate their own 
computers. In a departmental system, one 
or a few computers can be dedicated to pro- 
cessing specific functional tasks within the 
department. Distinct software application 
modules carry out specific tasks, and a com- 
mon framework, which is specified initially, 
defines the interfaces that will allow data to 
be shared among the modules. Radiology 
(Chap. 22) and laboratory systems are
examples of these types of systems. As HCOs 
increasingly selected the best functionality for 
their departmental systems, their IT strategy 
became known as best of breed. 

The departmental approach responded 
to many of the challenges of central systems. 
Although individual departmental systems 
are constrained to function with predefined 
interfaces, they do not have to conform to 
the general standards of an overall system, 
so they can be designed to accommodate the 
special needs of specific areas. For example, 
the processing capabilities and file structures 
suitable for managing the data acquired from 
a patient-monitoring system in the intensive- 
care unit (analog and digital signals acquired 
in real time) differ from the features that are 
appropriate for a system that reports radi- 
ology results (image and text storage and 
processing). Furthermore, modification of 
departmental systems, although laborious 
with any approach, is simpler because of the 
smaller scope of the system. The price for 
this greater flexibility is increased difficulty in 
integrating data and communicating among 
modules of the HCISs. In reality, installing a 
system is never as easy as simply plugging in 
the connections. 

The challenge of sharing data among many 
different information systems that emerged in 
the 1980s and 1990s was daunting. As noted 
earlier, the various components of the HCISs 
were in most cases developed by different 
vendors, using different hardware (e.g., DEC, 
IBM), operating systems (e.g., PICK, Altos, 
DOS, VMS, MUMPS on minicomputers, and 
IBM’s 360 OS on mainframes) and program- 
ming languages (e.g., BASIC, PL/I, COBOL, 
MUMPS, and even Assembler). Sharing 
data between two different systems typically
required a two-way interface—one to send 
data from System A to System B, the other to 
send data or acknowledge receipt from B back 
to A. Adding a third system didn’t require sim- 
ply one additional interface because the new 
system would in many cases have to be inter- 
faced to both of the original systems, resulting 
in the possibility of six interfaces. Introducing 
a fourth system into the HCIS environment 
[Figure 16.2: diagram of point-to-point interfaces among systems, tabulating
the number of systems against the number of interfaces.
Number of Potential Interfaces = n(n - 1)]

Fig. 16.2 The challenge of moving data from one
system to another becomes complicated with the addition
of each new system. Considering that even small
size hospitals may have several hundred applications,
interfacing is a major challenge. While not all systems
need to have two-way interfaces to every other system,
this figure illustrates the challenges that even small
numbers of systems bring


increased the complexity further, since it often 
meant the need for two-way interfaces to each 
of the original three systems, for a total of 
12 (Fig. 16.2). With the number of interfaces
growing quadratically as new systems
were added (represented by the formula I =
n(n - 1), where I represents the number of interfaces
needed and n represents the number of
systems), it was clear that a new solution was 
needed to address the complexity and cost of 
interfacing. 
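
The interface arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the function name is illustrative and not from the text, and it assumes the worst case in which every system needs a two-way connection to every other system.

```python
def point_to_point_interfaces(n: int) -> int:
    """One-way interfaces needed when every system talks directly to every other.

    Each of the n systems needs a sending interface to each of the other
    n - 1 systems, which gives n * (n - 1).
    """
    if n < 0:
        raise ValueError("number of systems must be non-negative")
    return n * (n - 1)


# Reproduce the counts given in the text (2, 6, 12) and extend them.
for n in (2, 3, 4, 10, 100):
    print(f"{n:>3} systems -> {point_to_point_interfaces(n):>5} interfaces")
```

With four systems the function returns 12, matching the total in the text; with 100 systems it returns 9,900, which illustrates why direct interfacing became unmanageable.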

In response, an industry niche was born 
which focused on creating a hardware and 
software application combination designed 
to manage the interfacing challenges among 
disparate systems in the HCIS environment. 
Instead of each system having to interface to 
every other system independently, an interface 
engine served as the central connecting point 
for all interfaces (Fig. 16.3). Each system
had only to connect to the interface engine; 
the engine would then manage the sending of 
data to and from any other system that needed 
it. The interface engine concept, which origi- 
nated in health care, has given rise to a whole 


[Figure 16.3: diagram of systems connected through a central Interface Engine
(IE), tabulating the number of systems against the number of interfaces.
Number of Potential Interfaces = n × 2]

Fig. 16.3 The introduction of the Interface Engine
(IE) made system interfaces much more manageable,
particularly so with the implementation of HL7 data
messaging standards. With an IE, each additional system
only added two additional interfaces to the mix, one
to send data and one to acknowledge receipt of the data

series of strategies for managing multiple sys- 
tems. Many of the vendors who got their start 
in health care interfacing subsequently found 
new markets in financial services as well as 
other industries. 
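
The hub arrangement described above can be sketched in Python. Everything here is hypothetical and for illustration only: the `InterfaceEngine` class, its method names, and the system names are assumptions, not a description of any commercial product. The sketch shows each system registering once with the engine, the engine routing a message and returning an acknowledgement, and the resulting count of two interfaces per system.

```python
from typing import Callable, Dict, List

Handler = Callable[[str, str], None]  # (source system, message) -> None


class InterfaceEngine:
    """Hypothetical hub: each system registers once instead of once per peer."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, system: str, handler: Handler) -> None:
        self._handlers[system] = handler

    def send(self, source: str, destination: str, message: str) -> str:
        """Deliver a message and return an acknowledgement string to the sender."""
        if destination not in self._handlers:
            return f"NACK: unknown destination {destination}"
        self._handlers[destination](source, message)
        return f"ACK: {destination} received message from {source}"

    def interface_count(self) -> int:
        # Two interfaces per connected system: one to send, one to acknowledge.
        return 2 * len(self._handlers)


lab_inbox: List[str] = []
engine = InterfaceEngine()
engine.register("registration", lambda src, msg: None)
engine.register("laboratory", lambda src, msg: lab_inbox.append(f"{src}: {msg}"))
engine.register("pharmacy", lambda src, msg: None)
engine.register("radiology", lambda src, msg: None)

ack = engine.send("registration", "laboratory", "new patient admitted")
print(ack)
print(engine.interface_count(), "interfaces via the engine, versus", 4 * 3, "point to point")
```

For four systems the engine needs 8 interfaces rather than 12, and the gap widens quickly as systems are added, since each new system adds only two interfaces.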

Continued advancements in departmen- 
tal systems were made possible as comput- 
ing power increased and corresponding costs 
decreased through the advent of microcom- 
puters. Even smaller ancillary departments 
such as Respiratory Therapy, which previ- 
ously could not justify a major computer 
acquisition, could now purchase dedicated file 
servers and workstations and participate in 
the HCIS environment. Health care providers 
in nursing units or at the bedside, physicians 
in their offices or homes, and managers in the 
administrative offices could eventually access 
and analyze data locally using what were ini- 
tially termed microcomputers (later known as 
desktop personal computers or PCs). 


16.2.3 Integration Challenges 


In the early 1980s, researchers at the 
University of California, San Francisco 
(UCSF) Hospital successfully implemented 
one of the first Local Area Networks (LANs) 
to support communication among several
of the hospital’s standalone systems. Using 
technology developed at the Johns Hopkins 
University, they connected minicomputers 
that supported patient registration, medical 
records, radiology, the clinical laboratory, 
and the outpatient pharmacy. Interestingly, 
each of the five computer systems was differ- 
ent from the others: the computers were made 
by different manufacturers and ran different 
operating systems but were able to commu- 
nicate with each other through standardized 
communications protocols. 

By the late 1980s, HCISs based on evolv- 
ing network-communications standards were 
being developed and implemented in HCOs. 
As distributed computer systems, connected 
through electronic networks, these HCISs 
consisted of a federation of independent 
systems that had been tailored for specific 
application areas. The computers operated 
autonomously and shared data (and some- 
times programs and other resources, such as 
printers) by exchanging information over a 
local area network (LAN) using standard 
protocols such as TCP/IP and Health Level 
7 (HL7) for communication, in many cases
using the interface engine strategy discussed
earlier in Sect. 16.2.2. One advantage
of LAN-connected distributed systems
was that individual departments could have 
greater flexibility in choosing hardware and 
software that optimally suited their specific 
needs. On the downside, the distribution 
of information processing capabilities and 
responsibility for data among diverse systems 
made the tasks of data integration, communi- 
cation, and security more difficult—a fact that 
continues to the present day. Development of 
industry-wide standard network and interface 
protocols such as TCP/IP and HL7 has eased 
the technical problems of electronic commu- 
nication considerably. 

Still, there are problems to overcome in 
managing and controlling access to a patient 
database that is fragmented over multiple 
computers, each with its own file structure and 
method of file management. Furthermore, 
when no global architecture or vocabulary 
standards are imposed on the HCISs, indi- 
vidual departments and entities may encode 
data values in ways that are incompatible with
the definitions chosen by other areas of the 
organization. The promise of sharing among 
independent departments, entities, and even 
independent institutions has increased the 
importance of defining clinical data standards 
(see Chap. 7). As noted earlier, some HCOs
pursued a best of breed strategy in which they 
chose the best system, regardless of vendor 
and technology, then worked to integrate that 
system into their overall HCIS environment. 
Some HCOs modified this strategy by choos- 
ing suites of related applications, e.g., select- 
ing all ancillary systems from a single vendor 
(also known as best of cluster), thereby reduc- 
ing the overall number of vendors they work 
with and, in theory, reducing the costs and dif- 
ficulty of integration. 

Commercial software vendors have sup- 
ported this strategy by broadening their offer- 
ings of application suites and managing the 
integration at the suite level rather than at the 
level of individual applications. For example, 
Epic, one of the major enterprise-wide HCIS
vendors, started in the late 1970s with a focus
on physician billing systems, and over the subsequent
40 years evolved into a fully integrated
product suite encompassing ambulatory, 
inpatient, ancillary and billing functionality. 
Cerner, another major enterprise-wide HCIS 
vendor, started with a suite of products for 
ancillary systems such as clinical laboratories, 
pharmacy and radiology, and over a similar 
time period evolved into a full-service product 
suite including inpatient, outpatient and bill- 
ing functionality. 

With the increasing availability of single-vendor
systems, the need to develop and
implement separate applications for
specific functions has diminished significantly.
PC-based universal workstations are
now the norm and some HCOs and IDNs 
support thousands of PCs in enterprise-wide 
networked environments. The requirement for 
direct access to independent ancillary systems 
has been largely eliminated not only by enter- 
prise data networks, but by vendors offering 
integrated product suites which include sup- 
port for general clinical as well as ancillary 
service functionality. Where specialized sys- 
tems remain, interfaces join such systems to 
a core clinical system or to a centralized clinical
data repository. These connections make it possible
to access patient databases (by clinicians), 
human resources documents (by administra- 
tors and employees), financial information 
(by administrators) and basic information 
about facilities, departments, and staff (by the 
public) through a single enterprise-wide data 
network (see Fig. 16.1).

Historically, as more information systems 
were added to the HCIS environment, the 
challenge of moving data from one system to 
another became overwhelming. In response, 
beginning in the 1980s, two unique develop- 
ments occurred: (1) the interface engine; and 
(2) Health Level Seven (HL7), a standard for 
the structure of the data messages that were 
being sent from one information system to 
another (see Chap. 7).

The creation of HL7 was yet another 
response to the challenge of moving data 
among disparate health care systems. HL7 is 
a health care-based initiative focused on the 
development of data messaging standards for 
the sharing of data among the many individ- 
ual systems that comprise an HCIS. The basic 
idea was to use messaging standards so that 
data could be sent back and forth using stan- 
dard formats within the HCIS environment. 
Most of the departmental systems that were 
introduced at this time were the products of 
companies focused on specific niche markets, 
including laboratories, pharmacies and radi- 
ology departments. Consequently, there was 
strong support for both the interface engine 
and the HL7 efforts as mechanisms to per- 
mit smaller vendors to compete successfully 
in the marketplace. In recent years, many of 
these pioneering vendors have been purchased 
and their products included as components of 
larger single vendor product families. 
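
To make the idea of a standard message format concrete, the following minimal sketch splits a message written in the style of HL7 version 2, where segments are separated by carriage returns, fields by vertical bars, and components by carets. The sample message and all identifiers in it are invented for illustration, and the parser is deliberately simplified: it ignores escape sequences, repeated fields, and the delimiter definitions carried in the MSH segment, all of which a production system would need to handle.

```python
from typing import Dict, List

# Hypothetical lab-result message in HL7 v2 style; all identifiers are invented.
SAMPLE = "\r".join([
    "MSH|^~\\&|LAB|HOSP|EHR|HOSP|202001011230||ORU^R01|MSG00001|P|2.3",
    "PID|1||123456^^^HOSP||DOE^JANE",
    "OBX|1|NM|GLU^Glucose||95|mg/dL",
])


def parse_segments(message: str) -> Dict[str, List[List[str]]]:
    """Split a message into segments (carriage return) and fields (|)."""
    segments: Dict[str, List[List[str]]] = {}
    for line in message.split("\r"):
        if not line:
            continue
        fields = line.split("|")
        # Group segments by their three-letter name (MSH, PID, OBX, ...).
        segments.setdefault(fields[0], []).append(fields)
    return segments


parsed = parse_segments(SAMPLE)
pid = parsed["PID"][0]
obx = parsed["OBX"][0]

family, given = pid[5].split("^")[:2]   # patient name components
test_name = obx[3].split("^")[1]        # text part of the observation identifier
print(f"{given} {family}: {test_name} = {obx[5]} {obx[6]}")
```

Because sender and receiver agree on where each item sits in the message, the receiving system can extract the patient name and result without any knowledge of how the sending system stores them internally.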

In hospitals, clinical and administrative 
personnel have traditionally had distinct areas 
of responsibility and performed many of their 
functions separately. Thus, it is not surpris- 
ing that administrative and clinical data have 
often been managed separately—administra- 
tive data in business offices and clinical data in 
medical-records departments. When comput- 
ers were first introduced, the hospital’s infor- 
mation processing was often performed on 


separate computers with separate databases, 
thus minimizing conflicts about priorities in 
services and investment. In addition, infor- 
mation systems to support hospital functions 
and ambulatory care historically have, due to 
organizational boundaries, developed inde- 
pendently. Many hospitals, for example, have 
rich databases for inpatient data but maintain 
less information for outpatients. As fee-for- 
service reimbursement models continue to be 
challenged for their focus on activity-driven 
care, alternatives such as ACOs, bundled pay- 
ments for services, and pay for performance 
proposals will stimulate efforts toward greater 
data integration. 

By the late 1980s, clinical information 
system (CIS) components of HISs offered 
clinically oriented capabilities, such as order 
writing and results communications. During 
the same period, ambulatory medical record 
systems (AMRSs) and practice management 
systems (PMSs) were being developed to sup- 
port large outpatient clinics and physician 
offices, respectively. These systems performed 
functions analogous to those of hospital systems
but were generally less complex, reflecting
the lower complexity of patient care delivered
in outpatient settings. However, there has his- 
torically been little or no systems integration 
between hospital and ambulatory settings. 

The historical lack of integration of data 
from diverse sources creates a host of prob- 
lems. If clinical and administrative data are 
stored on separate systems, then data needed 
by both must either be entered separately
into each system, be copied from one system
to another, or be transferred from both
sources to yet another location to be analyzed.
In addition to the expense of redundant data 
entry and data maintenance incurred by these 
approaches (see also the related discussion 
for the health information infrastructure in 
> Chap. 15), the consistency of information 
tends to be poor because data may be updated 
in one place and not in the other, or informa- 
tion may be copied incorrectly from one place 
to another. In the extreme example, the same 
data may be represented differently in differ- 
ent settings. As we noted earlier within the 
hospital setting, many of these issues have 
been addressed through the development of 
automated interfaces to transfer demographic
data, orders, results, and charges between 
clinical systems and billing systems. Even with 
an interface engine managing data among dis- 
parate systems, however, an organization still 
must solve the thorny issues of synchroniza- 
tion of data and comparability of similar data 
types. 
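
The two problems named above, synchronization and comparability, can be illustrated with a small sketch. The record layouts, field names, and code mappings below are assumptions made for the example, not drawn from any particular system. The sketch first maps each field to a common representation (comparability) and only then reports values that still disagree (synchronization).

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]

# Hypothetical extracts from two systems holding the same patients.
clinical: Dict[str, Record] = {
    "1001": {"last_name": "Doe", "sex": "F", "birth_date": "1970-03-15"},
    "1002": {"last_name": "Roe", "sex": "M", "birth_date": "1985-11-02"},
}
billing: Dict[str, Record] = {
    "1001": {"last_name": "DOE", "sex": "female", "birth_date": "1970-03-15"},
    "1002": {"last_name": "Roe", "sex": "male", "birth_date": "1985-02-11"},
}

SEX_CODES = {"f": "F", "female": "F", "m": "M", "male": "M"}


def normalize(record: Record) -> Record:
    """Map each field to a common representation before comparing."""
    return {
        "last_name": record["last_name"].strip().lower(),
        "sex": SEX_CODES.get(record["sex"].strip().lower(), "?"),
        "birth_date": record["birth_date"].strip(),
    }


def find_discrepancies(
    a: Dict[str, Record], b: Dict[str, Record]
) -> List[Tuple[str, str, str, str]]:
    """Return (patient_id, field, value_in_a, value_in_b) for real mismatches."""
    issues = []
    for patient_id in sorted(a.keys() & b.keys()):
        left, right = normalize(a[patient_id]), normalize(b[patient_id])
        for field in left:
            if left[field] != right[field]:
                issues.append((patient_id, field, left[field], right[field]))
    return issues


for issue in find_discrepancies(clinical, billing):
    print(issue)
```

Differences in case and in the coding of sex disappear after normalization, so only the conflicting birth date for patient 1002 is reported. Deciding which system holds the correct value remains an organizational question that code alone cannot settle.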

With the development of IDNs and other 
complex HCOs, the sharing of data elements 
among operating units becomes both more 
critical and more problematic. Data integra- 
tion issues are further compounded in IDNs 
by the acquisition of previously independent 
organizations that have clinical and admin- 
istrative information systems incompatible 
with those of the rest of the IDN. It is still 
not unusual to encounter minimal automated 
information exchange among organizations 
even within an IDN. Patients register and 
reregister at the physician’s office, diagnos- 
tic imaging center, ambulatory surgery facil- 
ity, and acute-care hospital—and sometimes 
face multiple registrations even within a single 
facility. Each facility may continue to keep its 
own clinical records, and shadow files may 
be established at multiple locations with cop- 
ies of critical information such as operative 
reports and hospital discharge summaries. 
Inconsistencies in these multiple electronic 
and manual databases can result in inappro- 
priate patient management and inappropriate 
resource allocation. For example, medica- 
tions that are first given to a patient while 
she is a hospital inpatient may inadvertently 
be discontinued when she is transported to a 
rehabilitation hospital or nursing home. Also, 
information about a patient’s known allergies 
and medication history may be unavailable to 
physicians treating an unconscious patient in 
an emergency department. 

The objectives of coordinated, high- 
quality, and cost-effective health care cannot 
be completely satisfied if an organization’s 
multiple computer systems operate in isola- 
tion. Unfortunately, free-standing systems 
within HCOs are still common, although 
HCOs and IDNs are increasingly investing in 
the implementation of new, more consistent
systems across all their facilities or in integrating
existing systems to allow data sharing.
The capital investment required to pursue a 
strategy of system-wide data integration can 
be significant, and with ongoing challenges to 
reimbursement rates for both hospitals and 
physicians, the funding to pursue this strat- 
egy is often limited either due to competing 
investment requirements (e.g., acquiring or 
maintaining buildings and equipment) or the 
continued downward trend in reimbursement 
for services. 


16.2.4 Evolution to Enterprise- 
Wide Health Care System 
Information Systems 


If an HCO or IDN is to manage patient care 
effectively, project a focused market identity, 
and control its operating costs, it must perform 
in a unified and consistent manner. For these 
reasons, information technologies to support 
data and process integration are recognized 
as critical to an IDN’s or HCO’s operations. 
From an organizational perspective, informa- 
tion should be available when and where it is 
needed; users must have an integrated view, 
regardless of system or geographic boundar- 
ies; data must have a consistent interpreta- 
tion; and adequate security must be in place 
to ensure access only by authorized personnel 
and only for appropriate uses. Unfortunately, 
these criteria are much easier to describe than 
to meet. 

Over time, changes in the health care 
economic and regulatory environments have 
radically transformed the structure, strategic 
goals, and operational processes of health 
care organizations through a gradual shift- 
ing of financial risk from third party payers 
(e.g., traditional insurance companies such 
as Blue Cross and Blue Shield, Medicare and 
Medicaid programs that emerged in the 1960s 
and 1970s) to the providers themselves and 
in many cases directly to the patient through 
higher deductibles, co-pays, benefit caps, etc. 
This shifting of risk initially brought about 
a consolidation of health care providers into 
integrated delivery networks (IDNs) in the 
1990s. 

Mergers and acquisitions (M&A) have
been an important part of corporate life for 
close to 200 years; in health care, the M&A 
experience has been largely felt over the past 
20 years. One major difference is that HCOs 
often form “affiliations” rather than par- 
ticipate in outright merger and acquisition 
activity—except in situations in which HCOs 
actually acquire physician practices. So while 
“M&A” is an appropriate reference for corpo- 
rate activity outside of health care, “MA&A” 
(Mergers, Affiliations and Acquisitions) may 
be a more appropriate term for the health care 
industry. Cost savings are often touted as the
rationale for M&A activity, but that goal can
seldom be documented; more typically, the
result is to drive excess capacity from
the system (e.g., an oversupply of hospital
beds) and to secure market share.

IDNs are still a prominent feature in many 
health care markets, often driven by new regula- 
tory requirements aimed at improved efficiency 
while emphasizing greater patient privacy and 
safety. While the most successful of IDNs have 
achieved a measure of structural and opera- 
tional integration, gains from the integration 
of clinical activities and from the consolida- 
tion of HCISs have been much more difficult. 
Many IDNs scaled back their original goals 
for integrating clinical activities and began to 
shed home care services, health plans and man- 
aged care entities. Most recently, the pendulum 
has swung back as IDNs acquire both physi- 
cian practices and hospitals while shifting their 
focus to becoming identified as an ACO, as 
reimbursement constraints and federal ACO 
initiatives strive to improve both the efficiency 
and effectiveness of HCOs. All these changes 
have tremendous implications for HCISs. 

The expertise gained from managing an 
inpatient-driven organization producing a 
relatively large amount of revenue from a 
relatively small set of events (e.g., a hospital) 
does not readily translate to the successful 
management of other organizational activities
that in many cases require many more
events to produce a similar level of revenue
(e.g., from outpatient clinics). In some cases,
it was even a challenge to translate manage- 
ment processes from inpatient operations to 
outpatient clinics, or one hospital to another. 


Attempts to apply hospital management 
principles to ambulatory clinics have been 
challenged because inpatient-based hospitals 
generate a relatively small number of patient 
bills with high dollar amounts whereas ambu- 
latory clinics do just the opposite—generate a 
relatively large number of patient bills, each 
with a relatively small dollar amount. To date, 
it is fair to say that few IDNs have gained the 
degree of cost savings and efficiencies they 
had originally projected. The immense up- 
front costs of implementing (or integrating) 
the required HCISs have contributed to this 
limited success. Regardless of organizational 
structure, all health care organizations are 
striving toward greater information access 
and integration, including improved informa- 
tion linkages with physicians and patients. The 
“typical” IDN is a melding of diverse organi- 
zations, and the associated information sys- 
tems infrastructure is still far from integrated; 
rather, it remains in many cases an amalgam 
of heterogeneous systems, processes, and data 
stores (Blackstone & Fuhr Jr. 2003; Kastor 
2001; Shortell et al. 2000). 


16.2.5 Information Requirements 


The most important function of any HCIS is 
to present data to decision makers so that they 
can improve the quality and timeliness of the 
decisions they need to make. From a clinical 
perspective, the most important function of 
an HCIS is to present patient-specific data to 
care givers so that they can easily interpret the 
data for diagnostic and treatment planning 
purposes and support the necessary commu- 
nication among the many health care work- 
ers who cooperate in providing health services 
to patients. From an administrative perspec- 
tive, the most pressing information needs are 
those related to the daily operation and man- 
agement of the organization—bills must be 
generated accurately and rapidly, employees 
and vendors must be paid, supplies must be 
ordered, and so on. In addition, administra- 
tors need information to make short-term and 
long-term planning decisions. 

Since clinical system information requirements
are discussed in Chaps. 14, 17, 21,
and 24, we focus here on operational information
requirements, and specifically on four
broad categories: daily operations, planning, 
communication, and documentation and 
reporting. 
• Operational requirements. Health care
workers, both care givers and administrators,
require detailed and up-to-date
information to perform the daily tasks 
that keep a hospital, clinic, or physician 
practice running—the bread-and-butter 
tasks of the institution. These include 
not only clinical activities, but financial 
management, acquisition and manage- 
ment of supplies, posting charges, sending 
bills and receiving payments. Queries for 
operational purposes can include: In what 
room is patient John Smith? What drugs 
is he receiving? What tests are scheduled 
for Mr. Smith during his stay and after his 
discharge? What insurance coverage does 
he have? Is the staffing skill mix sufficient 
to handle the current volume and special 
needs of patients in Care Center 3 West? 
What are the names and telephone num- 
bers of patients who have appointments 
for tomorrow and need to be called for a 
reminder? What authorization is needed to 
perform an ultrasound procedure on Jane 
Blue under the terms of her health insur- 
ance coverage? Are the daily, monthly and 
annual financial reports accurate? Are the 
charges for services and supplies accurately 
collected and transferred to the billing sys- 
tem? Are bills sent out in a timely manner 
and remittances received and posted to the 
correct accounts? HCISs can support these 
operational requirements for information 
by organizing data for prompt and easy 
access. Because the HCO may have devel- 
oped product-line specialization within a 
particular facility (e.g., a diagnostic imag- 
ing center or women’s health center), how- 
ever, answering even a simple request may 
require accessing information stored in dif- 
ferent systems at several different facilities. 
• Planning requirements. Health professionals
also require information to make short-
term and long-term decisions about patient 
care and organizational management. 
The importance of appropriate clinical 
decision-making is obvious: we devote
all of Chaps. 3 and 24 to explaining
methods to help clinicians select diagnos- 
tic tests, interpret test results, and choose 
treatments for their patients. The decisions
made by administrators and managers
concerning the acquisition and use of health care
resources are no less important. In fact,
clinicians and administrators
alike must choose wisely in their
use of resources to provide high-quality 
care and excellent service at a competitive 
price. In addition, HCOs live in a highly 
regulated world and must report to local, 
state, federal and private entities on how 
they manage the care they provide and 
on the safety of that care. HCISs should 
help health care personnel (including 
auditors) to answer queries such as these: 
What are the organization’s clinical guide- 
lines for managing the care of patients 
with this condition? Have similar patients 
experienced better clinical outcomes with 
medical treatment or with surgical inter- 
vention? What are the financial and medi- 
cal implications of closing the maternity 
service? If we added six care managers to
the outpatient-clinic staff, could we improve
patient outcomes and reduce emergency 
admissions? Will the proposed contract 
to provide health services to Medicaid 
patients be profitable given the current 
cost structure and current utilization patterns?
How many incidents occurred during
the last month, such as patient falls or
medications given to the wrong
patient or administered at the wrong
dose? How often were supplies such as fire
extinguishers or oxygen sources inspected?
Often, the data necessary for planning
and meeting regulatory requirements are
generated from many sources. HCISs can 
assist by aggregating, analyzing, and sum- 
marizing the information relevant to deci- 
sion-making and compliance. 

• Communication requirements. Communication
and coordination of patient care
and operations across multiple personnel, 
multiple business units, and far-flung geog- 
raphy are not possible without investment 
in an underlying technology infrastructure.
For example, the routing of paper
medical records, a cumbersome process 
even within a single hospital, is an impos- 
sibility for a regional network of providers 
trying to act in coordination. Similarly, it 
is neither timely nor cost effective to copy 
and distribute hard copy documents to 
all participants in a regionally distributed 
organization. An HCO’s technology infra- 
structure can enable information exchange 
via web-based access to shared databases 
and documents, collaborative platforms, 
electronic mail, document-management 
systems, and on-line calendaring systems, 
as well as providing and controlling access 
for authorized users at the place and time 
that information is required. 

• Documentation and reporting requirements.
The need to maintain records for future 
reference or analysis and reporting makes 
up the fourth category of informational 
requirements. Some requirements are 
internally imposed. For example, a com- 
plete record of each patient’s health sta- 
tus and treatment history is necessary to 
ensure continuity of care across multiple 
providers and over time. External require- 
ments create a large demand for data col- 
lection and record keeping in HCOs (as 
with mandated reporting of vaccination 
records to public health agencies). As discussed
in Chap. 14, the medical record
is a legal document. If necessary, the 
courts can refer to the record to determine 
whether a patient received proper care. 
Insurance companies require itemized 
billing statements, and medical records 
substantiate the clinical justification of 
services provided and the charges submit- 
ted to them. The Joint Commission (JC), 
which certifies the qualifications and per- 
formance of many health care organiza- 
tions, has specific requirements concerning 
the content and quality of medical records, 
as well as requirements for organization- 
wide information-management processes. 
Furthermore, for hospitals to qualify for
participation in the Medicare and Medicaid
programs, the JC requires that they follow
standardized procedures for auditing the
medical staff and monitoring the quality
of patient care, and that they be able to
show that they meet the safety requirements
for infectious disease management,
buildings, and equipment. Employer and
consumer groups are also joining the list 
of external monitors. 
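
The operational example above, in which answering a simple question about one patient may require data held in different systems at several facilities, can be sketched in a few lines of Python. This is only a minimal illustration: the system names, record fields, and the `find_patient` helper are hypothetical, and in-memory dictionaries stand in for real departmental systems.

```python
from dataclasses import dataclass, field


@dataclass
class PatientView:
    """Integrated view of one patient assembled from several source systems."""
    patient_id: str
    facts: dict = field(default_factory=dict)    # attribute -> merged value
    sources: dict = field(default_factory=dict)  # attribute -> supplying system


# Hypothetical stand-ins for free-standing systems at different facilities.
SYSTEMS = {
    "admissions": {"P001": {"name": "John Smith", "room": "3 West 12"}},
    "pharmacy": {"P001": {"active_drugs": ["drug A", "drug B"]}},
    "billing": {"P001": {"insurer": "Plan X"}},
}


def find_patient(patient_id: str, systems: dict = SYSTEMS) -> PatientView:
    """Query every system for the patient and merge whatever each one holds."""
    view = PatientView(patient_id)
    for system_name, records in systems.items():
        record = records.get(patient_id)
        if record is None:
            continue  # this system holds no data for the patient
        for attribute, value in record.items():
            # Keep the first value seen and remember which system supplied it.
            if attribute not in view.facts:
                view.facts[attribute] = value
                view.sources[attribute] = system_name
    return view
```

Calling `find_patient("P001")` returns one view whose `facts` combine the room, active drugs, and insurer, while `sources` records where each item came from. Keeping the provenance is a deliberate choice in this sketch, since the same attribute may be stored, and interpreted, differently in each system.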


16.2.6 Process Integration 


To be truly effective, information systems 
must mesh smoothly with the people who 
use them and with the specific operational 
workflows of the organization. But process 
integration poses a significant challenge for 
HCOs and for the HCISs as well. Today’s
health care-delivery models represent a radi- 
cal departure from historical models of care 
delivery. Changes in reimbursement and 
documentation requirements often lead, for 
example, to changes in the responsibilities and 
work patterns of physicians, nurses, and other 
care providers; the development of entirely 
new job categories (such as care managers 
who coordinate a patient’s care across facili- 
ties or between encounters); and the more 
active participation of patients in their own 
personal health management (Table 16.1).
Process integration is further complicated in 
that component entities typically have evolved 
different operational policies and procedures, 
which can reflect different historical and lead- 
ership experiences from one office to another, 
or in the extreme example, from one floor to 
another within a single hospital. 

The most progressive HCOs are develop- 
ing new enterprise-wide processes for provid- 
ing easy and uniform access to health services, 
for deploying consistent clinical guidelines, 
and for coordinating and managing patient 
care across multiple care settings through- 
out the organization. Integrated information 
technologies are essential to supporting such 
enterprise-wide processes. Mechanisms for 
information management aimed at integrat- 
ing operations across entities must address 
not only the migration from legacy systems 
but also the migration from legacy work pro- 
cesses to new, more consistent and more stan- 
dardized policies and processes within and 
across entities. 

Table 16.1 The changing health care environment and its implications for an IDN’s core competencies.
Columns 1 and 2 are used with permission with CSC. Column 3 is the authors’ addition

Characteristics | Old care model | Twentieth Century care model | Twenty-first Century care model
Goal of care | Manage sickness | Manage wellness | Prevent illness
Center of delivery system | Hospital | Primary-care providers/ambulatory settings | Physician offices, retail clinics
Focus of care | Episodic acute and chronic care | Population health, primary care | Preventive care
Driver of care decisions | Specialists | Primary-care providers/patients | Patients with support of physicians, physician assistants, nurse practitioners
Metric of system success | Number of admissions | Number of enrollees | Patient outcomes
Performance optimization | Optimize individual provider performance | Optimize system-wide performance | Optimize patient outcomes
Utilization controls | Externally controlled | Internally controlled | Value based care
Quality measures | Defined as inputs to system | Defined as patient satisfaction | Value of care provided through measurement of patient outcomes
Physician role | Autonomous and independent | Member of care team; user of system-wide guidelines of care | Guiding physician assistants, providing oversight to nurse practitioners
Patient role | Passive receiver of care | Active partner in care | Primary driver of care


The introduction of new HCISs changes 
the workplace. Research has shown that in 
most cases the real value from an investment in 
information systems comes only when underly- 
ing work processes are changed to take advan- 
tage of the new information technology (see 
Figs. 16.4 and 16.5). At times, these changes
can be substantial. The implementation of a new 
system offers an opportunity to rethink and rede- 
fine existing work processes to take advantage of 
the new information-management capabilities, 
thereby reducing costs, increasing productivity, 
or improving service levels. For example, pro- 
viding electronic access to information that was 
previously accessible only on paper can shorten 
the overall time required to complete a multi- 
step activity by enabling conversion of serial 
processes (completed by multiple workers using 
the same record sequentially) to concurrent processes
(completed by the workers accessing an
electronic record simultaneously). More fundamental
business transformation is also possible
with new technologies; for example, direct entry 
of medication orders by physicians, linked with 
a decision-support system, allows immediate 
checking for proper dosing and potential drug 
interactions, and the ability to recommend less 
expensive drug substitutes (Vogel 2003). 
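
The order-entry example above (immediate checking of dosing and interactions, plus a suggestion of a less expensive substitute) can be sketched as follows. This is a minimal illustration under stated assumptions, not a clinical rule set: the drug names, dose limits, interaction pairs, and the `check_order` function are all hypothetical placeholders, whereas a production system would draw on a curated drug knowledge base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MedicationOrder:
    drug: str
    dose_mg: float


# Hypothetical reference data used only for illustration.
MAX_DOSE_MG = {"drug_a": 500.0, "drug_b": 40.0}
INTERACTIONS = {frozenset({"drug_a", "drug_b"})}
CHEAPER_SUBSTITUTE = {"drug_b": "drug_c"}


def check_order(order: MedicationOrder, active_drugs: list[str]) -> list[str]:
    """Return alert messages for a new order; an empty list means no alerts."""
    alerts = []

    # Dose check: compare the ordered dose with the configured upper limit.
    limit = MAX_DOSE_MG.get(order.drug)
    if limit is not None and order.dose_mg > limit:
        alerts.append(f"dose {order.dose_mg} mg exceeds limit {limit} mg")

    # Interaction check: test the new drug against each drug already active.
    for other in active_drugs:
        if frozenset({order.drug, other}) in INTERACTIONS:
            alerts.append(f"possible interaction with {other}")

    # Cost check: suggest a less expensive substitute when one is configured.
    substitute = CHEAPER_SUBSTITUTE.get(order.drug)
    if substitute is not None:
        alerts.append(f"consider less expensive substitute {substitute}")

    return alerts
```

Because the checks run at the moment the physician enters the order, the alerts reach the person who can act on them, rather than surfacing later during transcription or dispensing.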

Few health care organizations today have
the time or resources to develop entirely new
information systems or redesign processes on
their own; therefore, most opt to purchase
commercial software products and to use
consultants to assist them in the implementation
of industry “best practices”. Although these
commercial systems allow some degree of
custom tailoring, they also reflect an underlying
model of work processes that may have
evolved through development in other health
care organizations with different underlying
operational policies and procedures. To be
successful, HCOs typically must adapt their
own work processes to those embodied in the
systems they are installing (for example, some
commercial systems require care providers to
discontinue and then reenter all orders when
a patient is admitted to the hospital after
being monitored in the emergency department).
Furthermore, once the systems are
installed and workflows have been adapted to
them, they become part of the organization’s
culture, and any subsequent change to the
new system may be arduous because of these
workflow considerations. Decision-makers
should take great care when selecting and
configuring a new system to support and enhance
desired work processes. Such organizational
workflow adaptation represents a significant
challenge to the HCO and its systems planners.
Too often, organizations are unable to
realize the full potential return on their HCIS
investments when they attempt to change
the system to accommodate historical
workflows, even before the new system is installed.
Such management practices can significantly
reduce the potential gains from the
HCO’s IT investment.

Order Entry Management Tasks Before Automation

Physician Tasks (13 steps)
1. Locate patient chart
2. Review clinical results (lab, radiology, etc.)
3. Jot notes from clinical results
4. Examine patient(s)
5. Locate patient chart
6. Review record again
7. Open “orders” tab
8. Write orders
9. If writing discharge order, write discharge prescriptions
10. Sign orders
11. “Flag” chart that new orders are present
12. Replace chart in rack
13. If wrote STAT orders, notify clerk and nurse

Nursing Tasks (9 steps)
1. Locate patient chart
2. Verify orders transcribed correctly
3. “Note” new med orders correct on medication records
4. Sign off on each set of orders
5. Close chart
6. “Unflag” chart indicating orders complete
7. Put chart back in chart rack
8. Carry out orders or assign to staff to complete
9. Educate patient on new orders as needed

Clerk Tasks (12 steps)
1. Locate patient chart
2. Transcribe orders to clerk kardex, to nurse kardex
3. If clarification is needed, contact nurse
4. Complete requisitions (lab, radiology, etc.)
5. Send requisitions to depts or put in a pick up area
6. Send via fax or call depts for new orders (diet, respiratory, etc.)
7. Locate medication records
8. Enter new medication orders
9. Note status/completion of each item on order
10. Close chart
11. “Flag” chart that orders have been transcribed
12. Put chart back in rack

Fig. 16.4 The process of managing the manual creation of orders requesting services on behalf of patients in a
hospital involves numerous tasks performed not only by the ordering physician, but by nursing and clerical staff

The Health Information Technology for
Economic and Clinical Health (HITECH) Act of 2009,
signed into law as Title XIII of the
American Recovery and Reinvestment Act of
2009 (ARRA) economic stimulus bill, provided
almost $30 billion as an incentive for
hospitals and physician practices to acquire
and implement Electronic Medical Records
(EMRs). As a result, during subsequent
years, over 90% of hospitals and physician
practices implemented EMRs. Consolidation
among the vendor community, particularly
for inpatient products, occurred during the
same period, with companies like Epic, Cerner,
Meditech and Allscripts dominating the market
and at the same time enhancing their
product suites to incorporate most of the
HCIS functionality needed by hospitals. This
has led to the demise of the “best of breed”
strategy we mentioned earlier, as hospitals
focused more on single vendor solutions. The
market for ambulatory systems, on the other
hand, has remained highly fragmented, with
as many as 700 vendors competing for market
share.

Benefits from reduced tasks for Provider require additional “complementary” management changes

Physician Tasks (reduced from 13 steps to 6)
1. Logon to system
2. Select a patient
3. Review clinical results
4. Enter new order(s)
5. Sign orders electronically
6. Logoff

Changes for Staff Roles and Tasks
• Change role of Provider
• Change job description of Clerk
• Change job description of Nurse
• Change order intake workflow for all ancillaries
• Change order notification procedures for Nursing
• Change order notification procedures for Clerks
• Consider new role: Medical Assistant
• Change clinical results notification process
• Develop new reports on orders processing, completion timeliness, etc.
• Change to role-based authentication process for all providers

Changes for IS Department
• Develop and periodically test downtime procedures
• Ensure sufficient ordering hardware: desktops, laptops, tablets, etc.
• Ensure sufficient network bandwidth
• Develop 7x24 HELP desk
• Ensure 7x24 Desktop Support
• Ensure 7x24 Network Support
• Ensure 7x24 Application Support

Fig. 16.5 The implementation of an electronic physician order entry system reduces the number of tasks
that a physician needs to perform to enter an order, but such a system will only be successful if other
“complementary” changes are made to both the workflow of staff and the responsibilities of the IS Department

To meet the continually evolving financial
and quality documentation requirements of
today’s health care environment, HCOs must
continually evolve as well, and the analogy
between changing an HCO and turning an
aircraft carrier seems apt. Although an HCO’s
business plans and information-systems
strategies may be reasonable and necessary,
changing ingrained organizational behavior can
be much more complex than changing the
underlying information systems. Technology
capabilities often exceed an HCO’s ability to
use them effectively and efficiently. Successful
process integration requires not only successful
deployment of the technology but also
sustained commitment of resources to use that
technology well; dedicated leadership with
the willingness to make difficult, sometimes
unpopular decisions; education; and possibly
new performance incentives to overcome
cultural inertia and politics. Government
incentives to stimulate HCOs toward the
Meaningful Use of information technology,
which emerged from the 2010 Health care
Reform legislation, are a recent example of
attempts to bring process integration and data
integration together.


16.2.7 Security and Confidentiality 
Requirements 


HCISs are one of the most frequent targets 
of hackers. They seek to gain access to health 
care data, which contains more personalized 
data (and hence is considered more valu- 
able) than what is typically collected in other 
industries. Breaches of health care data stored
in both provider organizations and insurance
companies have increased significantly
over the past several years, often exceeding
breaches of financial organization data found
in organizations such as banks. In addition,
the task of breaking into and stealing health
care data has been superseded by ransomware:
hackers blocking access to the data and
demanding payment (often in bitcoin, which 
tends to be less traceable than cash payments). 
Using ransomware is much easier for hackers 
since data does not need to be moved and then 
offered for sale; payments come directly from 
the organization that was hacked in order to 
obtain the code needed to unblock access. 

The protection of health information 
from unwanted or inappropriate use is gov- 
erned not only by the trust of patients in their 
health providers but also by law. In accor- 
dance with the Health Insurance Portability 
and Accountability Act (HIPAA) of 1996 
(Chap. 12), the Secretary of Health and
Human Services recommended that “Congress 
enact national standards that provide funda- 
mental privacy rights for patients and define 
responsibilities for those who serve them.” 
This law and subsequent federal regulations 
now mandate standardized data transactions 
for sending data to payer organizations, the
development of and adherence to formal policies
for securing and maintaining access to patient
data, and, under privacy provisions, prohibit
disclosure of patient-identifiable information 
by most providers and health plans, except as 
authorized by the patient or explicitly permit- 
ted by legislation. Subsequent updates to the 
HIPAA regulations have strengthened consid- 
erably the requirements for security and pri- 
vacy protections and have also given patients 
the right to pursue actions against both orga- 
nizations and individuals when they feel that 
their personal information has been compro- 
mised. HIPAA also provides consumers with 
significant rights to be informed about how 
and by whom their health information will 
be used, and to inspect and sometimes amend 
their health information. Stiff criminal pen- 
alties including fines and possible imprison- 
ment are associated with noncompliance or 
the knowing misuse of patient-identifiable 
information. 


Computer systems can be designed to pro- 
vide security, but only people can promote the 
trust necessary to protect the confidentiality 
of patients’ clinical information. In fact, most 
breaches and inappropriate disclosures stem
from human actions rather than from com- 
puter system failures. To achieve the goal of 
delivering coordinated and cost-effective care, 
clinicians need to access information on spe- 
cific patients from many different locations. 
However, it is difficult to predict in advance 
which clinicians will need access to which 
patient data and from which locations. And 
unfortunately, there is an inverse relationship 
between ease of access and security robust- 
ness. Therefore, an HCIS must strike a bal- 
ance between restricting information access, 
enabling health care workers to do their jobs, 
and ensuring the accountability of the users 
of patient information. 

With federal requirements to provide 
patients with access to their data through 
portals, the risk of inappropriate access only 
increases. In addition, more and more devices 
(e.g., ventilators, monitoring machines, 
IV pumps) are connected to HCOs’ internal
communications networks, making the
Internet of Things (IoT) a reality in health
care. To build trust with its patients and meet 
HIPAA requirements, an HCO should adopt 
a three-pronged approach to securing infor- 
mation. First, the HCO needs to designate 
a security officer (and typically a privacy 
officer as well) and develop uniform security 
and confidentiality policies, including speci- 
fication of sanctions, and to enforce these 
policies rigorously. Second, the HCO needs 
to train employees so they understand the 
appropriate uses of patient-identifiable infor- 
mation and the consequences of violations. 
Third, the HCO must use electronic tools 
such as intrusion detection, access controls 
and audit trails not only to discourage misuse 
of information, but also to inform employees 
and patients that people who access confidential
information without proper authorization
or a “need to know” can be tracked
and will be held accountable. 
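
The third prong, access controls combined with audit trails, can be sketched as follows. This is a minimal illustration, assuming a simple role-to-permission table; the role names, action names, and the `request_access` and `denied_attempts` helpers are hypothetical, and a real HCIS would persist the log in tamper-resistant storage rather than in a Python list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    timestamp: datetime
    user: str
    patient_id: str
    action: str
    allowed: bool


# Hypothetical mapping from role to the actions that role may perform.
ROLE_PERMISSIONS = {
    "physician": {"read_chart", "write_order"},
    "nurse": {"read_chart"},
    "billing_clerk": {"read_billing"},
}

AUDIT_LOG: list[AuditEntry] = []


def request_access(user: str, role: str, patient_id: str, action: str) -> bool:
    """Decide whether the action is permitted and record the attempt either way."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    AUDIT_LOG.append(
        AuditEntry(datetime.now(timezone.utc), user, patient_id, action, allowed)
    )
    return allowed


def denied_attempts(log: list[AuditEntry]) -> list[AuditEntry]:
    """Return the refused requests, which a privacy officer might review first."""
    return [entry for entry in log if not entry.allowed]
```

Every request is logged whether or not it succeeds, which reflects the point made above: the audit trail discourages misuse because users know that access can be tracked, not only because access is blocked.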


16.2.8 The Impact of Health Care
Information Systems 


On average, health care workers in adminis- 
trative departments spend about three-fourths 
of their time handling information; workers
in nursing units spend about one-fourth
of their time on these tasks. The fact is that 
information management in health care orga- 
nizations, even with significant computeriza- 
tion, is a costly activity in terms of both time 
and money. The collection, storage, retrieval, 
analysis, and dissemination of the clinical and 
administrative information necessary to sup- 
port the organization’s daily operations, to 
meet external and internal requirements for 
documentation and reporting, and to support 
short-term and strategic planning remain 
important and time-consuming aspects of the 
jobs of health-care workers. 

Today, the justifications for implement- 
ing HCISs include cost reduction, productiv- 
ity enhancements, and quality and service 
improvement, as well as strategic consider- 
ations related to competitive advantage, patient 
expectations, and regulatory compliance: 
• Cost reduction. Much of the historical
impetus for implementing HCISs was their
potential to reduce the costs of information
management in hospitals and other
facilities largely by reducing the number of
employees. HCOs continue to make tactical
investments in information systems to
streamline administrative processes and
departmental workflows. Primary benefits
that may offset some information-systems
costs include reductions in labor requirements,
reduced waste (e.g., dated surgical
supplies that are ordered but unused or
food trays that are delivered to the wrong
destination and therefore are wasted), and
more efficient management of supplies
and other inventories. Large savings can
be gained through efficient scheduling of
expensive resources such as surgical suites
and imaging equipment. In addition,
HCISs can help to eliminate inadvertent
ordering of duplicate tests and procedures.
Once significant patient data are available
online, information systems can reduce the
costs of storing, retrieving, and transporting
charts in the medical-records department.

• Productivity enhancements. A second area
of benefit from an HCIS comes in the form 
of improved productivity of clinicians and 
other staff. With continuing (and at times 
increasing) constraints on reimbursements, 
HCOs are continually faced with the chal- 
lenge of doing more with less. Providing 
information systems support to staff can 
in many cases enable them to manage a 
larger variety of tasks and data than would 
otherwise be possible using strictly manual 
processes. Interestingly, in some cases hos- 
pital investments in an HCIS support the 
productivity improvement of staff that are 
not employed by the hospital, namely the 
physicians, and can even extend to payers 
by lowering their costs. One of the major 
challenges with introducing a new HCIS is 
that the productivity of users may actually 
decrease in the initial months of the imple- 
mentation. With complex clinical applica- 
tions, learning new ways of working can 
lead to high levels of user dissatisfaction in 
addition to lowered productivity. 

• Quality and service improvement. As HCISs
have broadened in scope to encompass 
support for clinical processes, the ability 
to improve the quality of care has become 
an additional benefit. Qualitative benefits 
of HCISs include improved accuracy and 
completeness of documentation, reduc- 
tions in the time clinicians spend docu- 
menting (and associated increases in time 
spent with patients), fewer drug errors 
and quicker response to adverse events, 
and improved provider-to-provider com- 
munication. Through telemedicine and 
remote linkages (see Chap. 20), HCOs
can expand their geographical reach and 
improve delivery of specialist care to rural 
and outlying areas. Once patient data are 
converted from a purely transaction for- 
mat to a format better suited for analytic 
work, the use of clinical decision-support 
systems in conjunction with a clinically 
focused HCIS can produce impressive benefits,
namely improving the quality of care
while reducing costs (Chap. 24) (Bates
and Gawande 2003; James and Savitz
2011; Goldzweig et al. 2009; Himmelstein
and Woolhandler 2010; McCullough et al. 
2010; Castaneda et al. 2015). 

Competitive advantage. Information tech- 
nologies must be deployed appropriately 
and effectively; however, with respect to 
HCISs, the question is no longer whether 
to invest, but rather how much and what 
to buy. Although some organizations still 
attempt to cost-justify all information-systems
investments, many HCOs have
recognized that HCISs are “enabling
technologies,” which means that the value
comes not from the system itself but from 
what it “enables” the organization to do 
differently and better (Vogel 2003). If 
workflow and processes are not changed 
to take advantage of the technology, the 
value of the investment will largely go 
unrealized. And it is not just the ratio of 
financial benefits to costs that is important; 
access to clinical information is necessary 
not only to carry out patient manage- 
ment, but also to attract and retain the loy- 
alty of physicians who care for (and thus 
control much of the HCO’s access to) the 
patients. The long-term benefits of clini- 
cal systems include the ability to influence 
clinical practices by reducing large unnec- 
essary variations in medical practices, to 
improve patient outcomes, and to reduce 
costs—although these costs might be more 
broadly economic and societal than related 
to specific reductions for the hospital itself 
(Leatherman et al. 2003; James and Savitz 
2011; Gottlieb et al. 2015). Physicians ulti- 
mately control the great majority of the 
resource-utilization decisions in health 
care through their choices in prescrib- 
ing drugs, ordering diagnostic tests, and 
referring patients for specialty care. Thus, 
providing physicians with access to infor- 
mation on “best practices” based on the 
latest available clinical evidence, as well 
as giving them other clinical and financial 
data to make appropriate decisions, is an 
essential HCIS capability. 

Regulatory compliance. Health care is among 
the most heavily regulated industries in our
economy. State and federal regulatory agencies
perform a variety of oversight activities,
and these require increasingly sophisticated 
and responsive HCISs to provide the nec- 
essary reports. For example, the Food and 
Drug Administration now mandates the use 
of barcodes on all drugs. Similarly, HIPAA 
rules specify how personal health informa- 
tion must be managed as well as the required 
content and format for certain electronic 
data transactions for those HCOs that 
exchange data electronically. OSHA, the 
Department of Labor, the Environmental 
Protection Agency, the Nuclear Regulatory 
Commission, and a host of other agencies 
all have an interest in seeing that the health 
care provided by HCOs is consistent with 
standards of safety and fairness. 

Consumerism and patient expectations.
As medical breakthroughs promise bet- 
ter outcomes for patients, the patients as 
consumers are becoming more involved. 
Federal mandates have driven the cre- 
ation of Patient Portals, essentially dedi- 
cated web sites for patients to access and 
in some cases contribute to their own 
medical records. There are limitations
to patient involvement (e.g., to date
the number of patients accessing and
contributing through portals has not met
expectations, and older, sicker patients
often have limited abilities to access and/or
understand their own data), but it is likely
that trends toward more patient involvement
will continue.


16.2.9 Managing Information 
Systems in a Changing 
Health Care Environment 


Despite the importance of integrated informa- 
tion systems, implementation of HCISs has 
proved to be a daunting task, often requiring 
a multiyear capital investment of hundreds of 
millions of dollars and forcing fundamental 
changes in the kinds of work health care
professionals do and in how they do it. To achieve
the potential benefits, HCOs must plan carefully
and invest wisely. The grand challenge
for an HCO is to implement an HCIS that is
sufficiently flexible and adaptable to meet the 
changing needs of the organization. Given 
the rapidly changing environment and the 
multiyear effort involved, people must be 
careful to avoid implementing a system that 
is obsolete functionally or technologically 
before it becomes operational. Success in 
implementing an HCIS entails consistent and 
courageous handling of numerous technical, 
organizational, and political challenges. 


16.2.10 Changing Technologies 


As we discussed in Chap. 6, dramatic
changes in computing and networking technol- 
ogies are continuing to occur. These advances 
are important in that they allow quicker and 
easier information access, less expensive com- 
putational power and data storage, greater 
flexibility, and other performance advantages. 
A major challenge for many HCOs is deciding
whether to continue supporting a best-of-breed
strategy, with its requirement to upgrade
individual systems and interfaces to newer
products, or to migrate from their patchwork
of legacy systems to a more integrated systems
environment. Such migration requires integration
and selective replacement of diverse systems
that are often implemented with closed
or nonstandard technologies and medical
vocabularies. Unfortunately, the trade-off
in migrating from best-of-breed to more
integrated systems is that vendors offering
more integrated approaches seldom match the
functionality of historical best-of-breed
environments, although with each new software
version the gap lessens. As a result, best-of-breed
strategies are becoming less of an option,
since commercial vendors are broadening
and deepening the scope of their application
suites to minimize the challenges of building
and managing interfaces and to protect their
market share. In a sense, it is the information
content of the systems and the ability to implement
them that are much more important than
the underlying technology; as long as the
data are accessible, the choice of specific
technology is less critical.


16.2.11 Changing Culture


In the current health care environment, physi- 
cians are confronted with significant obstacles 
to the practice of medicine as they have his- 
torically performed it. Physicians’ long his- 
tory of entrepreneurial practice is changing 
as more and more physicians work directly as 
employees of HCOs and fewer and fewer own 
their own practices. As a result, physicians 
face significant adjustments as they are con- 
fronted by pressures to practice in accordance 
with institutional standards aimed at reduc- 
ing variation in care, and to focus on the costs 
of care regardless of whether those costs are 
borne by HCOs or by third party payers. They 
are expected to assume responsibility not sim- 
ply for healing the sick, but for the wellness 
of people who come to them not as patients 
but as members of health plans and health 
maintenance organizations. In addition, they 
must often work as members of collaborative 
patient-care teams. The average patient length 
of stay in a hospital is decreasing; at the same 
time, the complexity of the care provided 
both during and after discharge is increasing. 
The time allotted for an individual patient 
visit in an ambulatory setting is decreasing 
as individual clinicians face economic incen- 
tives to increase the number of patients for 
whom they care each day. Some HCOs, aided 
by federal funding incentives, are now instituting
pay-for-performance incentives to
reward desired work practices. At the same 
time, it is well known that the amount of 
knowledge about disease diagnosis and treat- 
ment increases significantly each year, with 
whole new areas of medicine being added 
from major breakthroughs in areas such as 
genomic and imaging research. To cope with 
the increasing workload, greater complexity 
of care, extraordinary amounts of new medi- 
cal knowledge, new skills requirements, and 
the wider availability of medical knowledge 
to consumers through the Internet, both cli- 
nicians and health executives must become 
more effective information managers, and the 
supporting HCISs must meet their ever more 
complex workflow and information requirements.
As the health care culture and the roles
of clinicians and health executives continue
to change, HCOs must constantly reevaluate 
the capabilities of information technology to 
ensure that the implemented systems continue 
to match user requirements and expectations. 


16.2.12 Changing Processes 


Developing a new vision of how health care 
will be delivered and managed, designing processes,
and implementing supporting information
systems are all critical to the success
of evolving HCOs. Changes in workflow 
processes affect the jobs that people do, the 
skills required to do those jobs, and the fun- 
damental ways in which they relate to one 
another. For example, models of care man- 
agement that cross organizational or specialty 
boundaries encourage interdisciplinary care 
teams to work in harmony to promote health 
as well as treat illness. Although information 
systems are not the foremost consideration for 
people who are redesigning processes, a poor 
information-systems implementation can 
institutionalize bad processes. 

HCOs periodically undertake various pro- 
cess redesign initiatives (following models such 
as Six Sigma, Kaizen, or Lean), and these
initiatives can lead to fundamental transforma- 
tions of the enterprise. Indeed, work process 
redesign is essential if information systems are 
to become truly valuable “enablers” in HCOs. 
Too often, however, the lack of a clear under- 
standing of existing organizational dynamics 
leads to a misalignment of incentives—a signif- 
icant barrier to change—or to the assumption 
that simply installing a new computer system 
will be sufficient to generate value. Moreover, 
HCOs, like many organizations, are collections 
of individuals who often have natural fears 
about and resistance to change. Even under 
the best of circumstances, there are limits to 
the amount of change that any organization 
can absorb. The magnitude of work required 
to plan and manage organizational change is 
often underestimated or ignored. The han- 
dling of people and process issues has emerged 
as one of the most critical success factors for 
HCOs as they implement new work methods 
and new and upgraded information systems. 


16.2.13 Changing Sources of Data 


Historically the sources of data for electronic 
systems in health care were relatively limited. 
Transactions resulting from activities like 
lab tests, radiological studies, medications 
prescribed and given, and specific clinical 
activities such as inpatient encounters with 
physicians or outpatient visits were recorded 
and charges for these transactions were then 
posted to a patient’s account. Once the inpa- 
tient stay or the ambulatory visit was com- 
pleted, the charges were collected into a bill 
which was then sent to the patient’s third- 
party payer (e.g., Medicare, Medicaid or a 
private insurance company) or to the patient 
directly. The amount of actual data processed 
from these transactions is relatively small. 
Even a complex series of clinical encounters 
between a patient and a physician can be captured
in as little as 400 to 500 KB of data, or the
equivalent of about 200 pages of text.

In recent years, however, there has been 
an explosion of data, both in terms of volume 
and of complexity. Data sources that create
images (e.g., radiological studies, pathology
slides) can generate very large data sets.
An individual projection image can
range from 8 to 32 MB in size, and a digital
mammography image can range from 8 to
50 MB. CT and MRI scans generate relatively
small individual images, but complete studies
can include literally thousands of images and
grow to gigabytes in size. As imaging technology
continues to develop, with both more detailed
individual images and a greater number of
images per study, the amount of data being
collected and stored can be expected to grow
exponentially.
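
The way many small images add up to a large study can be illustrated with a short calculation. The following Python sketch is illustrative only: the 512 x 512 matrix, 2 bytes per pixel, and 2,000-slice study are assumed values chosen for the example, not figures from this chapter, and the function name is hypothetical.

```python
# Rough, uncompressed storage estimate for a cross-sectional imaging study.
# Every figure below is an assumption made for illustration.

MIB = 1024 ** 2
GIB = 1024 ** 3


def study_size_bytes(rows: int, cols: int, bytes_per_pixel: int, n_images: int) -> int:
    """Pixel data only: rows x cols x bytes per pixel x number of images."""
    return rows * cols * bytes_per_pixel * n_images


# Assumed single slice: 512 x 512 matrix stored at 2 bytes per pixel.
slice_bytes = study_size_bytes(512, 512, 2, 1)

# Assumed complete study: 2,000 such slices.
study_bytes = study_size_bytes(512, 512, 2, 2000)

print(f"one slice:          {slice_bytes / MIB:.2f} MiB")   # 0.50 MiB
print(f"2,000-slice study:  {study_bytes / GIB:.2f} GiB")   # 0.98 GiB
```

Under these assumptions a single slice is only half a mebibyte, yet the complete study approaches a gibibyte before any compression, which is consistent with the point that study size is driven by image count rather than by the size of any one image.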

In addition to the growth in data from 
these more traditional modalities, new data 
sources are being added to data already being 
collected and stored. As a result, HCISs must 
continue to evolve to incorporate access to 
this data. Since the completion of the initial 
Human Genome Project in 2003,
for example, data from genomic studies has 
become an increasingly important source for 
clinicians. Sequencing machines today can 
produce a million times more data than what 
was collected in 2004, and more sequences
can be run in an hour than were produced in
the previous decade. Not only is this a huge 
amount of data, it is in a format quite different
from historical clinical data; hence the
increase in complexity as well as volume.

We are also seeing an explosion of data 
from another source: wearables. These are 
devices that individuals wear on their arm or 
wrist, collecting data on various aspects of a 
person’s physical health. It has been predicted 
that over the next few years, there could be 
as many as 600 to 700 million wearable devices
worldwide. 

One example of a device measuring 
health data is the continuous glucose monitor 
(CGM). CGMs can capture close to 300 data 
points each day or approximately 110,000 
data points annually. With close to 30 million
people diagnosed with diabetes in the US,
even if only half of those use a CGM, there
could be close to 1.65E12 (about 1.65 trillion)
data points collected annually.
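
The CGM volume estimate can be checked with a few lines of arithmetic. The sketch below uses only the rounded figures quoted in the text (about 300 readings per day and about 30 million people with diabetes); the function and constant names are hypothetical. It computes the total both for half of that population and for all of it, since the two differ by a factor of two.

```python
# Back-of-the-envelope check of annual CGM data volume.
# Inputs are the rounded figures from the text; names are illustrative.

READINGS_PER_DAY = 300               # "close to 300 data points each day"
DAYS_PER_YEAR = 365
US_DIABETES_POPULATION = 30_000_000  # "close to 30 million people"


def annual_points(users: int, per_day: int = READINGS_PER_DAY) -> int:
    """Data points generated per year by the given number of CGM users."""
    return users * per_day * DAYS_PER_YEAR


per_person = annual_points(1)                                # 109,500
half_adoption = annual_points(US_DIABETES_POPULATION // 2)   # about 1.64E12
full_adoption = annual_points(US_DIABETES_POPULATION)        # about 3.29E12

print(f"per person per year: {per_person:,}")
print(f"half of population:  {half_adoption:.3e}")
print(f"entire population:   {full_adoption:.3e}")
```

With these inputs, one person generates roughly 110,000 points a year, half of the diagnosed population generates on the order of 1.6 trillion, and the entire population generates on the order of 3.3 trillion.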

New sources of data are being created 
almost constantly. As electronic sensors and 
connectivity are increasingly embedded in 
devices, many of them designed to moni- 
tor and manage the health of individuals, 
the amount of data collected will continue 
to grow. With the Internet as the backbone, 
there is almost no limit to what can be mea- 
sured and connected electronically resulting 
in a true Internet of Things (IoT). 

The challenge of an HCIS is to enable cli- 
nicians to access these new sources of data, 
to understand their importance and relevance, 
and to then use them to enhance their diag- 
nostic and therapeutic capabilities, leading 
hopefully to better outcomes for their patients. 


16.2.14 Management and Governance


Figure 16.6 illustrates the information-technology
environment of an HCO composed
of two hospitals, an owned physician
practice, affiliated nursing homes and hospice, 
and several for-profit service organizations. 
Even this relatively simple environment pres- 
ents significant challenges for the management 
and governance of information systems. For
example, to what extent will the information
management function be controlled centrally 
versus decentralized to the individual oper- 
ating units and departments? How should 
limited resources be allocated between new 
investment in strategic projects (such as office- 
based data access for physicians) and the often 
critical operational needs of individual enti- 
ties (e.g., replacement of an obsolete labora- 
tory information system)? Academic medical 
centers with distinct research and educational 
needs raise additional issues for managing 
information across operationally independent 
and politically powerful constituencies. 

Trade-offs between functional and inte- 
gration requirements, and associated conten- 
tion between users and information-systems 
departments, will tend to diminish over time 
with the development and widespread adop- 
tion of technology standards and common 
clinical-data models and vocabulary. On the 
other hand, an organization’s information- 
systems “wants” and “needs” will always 
outstrip its ability to deliver these services. 
Political battles will persist, as HCOs and 
their component operating units wrestle with 
the age-old issues of how to distribute scarce 
resources among competing, similarly worthy 
projects. 

A formal HCIS governance structure with 
representation from all major constituents 
provides a critical forum for direction setting, 
prioritization, and resource allocation across 
an HCO. Leadership by respected clinical 
peers has proved a critical success factor for 
clinical systems planning, implementation, 
and acceptance. In addition, the creation of
an Information Systems Advisory or Steering
Committee composed of the leaders of the
various constituencies within the HCO can be
a valuable exercise if the process engages the
organization’s clinical, financial, and administrative
leadership and users, gives them
a clear understanding of the highest-priority
information technology investment requirements,
and provides a sense of accountability and ownership
over the HCISs and their various functions
(Vogel 2006). This supports one of the principles
of information technology governance:
how an institution makes IT investment decisions
is often more important than what specific
decisions are made (Weill and Ross 2004;
Haddad et al. 2018). Because of the dynamic
nature of both health care business strategies
and the supporting technologies, many HCOs
have seen the timeframes of their strategic
information-management thinking shrink
from 5 years to 3, and then be changed yet
again through annual updates.

Fig. 16.6 An example of an information systems
environment for a small integrated delivery network
(IDN). Even this relatively simple IDN has
complex information management challenges for the
organization


16.3 Functions and Components
of a Health Care Information
System


Carefully designed computer-based information
systems can increase the effectiveness and
productivity of health professionals, improve
the quality and reduce the costs of health
services, and improve levels of service and of
patient satisfaction. As described in Sect.
16.1, the HCISs support a variety of functions, 
ranging from the delivery and management 
of patient care to the administration of the 
HCO. From a functional perspective, HCISs 
typically consist of components that support 
five distinct purposes: (1) patient management 
and billing, (2) ancillary services, (3) care 
delivery and clinical documentation, (4) clinical
decision support, and (5) institutional financial
and resource management (Fig. 16.7).


16.3.1 Patient Management and Billing


Systems that support patient management 
functions perform the basic HCO operations 
related to patients, such as registration, scheduling,
admission, discharge, transfer among
locations, and billing. Historically within
HCOs, maintenance of the hospital census
and a patient billing system were the first tasks
to be automated, largely because a patient’s
location determined not only the daily room/bed
charges (since an ICU bed was more
expensive than a regular medical/surgical bed)
but also where medications were to be delivered
and where clinical results were to be posted.
Today, virtually all hospitals and ambulatory
centers and many physician offices use a
computer-based master patient index (MPI)
to store patient-identification information
that is acquired during the patient-registration
process, and to link to simple encounter-level
information such as dates and locations where
services were provided. The MPI can also be
integrated within the registration module of
an ambulatory care or physician-practice system
or even elevated to an enterprise master
patient index (EMPI) across several facilities.
Within the hospital setting, the census is maintained
by the admission-discharge-transfer
(ADT) module, which is updated whenever a
patient is admitted to the hospital, discharged
from the hospital, or transferred from one bed
to another.

Fig. 16.7 The evolution of computing systems in
hospitals has followed a path that parallels the evolution
of computing systems in general. From mainframes to
minicomputers to desktops, and more recently mobile
devices, the purpose and function of systems in hospitals
has followed a path from financial systems to
departmental systems to systems designed specifically to
enhance the productivity and raise the quality of health
care services

Registration and patient census data serve
as a reference base for the financial programs
that perform billing functions. When an HCIS
is extended to other patient-care settings
(e.g., to the laboratory, pharmacy, and other
ancillary departments), patient-management
systems provide a common reference base for
the basic patient demographic data needed 
by these systems. Without access to the cen- 
tralized database of patient financial, demo- 
graphic, registration and location data, these 
subsystems would have to maintain duplicate 
patient records. In addition, the transmission 
of registration data can trigger other activi- 
ties, such as notification of hospital house- 
keeping when a bed becomes available after a 
patient is discharged. The billing function in 
these systems serves as a collection point for 
all the chargeable patient activity that occurs 
in a facility, including room/bed charges, 
ancillary service charges, and supplies used 
during a patient’s stay. 
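
The way census updates can trigger downstream activities, such as the housekeeping notice mentioned above, can be sketched in a few lines. The following Python sketch is illustrative only: the Census class, its method names, and the bed labels are hypothetical and are not drawn from any particular ADT product or messaging standard.

```python
# Minimal in-memory census that reacts to admit/transfer/discharge events.
# Class, method, and bed names are hypothetical, for illustration only.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Census:
    bed_of: Dict[str, str] = field(default_factory=dict)          # patient id -> bed
    housekeeping_queue: List[str] = field(default_factory=list)   # beds to clean

    def _require_free(self, bed: str) -> None:
        if bed in self.bed_of.values():
            raise ValueError(f"bed {bed} is already occupied")

    def admit(self, patient_id: str, bed: str) -> None:
        self._require_free(bed)
        self.bed_of[patient_id] = bed

    def transfer(self, patient_id: str, new_bed: str) -> None:
        self._require_free(new_bed)
        old_bed = self.bed_of[patient_id]
        self.bed_of[patient_id] = new_bed
        self.housekeeping_queue.append(old_bed)   # vacated bed needs cleaning

    def discharge(self, patient_id: str) -> None:
        old_bed = self.bed_of.pop(patient_id)
        self.housekeeping_queue.append(old_bed)   # notify housekeeping


census = Census()
census.admit("P001", "ICU-1")
census.transfer("P001", "MS-12")
census.discharge("P001")
print(census.housekeeping_queue)   # ['ICU-1', 'MS-12']
```

Because every ancillary system reads the same census, a single update here is enough to keep bed charges, medication delivery locations, and housekeeping worklists consistent.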

Scheduling in a health care organization is 
complicated because patient load and resource 
utilization can vary by day, week, or season or 
even through the course of a single day simply
due to chance, to emergencies that arise, or
to patterns of patient and physician behavior. 
Effective resource management requires that 
the appropriate resources be on hand to meet 
such fluctuations in demand. At the same 
time, resources should not remain unneces- 
sarily idle since that would result in their inef- 
ficient use. The most sophisticated scheduling 
systems have been developed for the operat- 
ing rooms and radiology departments, where 
scheduling challenges include matching the 
patient not only with the providers but also 
with special equipment and support staff 


16 


566 L. H. Vogel and W. C. Reed 


such as technicians. Patient-tracking applica- 
tions monitor patient movement in multistep 
processes; for example, they can monitor, esti- 
mate, and manage patient wait times in the 
emergency department. 
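
The core of the scheduling problem described above, finding a time when the provider, the room, and the support staff are all available, can be sketched as a set intersection. This is a deliberately simplified Python sketch: the resource names, the hour-of-day slots, and the function name are hypothetical, and a production scheduler would also account for procedure duration, priorities, and cancellations.

```python
# Find the earliest time slot at which every required resource is free.
# Resource names and slot values are hypothetical, for illustration only.
from typing import Dict, Iterable, Optional, Set


def earliest_common_slot(
    free_slots: Dict[str, Set[int]], required: Iterable[str]
) -> Optional[int]:
    """Return the earliest slot shared by all required resources, or None."""
    needed = list(required)
    if not needed:
        return None
    # A resource with no entry is treated as having no free slots.
    common = set.intersection(*(set(free_slots.get(r, set())) for r in needed))
    return min(common) if common else None


free = {
    "surgeon": {9, 10, 13},
    "or_suite": {10, 11, 13},
    "anesthesia_tech": {13, 14},
}

print(earliest_common_slot(free, ["surgeon", "or_suite"]))                     # 10
print(earliest_common_slot(free, ["surgeon", "or_suite", "anesthesia_tech"]))  # 13
```

Adding a third required resource pushes the earliest feasible time from 10 to 13, which shows why schedules that must coordinate equipment and technicians are harder to fill than those that match a patient with a provider alone.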

Within a multi-facility HCO, the basic tasks 
of patient management are compounded by the 
need to manage patient care across multiple set- 
tings, some of which may be supported by inde- 
pendent information systems. Is the Patricia 
C. Brown who was admitted last month to 
Mountainside Hospital the same Patsy Brown 
who is registering for her appointment at the 
Seaview Clinic? Integrated delivery networks 
ensure unique patient identification either 
through conversion to common registration 
systems or, more frequently, through implemen- 
tation of an EMPI that links patient identifiers 
and data from multiple registration systems. 
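
The Patricia/Patsy Brown question can be made concrete with a small matching sketch. The Python example below is a simplified, deterministic illustration: the nickname table, the Registration fields, the local identifiers, and the date of birth are all hypothetical. EMPI products typically weigh many more attributes and often use probabilistic scoring, which this sketch does not attempt.

```python
# Toy record-linkage rule: same canonical first name, last name, and birth date.
# The nickname table and all record values are hypothetical.
from dataclasses import dataclass
from datetime import date

NICKNAMES = {"patsy": "patricia", "pat": "patricia", "trish": "patricia"}


@dataclass(frozen=True)
class Registration:
    source: str      # registering facility
    local_id: str    # identifier within that facility's system
    first: str
    last: str
    dob: date


def canonical_first(name: str) -> str:
    """Lowercase, drop middle initials, and map known nicknames."""
    parts = name.strip().lower().split()
    first = parts[0] if parts else ""
    return NICKNAMES.get(first, first)


def probable_match(a: Registration, b: Registration) -> bool:
    return (
        canonical_first(a.first) == canonical_first(b.first)
        and a.last.strip().lower() == b.last.strip().lower()
        and a.dob == b.dob
    )


hospital = Registration("Mountainside Hospital", "M-1001", "Patricia C.", "Brown", date(1960, 1, 1))
clinic = Registration("Seaview Clinic", "S-2002", "Patsy", "Brown", date(1960, 1, 1))

print(probable_match(hospital, clinic))   # True
```

When two registrations match, an EMPI would link the two local identifiers under one enterprise identifier rather than merging the underlying records, so each source system can keep its own numbering.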


16.3.2 Ancillary Services 


Ancillary departmental systems support 
the information needs of individual clinical 
departments within an HCO. From a systems
perspective, the areas most commonly
automated are the laboratory, pharmacy, radiology,
blood bank, operating rooms, and medical-records
departments, but automation can also extend to
specialized systems that support cardiology (for
EKGs), respiratory therapy, and social work.
Such systems serve a dual purpose within an 
HCO. First, ancillary systems perform many 
dedicated tasks required for specific depart- 
mental operations. Such tasks include gener- 
ating specimen-collection lists and capturing 
results from automated laboratory instru- 
ments in the clinical laboratory, printing 
medication labels and managing inventory in 
the pharmacy, and scheduling examinations 
and supporting the transcription of image 
interpretations in the radiology department. 
In addition, information technology coupled 
with robotics can have a dramatic impact on 
the operation of an HCO’s ancillary depart- 
ments, particularly in pharmacies (to sort and 
fill medication carts) and in clinical laborato- 
ries (where in some cases the only remaining 
manual task is the collection of the specimen 
and its transport to the laboratory’s robotic
system). Second, the ancillary systems contribute
major data components to online patient
records, including laboratory-test results and
pathology reports, medication profiles, digital
images (see Chap. 22), records of blood
orders and usage, and various transcribed
reports, including history and physical examinations
and operating room and radiology reports.
HCOs that consolidate ancillary functions 
outside hospitals to gain economies of scale— 
for example, creating outpatient diagnostic 
imaging centers and reference laboratories— 
increase the complexity of integrated patient 
management, financial, and billing processes. 


16.3.3 Care Delivery and Clinical 
Documentation 


Electronic medical record (EMR) systems that 
support care delivery and clinical documentation
are discussed at length in Chap. 14.
Although comprehensive EMRs are the ulti- 
mate goal of most HCOs, many organizations 
today are still building more basic clinical- 
management capabilities. Automated order 
entry and results reporting are two important 
functions provided by the clinical components 
of an HCIS. Health professionals can use the 
HCIS to communicate with ancillary depart- 
ments electronically, eliminating the easily 
misplaced paper slips or the transcription 
errors often associated with translating hand- 
written notes into typed requisitions, thus 
minimizing delays in conveying orders. The 
information then is available online, where it 
is easily accessible by any authorized health 
professional who needs to review a patient’s
medication profile or previous laboratory-test 
results. Ancillary departmental data represent 
an important subset of a patient’s clinical 
record. A comprehensive clinical record, how- 
ever, also includes various data that clinicians 
have collected by questioning and observing 
the patient, including the history and physi- 
cal report, progress notes and problem lists. 
In the hospital, an HCIS can help health per- 
sonnel perform an initial assessment when a 
patient is admitted to a unit, maintain patient- 
specific care plans, chart vital signs, maintain 
medication-administration records, record
diagnostic and therapeutic information, document
patient and family teaching, and plan for
discharge (also see Chap. 17). Many organizations
have developed diagnosis-specific
clinical pathways that identify clinical goals, 
interventions, and expected outcomes by time 
period. Using clinical pathways, case man- 
agers or care providers can document actual 
versus expected outcomes and are alerted 
to intervene when a significant unexpected 
event occurs. More hospitals are now implementing
what are called closed-loop medication
management systems, in which every task
from the initial order for medication to its
administration to the patient is recorded in
an HCIS, one outcome of increased attention
to patient safety issues.
With the shift toward delivering more care 
in outpatient settings, clinical systems have 
become more common in ambulatory clinics 
and physician practices. Numerous vendors 
have introduced software compatible with 
smart phones, tablets, and other mobile devices 
designed specifically for physicians in ambula- 
tory settings, so that they can access appropriate 
information even as they move from one exam 
room to another. Such systems allow clinicians
to record problems and diagnoses, symptoms
and physical examinations, medical and social
history, review of systems, functional status,
and active and past prescriptions, and they
provide access to therapeutic and medication guidelines.
The most successful systems are integrated 
with a practice management system, providing 
additional support for physician workflow and 
typical clinic functions, for example, by docu- 
menting telephone follow-up calls or printing 
prescriptions. In addition, specialized clinical 
information systems have been developed to 
meet the specific requirements of intensive-care 
units (see Chap. 21), long-term care facilities,
home-health organizations, and specialized 
departments such as cardiology and oncology. 


16.3.4 Clinical Decision Support 


Clinical decision-support systems (Chap.
24) directly assist clinical personnel in data 
interpretation and decision-making. Once the 
basic clinical components of an HCIS are well
developed, clinical decision-support systems
can use the information stored there to moni- 
tor patients and issue alerts, to make diagnos- 
tic suggestions, to provide limited therapeutic 
guidance, and to provide information on med- 
ication costs. These capabilities are particu- 
larly useful when they are integrated with 
other information-management functions. For 
example, a useful adjunct to computer-based 
physician order-entry (CPOE) is a decision- 
support program that alerts physicians to 
patient food or drug allergies; helps physi- 
cians to calculate patient-specific drug-dosing 
regimens; performs advanced order logic, such 
as recommending an order for prophylactic 
antibiotics before certain surgical procedures; 
automatically discontinues drugs when appro- 
priate or prompts the physician to reorder 
them; suggests more cost-effective drugs with 
the same therapeutic effect; or activates and 
displays applicable clinical-practice guidelines 
(see Chap. 24). Clinical-event monitors integrated
with results-reporting applications can
alert clinicians to abnormal results and drug 
interactions by electronic mail, text message 
or page. In the outpatient setting, these event 
monitors may produce reminders to provide 
preventive services such as screening mammo- 
grams and routine immunizations. The same 
event monitors might trigger access to the 
HCO’s approved formulary, displaying infor- 
mation that includes costs, indications, contra- 
indications, approved clinical guidelines, and 
relevant online medical literature (Kaushal 
et al. 2003). Unfortunately, an overabundance
of alerts has also caused care givers to experience
alarm fatigue, leading them to ignore
potentially critical warnings.
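
The allergy and drug-interaction checks described above can be sketched in a few lines. This is a minimal illustration only, not how any particular CPOE product works: the `Patient` record, the `INTERACTIONS` table, and the `check_order` function are hypothetical names, and a production system would draw on a curated drug knowledge base rather than a hard-coded set.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    name: str
    allergies: set[str] = field(default_factory=set)    # recorded drug allergies
    active_meds: set[str] = field(default_factory=set)  # current medication list


# Hypothetical knowledge base: unordered drug pairs that should raise an alert.
INTERACTIONS = {frozenset({"warfarin", "aspirin"})}


def check_order(patient: Patient, drug: str) -> list[str]:
    """Return alert messages for a proposed drug order (empty list means no alerts)."""
    alerts = []
    if drug in patient.allergies:
        alerts.append(f"ALLERGY: {patient.name} has a recorded allergy to {drug}")
    for med in sorted(patient.active_meds):
        if frozenset({drug, med}) in INTERACTIONS:
            alerts.append(f"INTERACTION: {drug} with active medication {med}")
    return alerts


patient = Patient("Test Patient", allergies={"penicillin"}, active_meds={"warfarin"})
for proposed in ("penicillin", "aspirin", "acetaminophen"):
    print(proposed, check_order(patient, proposed))
```

Returning a list of alerts, rather than interrupting the clinician once per rule, leaves room for the calling application to filter or prioritize messages, which is one way a design might try to limit alarm fatigue.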


16.3.5 Financial and Resource Management


Financial and administrative systems assist 
with the traditional business functions of an 
HCO, including management of the payroll, 
human resources, general ledger, accounts pay- 
able, and materials purchasing and inventory. 
Most of these data-processing tasks are well 
structured and have been historically labor 
intensive and repetitious—ideal opportunities 
for substitution with computers. Furthermore,
with the exception of patient-billing functions, 
the basic financial tasks of an HCO do not dif- 
fer substantially from those of organizations 
in other industries. Not surprisingly, financial 
and administrative applications have typically 
been among the first systems to be standard- 
ized and centralized in IDNs. 

Conceptually, the tasks of creating a 
patient bill and tracking payments are straight- 
forward, and financial transactions such as 
claims submission and electronic funds trans- 
fer have been standardized to allow electronic 
data interchange (EDI) among providers 
and payers. In operation, however, patient 
accounting requirements are complicated by 
the myriad and oft-changing reimbursement 
requirements of government and third-party 
payers. These requirements vary substantially 
by payer, by insurance plan, by type of facil- 
ity where service was provided, and often by 
state. As the burden of financial risk for care 
has shifted from third-party payers to providers
(through per diem or diagnosis-based
reimbursements), these systems have become even
more critical to the operation of a successful 
HCO. As another example, managed care con- 
tracts add even more complexity, necessitating 
processes and information systems to check a 
patient’s health-plan enrollment and eligibility 
for services, to manage referrals and preautho- 
rization for care, to price claims based on nego- 
tiated contracts, and to create documentation 
required to substantiate the services provided. 

As HCOs increasingly go “at risk” for 
delivery of health services by negotiating per 
diem, diagnosis-based, bundled and capitated 
payments, their incentives need to focus not 
only on reducing the cost per unit service 
but also on maintaining the health of mem- 
bers while using health resources effectively 
and efficiently. Similarly, the HCO’s scope of 
accountability broadens from a relatively small 
population of sick patients to a much larger 
population of plan members (such as might be 
found in ACOs), most of whom are still well. 

Provider-profiling systems support uti- 
lization management by tracking each pro- 
vider’s resource utilization (costs of drugs 
prescribed, diagnostic tests and procedures 
ordered, and so on) compared with severity-adjusted
outcomes of that provider’s patients,
such as their rate of hospital readmission and
mortality by diagnosis. Such systems are also 
being used by government bodies and con- 
sumer advocate organizations as they publi- 
cize their findings, often through the Internet. 
Contract-management systems have capa- 
bilities for estimating the costs and payments 
associated with potential managed care con- 
tracts and comparing actual with expected 
payments based on the terms of the contracts. 
More advanced managed-care information 
systems handle patient triage and medical 
management functions, helping the HCOs to 
direct patients to appropriate health services 
and to proactively manage the care of chroni- 
cally ill and high-risk patients. Health plans, 
and IDNs that incorporate a health plan, 
also must support payer and insurance func- 
tions such as claims administration, premium 
billing, marketing, and member services. 
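
The comparison of actual with expected payments that a contract-management system performs can be illustrated with a short sketch. Everything here is assumed for illustration: the `Claim` record, the fixed-rate table `CONTRACT_RATES`, and the percent-of-charges fallback are hypothetical and far simpler than the terms of real managed care contracts.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    claim_id: str
    service_code: str
    billed: float
    paid: float


# Hypothetical contract terms: a fixed rate per service code,
# with a percent-of-billed-charges fallback for everything else.
CONTRACT_RATES = {"OFFICE_VISIT": 120.00, "MRI": 850.00}
DEFAULT_PERCENT_OF_CHARGES = 0.60


def expected_payment(claim: Claim) -> float:
    """Price a claim according to the (assumed) negotiated contract."""
    if claim.service_code in CONTRACT_RATES:
        return CONTRACT_RATES[claim.service_code]
    return round(claim.billed * DEFAULT_PERCENT_OF_CHARGES, 2)


def underpaid(claims, tolerance=0.01):
    """Return (claim_id, shortfall) for claims paid below the contracted amount."""
    result = []
    for c in claims:
        shortfall = round(expected_payment(c) - c.paid, 2)
        if shortfall > tolerance:
            result.append((c.claim_id, shortfall))
    return result


claims = [
    Claim("C1", "OFFICE_VISIT", billed=200.00, paid=120.00),
    Claim("C2", "MRI", billed=1500.00, paid=800.00),
    Claim("C3", "XRAY", billed=100.00, paid=60.00),
]
print(underpaid(claims))
```

In this toy data set only the MRI claim is flagged, because it was paid below its contracted rate; the other two match their expected amounts.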

As HCOs continue to seek ways to reduce 
their expenses, they are introducing both more 
automation and more integration into “back 
office” systems. Enterprise Resource Planning 
(ERP) systems integrate human resources 
functions with payroll, accounts receivable 
and payment systems and overall supply man- 
agement. ERP systems can be effectively inte- 
grated with other components of the HCIS, 
e.g., nurse staffing systems and surgical sched- 
uling systems, in an attempt to use informa- 
tion to optimize operations and resource 
utilization. Implementing these systems can
bring challenges similar to those of clinical systems,
albeit with different types of employees. Long-term
employees often develop their own styles
of managing information based on historical 
patterns and preferences, much of which must 
change when an ERP system is implemented. 


16.4 Forces That Will Shape the Future of Health Care Information Systems Management


As we have discussed throughout this chap- 
ter, the changing landscape of the health-care 
industry and the strategic and operational 
requirements of HCOs and IDNs have accelerated
the acquisition and implementation
of HCISs. The acquisition and implementa- 
tion of Electronic Medical Records (EMRs) 
have been a particular focus, especially with 
the availability of federal stimulus fund- 
ing through the provisions of the Health 
Information Technology for Economic and 
Clinical Health (HITECH) Act under the 
American Recovery and Reinvestment Act 
of 2009 (ARRA). Although there are many 
obstacles to implementation and acceptance 
of smoothly functioning, fully integrated 
HCISs, few people today would debate the 
critical role that information technology plays 
in an HCO’s success or in an IDN’s efforts at 
clinical and operational management. 

We have emphasized the dynamic nature of 
today’s health care environment and the asso- 
ciated implications for HCISs. A host of new 
requirements loom that will challenge today’s 
available solutions. We anticipate additional 
expectations and requirements associated with 
the changing organizational landscape, techno- 
logical advances, and broader societal changes. 


16.4.1 Changing Organizational Landscape


Although the concepts underlying HCOs and 
IDNs are no longer new, the underlying organi- 
zational forms and business strategies of these 
complex organizations continue to evolve. 
The success of individual HCOs varies widely. 
Some, serving target patient populations such 
as those with heart disease or cancer or age- 
defined groups such as children, have been 
relatively more successful financially than those 
attempting to serve patients across a wide range 
of illnesses or those attempting to combine 
diverse missions of clinical care, teaching and 
research. IDNs, on the other hand, have by and 
large failed to achieve the operational improve- 
ments and cost reductions they were designed 
to deliver. It is possible that entirely new forms 
of HCOs and IDNs will emerge in the coming 
years. Key to understanding the magnitude of 
the information systems challenge for IDNs in 
particular is recognizing the extraordinary pace 
of change: IDNs reorganize, merge, uncouple,
acquire, sell off, and strategically align services 
and organizational units in a matter of weeks. 
While information technology is itself chang- 
ing with accelerating frequency, today’s state- 
of-the-art systems (computer systems and 
people processes) typically require months or 
years to build and refine. 

The continuing pressures of the marketplace,
including the downward trend in reimbursements
and the evolving efforts by the
Federal government to change the payment
structure from procedure-based payments
to value-based payments, have led to merger
and acquisition activity across the health care 
industry. Although the outright merger or 
acquisition of one HCO by another HCO is 
still relatively infrequent, HCOs and IDNs
have increasingly targeted physician prac- 
tices for acquisition in order both to control 
the flow of patients to inpatient beds and to 
increase the standardization of care proto- 
cols across physician groups. In addition, 
in situations in which an outright merger or 
acquisition seems likely to be unsuccessful, 
affiliations between HCOs are an increasingly 
common strategy to ensure better coordina- 
tion across historically competing boundaries. 

All too frequently, business deals are cut with 
insufficient regard to the cost and time required 
to create the supporting information infrastruc- 
ture. For IDNs even in the best of circumstances, 
the cultural and organizational challenges of 
linking diverse users and care-delivery settings 
will tax their ability to change their information 
systems environments quickly enough. These 
issues will increase in acuity as operational bud- 
gets continue to shrink—today’s HCOs and 
IDNs are spending significant portions of their 
capital budgets on information-systems invest- 
ments. In turn, these new investments translate 
into increased annual operating costs (costs of 
regular system upgrades, maintenance, user 
support, and staffing). Still, most health care
organizations devote at most 3-4% of their total 
revenues to their information systems operating 
budgets; in other information-intensive indus- 
tries (e.g., financial services, air transportation), 
the percentage of operating budgets devoted to 
information technology investment can be three 
to four times higher (Weil 2001). 


16.4.2 Changes Within the HCIS Organization


Information technology in the health care 
industry became a major focus for invest- 
ment starting in the late 1980s and early 
1990s, as developments in both hardware and 
software technology led both to increased 
affordability and enhanced functionality. 
We noted earlier the evolution from main- 
frames to minicomputers to networked PCs, 
as well as the transformation of application 
suites from enterprise-wide financial and bill- 
ing systems to more departmentally-focused 
information technology investment deci- 
sions in provider organizations. This evolu- 
tion led to an ever-more complex computing 
environment encompassing a diversity of 
hardware and software and networking capa- 
bilities to enable everything to work together. 
Organizational capabilities to manage these 
changes evolved as well, from a small group 
of technicians managing a mainframe environment
to a growing number not only of
technicians, but also of groups of dedicated
application support staff, networking engineers, and
specialists in desktop support. Information
Services Departments began to replace “Data
Processing Departments” as their reach and
importance to the organization grew. Leaders were
no longer department directors but “Chief 
Information Officers” (CIOs), charged with 
responsibilities as diverse as keeping the sys- 
tems running on a daily basis, overseeing the 
seemingly endless upgrades to both hardware 
and software, aligning information technol- 
ogy investments with business strategy, and 
keeping the personal data of patients secure 
from unauthorized access. As “CIOs” they 
were expected to lead the procurement of 
new hardware and software in conjunction 
with other “C-Suite” executives or physician 
leaders of the various departments seeking to 
introduce computers into their work environ- 
ments. Most often these were systems sup- 
porting administrative and financial processes 
or ancillary departments like the laboratory 
or the radiology or pharmacy departments. 
However, with the 1991 publication of the
Institute of Medicine’s The Computer-Based
Patient Record: An Essential Technology for
Health Care (Institute of Medicine 1991), an
entirely new era of information technology began
to emerge with the goal of capturing all clinical
activity electronically, giving rise to Electronic
Medical Records. With an executive order a little
more than a decade later from then-President
George W. Bush, which set the goal of creating
an electronic medical record within a
decade, and then the financial incentives in the
2009 HITECH legislation to induce hospitals
and physician offices to implement EMRs, the
complexity of the health care computing
environment increased dramatically. In addition,
with the new emphasis on clinical computing, 
physicians as the primary users of clinical sys- 
tems became much more invested not only in 
what functionality was being purchased, but 
who was leading the purchasing process. 
During the 1990s and into the 2000s, an
entire generation, including clinicians,
became more sophisticated users of computer
technology. As clinical systems came to be 
many organizations’ largest information 
technology investment, clinicians sought to 
exercise more leadership in system selection 
and implementation. The position of Chief 
Medical Information (or Informatics) Officer 
(CMIO) was created, at times reporting to 
the Chief Information Officer (CIO), but at 
other times reporting outside of the historical 
information technology chain of command. 
Additional recognition for computer-savvy 
clinicians came with the creation in 2011 
of a formal board certification in Clinical 
Informatics. In addition, with the increasing 
importance of clinical computing capabili- 
ties, provider organizations began looking to 
recognize formally the physician informat- 
ics leaders by appointing them to the CIO 
position. More traditional CIOs with back- 
grounds in business or computing became in 
some organizations relegated to operational 
roles—keeping the data center and the net- 
work running on the 24-hour basis that was 
essential in a health care environment. Some 
organizations created separate roles for 
Chief Technology Officers (CTOs) or Chief 
Innovation Officers that had little or no
operational responsibilities. Additionally, as HCOs
began to recognize the impact of social media
on their marketing efforts, the role of Chief
Digital Officer also started to emerge. 

These organizational changes can be 
expected to continue as responses to not 
only more complex clinical and computing 
environments, but also to the overwhelm- 
ing importance of clinical activities overall 
to the success of any provider organization. 
The roles and responsibilities of these various 
information CxO positions can be expected to 
continue to evolve as both the overall orga- 
nizational landscape and the capabilities of 
computer systems evolve as well. 


16.4.3 Technological Changes Affecting Health Care Organizations


Future changes in technology are hard to pre- 
dict. For example, although we have heard 
for over two decades that seamless voice-to- 
text systems are 5 years away from practi- 
cal use, with the introduction of controlled 
vocabularies in areas such as radiology and 
pathology, we are beginning to see commer- 
cial products that can “understand” dictated 
speech and represent it as text that can then 
be structured for further analysis. Both the 
emergence of increasingly powerful processor 
and memory chips, and the decreasing cost of 
storage media will continue to be a factor in 
future health-systems design—although the 
tsunami of data coming from imaging modal- 
ities and from genomic medicine sequencing 
and analysis will continue to be a significant 
challenge (see Chaps. 2, 22, and 26). The
ever-expanding availability of Internet access,
the increasing integration of voice, video, and 
data, and platforms which permit the inte- 
gration of these various technologies as well 
as the availability of ever smaller platforms 
like tablets and smart phones, will challenge 
HCOs and IDNs (and HCISs) to have
communications capacity not only within their
traditional domain but also to an extended 
enterprise that may include patients’ homes, 
schools, and workplaces. The design of modern
software based on the replicability of
code, languages and standards such as XML, C++,
C#, Python, and Java, and frameworks such
as Microsoft’s .NET should eventually yield
more flexible information technology systems. 

One of the most significant technologi- 
cal challenges facing HCOs and IDNs today 
occurs because, while much of the health care 
delivered today continues to be within the 
four walls of a physician’s office or a hospi- 
tal, new venues such as retail clinics are grow- 
ing in number and geographical distribution. 
Further, as the population ages, patients may 
seek care from a multitude of sources, includ- 
ing primary and specialty practices, multiple 
hospital visits (and even visits to multiple hos- 
pitals) and may increasingly be monitored in 
their homes. Health care information technol- 
ogies (and clinical systems in particular) have 
focused historically on what happens within a 
physician’s office or within a hospital, and not 
across physicians’ offices, between the
physician’s office and the hospital, or in the home
of the patient. As technology developments 
increasingly permit care to be provided outside 
physician offices and hospitals, HCISs will be 
challenged to incorporate if not the data, then 
at least the access to these new sources of data. 

In general, EMR products on the market 
today started with a single purpose: to auto- 
mate the workflow of clinicians within a par- 
ticular organizational setting. Among other 
features, EMRs focus on making data from 
previous encounters or activities easier to 
access and assuring that orders for tests and 
x-rays have the correct information, or that 
the next shift knows what went on previously. 
Despite visible successes and failures for all 
manner of products, EMRs in general can 
facilitate the automation of a complex work- 
flow—of automating intra-organizational 
clinical processes as well as those that cross 
organizational boundaries. 

Architectures that focus on what happens 
within organizational boundaries do not eas- 
ily facilitate access to data across organiza- 
tional boundaries. This is the challenge of 
interoperability. Recognizing that patients 
often receive care in a variety of organizational
settings (hospitals, physicians’ offices,
rehabilitation facilities, pharmacies, retail
clinics, their homes, etc.), the challenge is
to extend the internal workflow beyond the
boundaries of individual organizations so
that data is available across a continuum of
care. Interoperability then is not so much 
about what happens within an organization 
(although there can be challenges here as 
well), but what happens across organizational 
boundaries. 

An intra-organizational architecture 
focuses on facilitating real time communica- 
tions among providers, on optimizing the 
process of collecting data at the point of care, 
and on ensuring that clinical tasks are car- 
ried out in an appropriate sequence. An inter- 
organizational architecture needs to minimize 
the duplicate collection of data in different 
care settings, to facilitate quick searches of 
relevant data from a variety of (often external) 
sources, and to rank data in terms of relevance 
to a particular clinical question. Transitioning 
from intra- to inter-organizational data shar- 
ing is a significant technological challenge. 
While Health Information Exchanges (HIEs) 
and Health Record Banks (HRBs) are at the 
forefront of this transition (see Chap. 15),
over time we can expect that the architectures 
of clinical systems that currently focus on 
what happens within an organization will need 
to transition to facilitate what happens across 
organizations. 
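
The three inter-organizational requirements named above (avoiding duplicate data, searching across sources, and ranking by relevance) can be made concrete with a small sketch. The record layout (`source`, `code`, `value`, `observed`) and the scoring weights are assumptions chosen for illustration; they do not correspond to any HIE standard or product.

```python
from datetime import date


def merge_and_rank(records, query_terms, today):
    """Deduplicate observations from several sources and rank them for a clinical question.

    Each record is a dict with hypothetical keys: source, code, value, observed (a date).
    """
    # Treat records with the same code, value, and date as duplicates,
    # whichever organization reported them; keep the first one seen.
    unique = {}
    for r in records:
        key = (r["code"], r["value"], r["observed"])
        unique.setdefault(key, r)

    def score(r):
        matches = 1 if r["code"] in query_terms else 0
        age_days = (today - r["observed"]).days
        # Illustrative weighting only: a term match dominates, recency breaks ties.
        return matches * 1000 - age_days

    return sorted(unique.values(), key=score, reverse=True)


records = [
    {"source": "hospital", "code": "hba1c", "value": 7.9, "observed": date(2023, 1, 10)},
    {"source": "clinic", "code": "hba1c", "value": 7.9, "observed": date(2023, 1, 10)},
    {"source": "clinic", "code": "ldl", "value": 130, "observed": date(2023, 3, 1)},
    {"source": "pharmacy", "code": "hba1c", "value": 8.4, "observed": date(2022, 6, 1)},
]
ranked = merge_and_rank(records, {"hba1c"}, today=date(2023, 4, 1))
for r in ranked:
    print(r["source"], r["code"], r["value"], r["observed"])
```

Real record matching is much harder than this exact-key comparison suggests, since organizations often code the same observation differently and identify the same patient differently.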

Security and confidentiality concerns 
will likely increase as the emergence of a 
networked society profoundly changes our 
thinking about the nature of health care deliv- 
ery. It is no longer only physicians or their 
orders that generate clinical data; increas- 
ingly patients are generating their own data 
either through wearables or through devices 
that measure their health in their own homes. 
Health services are still primarily delivered 
locally—we seldom leave our local commu- 
nities to receive health care except under the 
most dire circumstances. In the future, provid- 
ers and even patients will have access to health 
care experts that are dispersed over state, 
national, and even international boundaries. 
Distributed health care capabilities will need 
to support the implementation of collabora- 
tive models that could include virtual house 
calls and routine remote monitoring via
telemedicine linkages (see Chap. 20).


16.4.4 Societal Change 


At the beginning of the twenty-first cen- 
tury, clinicians find themselves spending less 
time with each patient and more time with 
administrative and regulatory (and often
data entry) tasks. This decrease in
clinician-patient contact has contributed to declining
patient and provider satisfaction with elec- 
tronic care-delivery systems. At the same time, 
empowered health consumers interested in 
self-help and unconventional approaches have 
access to more health information than ever 
before. These factors are changing the inter- 
play among physicians, care teams, patients, 
and external (regulatory and financial) forces. 
The changing model of care, coupled with 
changing economic incentives to deliver mea- 
surable high-quality care at lower cost, places 
a greater focus on wellness and preventa- 
tive and lifelong care. Although we might 
agree that aligning economic incentives with 
wellness is a good thing, this alignment also 
implies a shift in responsibility from care giv- 
ers to patients. 

Like the health care environment, the 
technological context of our lives is also 
changing. The Internet has already dramati- 
cally changed our approaches to information 
access and system design. Concurrent with the 
development of new standards of informa- 
tion display and exchange is a push led by the 
entertainment industry (and others) to deliver 
broadband multimedia into our homes. Such 
connectivity has the potential to change care 
models more than any other factor we can 
imagine by bringing fast, interactive, and mul- 
timedia capabilities to the household level. 
Finally, vast amounts of information can 
now be stored efficiently remotely, e.g., in the 
cloud, and on movable media such as memory 
sticks, which brings more flexibility as well 
as more risk, as such devices are both more 
convenient and more susceptible to being lost 
or misplaced. With the increase in the avail- 
ability of consumer-oriented health informa- 
tion, including, for example, video segments 
that show the appearance and sounds of 
normal and abnormal conditions or demonstrate
common procedures for home care
and health maintenance, we can expect even
more changes in the traditional doctor/patient 
relationship. 

With societal factors such as the focus on 
the outcomes of care pushing our HCOs and 
IDNs to change, cost constraints continuing 
to loom large, and the likely availability of 
extensive computing and communications 
capacity in the home, in the workplace, and
in the schools, HCOs and health providers are 
increasingly challenged to rethink the basic 
operating assumptions about how to deliver 
care. The traditional approach has been facility 
and physician centric—patients usually come 
to the hospital or to the physician’s office at a 
time convenient for the hospital or the physi- 
cian. The HCO and IDN of the twenty-first 
century may have to be truly “patient centric”, 
operating within a health care delivery system 
without walls, where routine health manage- 
ment is conducted in nontraditional settings, 
such as homes and workplaces, with increasing
volumes and complexity of the data required 
to provide care. 


Suggested Readings

Christensen, C., Grossman, J., & Hwang, J.
(2009). The innovator’s prescription.
New York: McGraw-Hill. This book builds on
the author’s previous work on disruptive
innovation with specific applications to the health
care industry. Christensen uses terms such as
“precision medicine” to describe the advent of
more personalized approaches to medical
diagnosis and treatment and builds on his
analysis of disruptive business models in other
industries to analyze both the underlying
problems and challenges of our health care
delivery system.

Institute of Medicine. (1991). The computer-based
patient record: An essential technology for
health care. Washington, DC: The
National Academies Press.
https://doi.org/10.17226/5306. This book was one of the first
major attempts to argue for the use of computer
technology to improve patient outcomes.

Lee, T., & Mongan, J. (2009). Chaos and
organization in health care. Cambridge, MA: The
MIT Press. The authors describe the current
health care situation as one simply of “chaos”.
Among the solutions they propose are increasing
the use of electronic medical records and
information technology in general for sharing
knowledge.

Ong, K. (2011). Medical informatics: An
executive primer (2nd ed.). Chicago: Healthcare
Information and Management Systems
Society. An excellent overview of the challenges
facing information technology applications
in hospitals, physicians’ offices, and in
the homes of patients. Also includes a discussion
of recent federal legislation intended to
stimulate the use of electronic medical records
and the challenges of measuring how to determine
whether such investments are in fact
“meaningfully used”.

Porter, M., & Teisberg, E. (2006). Redefining
health care: Creating value-based competition
on results. Cambridge, MA: Harvard Business
School Press. The authors begin with a very
straightforward assumption, which is that
“the way to transform health care is to realign
competition with value for patients” (p. 4),
and proceed with an exhaustive discussion of
the historical failures at reforming the health
care system, the challenges inherent in
physician-provider organization relationships,
and how the only likely solution set to the current
high cost of health care is to focus our
efforts on what brings value to the patients.

Vogel, L. (2018). Who knew? Inside the complexity
of American health care. New York:
Taylor & Francis. Identifies the major factors
that combine to make health care the most
complex industry in the American economy.


Questions for Discussion


1. Briefly explain the differences among 
an HCO’s operational, planning,
communications, and documentary 
requirements for information. Give 
two examples in each category. 
Choose one of these categories and 
discuss similarities and differences 
in the environments of an integrated 
delivery network, a community-based 
ambulatory-care clinic, and a 
specialty-care physician’s office. 
Describe the implied differences in 
these units’ information requirements. 

2. Describe three situations in which the 
separation of clinical and administrative 
information could lead to inadequate
patient care, loss of revenue, or inappro- 
priate administrative decisions. Identify 
and discuss the challenges and limita- 
tions of two methods for improving data 
integration. 

3. Describe three situations in which lack 
of integration of information systems 
with clinicians’ workflow can lead to 
inadequate patient care, reduced physi- 
cian productivity, or poor patient satis- 
faction with an HCO’s services. Identify 
and discuss the challenges and limita- 
tions of two methods for improving pro- 
cess integration. 

4. Describe the trade-off between func- 
tionality and integration. Discuss three 
strategies currently used by HCOs to 
minimize this trade-off.

5. Assume that you are the chief information
officer of a multi-facility HCO. You
have just been charged with planning a 
new clinical HCIS to support a large 
tertiary care medical center, two smaller 
community hospitals, a nursing home, 
and a 40-physician group practice. Each 
organization currently operates its own 
set of integrated and standalone tech- 
nologies and applications. What techni- 
cal and organizational factors must you 
consider? What are the three largest 
challenges you will face over the next 
24 months? 

6. How do you think the implementation 
of clinical HCISs will affect the quality 
of relationships between patients and 
providers? Discuss at least three 
potential positive and three potential 
negative effects. What steps would you 
take to maximize the positive value of 
these systems? 

Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• What is patient-centered care? How
  does it differ from traditional,
  clinician-centric care?
• What are the information management
  challenges in patient-centered care?
• What are the roles of electronic health
  records and other informatics applications
  in supporting patient-centered care?
• What forces and developments have led
  to the emergence of patient-centered
  care systems?
• What collaborative processes are
  required to design patient-centered care
  systems and the electronic health
  records to support such care?
• How is current informatics research
  advancing progress toward collaborative,
  interdisciplinary, patient-centered care?


17.1 Information Management in Patient-Centered Care


Patient care is the focus of many clinical dis- 
ciplines—medicine, nursing, pharmacy, nutri- 
tion, therapies such as respiratory, physical, 
and occupational, and others. Although the 
work of the various disciplines sometimes 
overlaps, each has its own primary focus, 
emphasis, and methods of care delivery. Each
discipline’s work is complex in itself, and col- 
laboration among disciplines, an essential 
component of patient-centered care, adds 
another level of complexity. In all disciplines, 
the quality of clinical decisions depends in 
part on the quality of information available 
to the decision-maker. The systems that man- 
age information for patient-centered care are 
therefore critical tools. Their fitness for the 
job varies, and the systems enhance or detract 
from patient-centered care accordingly. This 
chapter describes information management 
issues in patient-centered care, the emergence 
of patient-centered care systems in relation 
to these issues, the interdisciplinary
collaboration required to develop patient-centered
care systems, and current research. In so 
doing, it will demonstrate the necessity of a 
patient-centered perspective in the design of 
electronic health records (EHRs) and other 
patient-care systems. 

As described later in this chapter, reports of 
the National Academies, Federal Government 
mandates, and a variety of social forces have 
called for transformation in the organization, 
delivery, financing, and quality of health care. 
The demand is for evidence-based, cost-effec- 
tive, patient-centered care. Informatics is seen 
as essential to the provision, monitoring, and 
improvement of such care. 


17.1.1 From Patient Care to Patient-Centered Care


Patient-centered care is a collaborative, 
interdisciplinary process focused on the care 
recipient in the context of the family, signifi- 
cant others, and community. A distinguishing 
feature of patient-centered care is the patient’s 
active collaboration in shared decision- 
making, as contrasted to traditional clinician- 
centered care where the clinician holds the 
preponderance of power and authority. 
Patient-centered care empowers patients to 
actively participate in care by presenting treat- 
ment options that are consistent with patient 
values and preferences and in a format or 
context that is understandable and action- 
able (Krist and Woolf 2011; Payton et al. 
2011). Typically, patient care includes the ser- 
vices of physicians, nurses, and members of 
other health disciplines according to patient 
needs: physical, occupational, and respiratory 
therapists; nutritionists; psychologists; social 
workers; and many others. Each of these dis- 
ciplines brings specialized perspectives and 
expertise. Specific cognitive processes and 
therapeutic techniques vary by discipline, but 
all disciplines share certain commonalities in 
the provision of care. 

In its simplest terms, the process of 
patient-centered care begins with collecting 
data and assessing the patient’s current status 
and expressed concerns in comparison to criteria
or expectations of normality. Through
cognitive processes specific to the discipline, 
diagnostic labels are applied, therapeutic goals 
are identified with timelines for evaluation, 
and therapeutic interventions are selected and 
implemented. The patient participates, as he 
or she is able, in determining therapeutic goals 
and selecting personally acceptable interven- 
tions from the options and their potential 
consequences as described by the clinician. At 
specified intervals, the patient is reassessed, the 
effectiveness of care is evaluated, and thera- 
peutic goals and interventions are continued or 
adjusted as needed. If the reassessment shows 
that the patient no longer needs care, services 
are terminated. This process was illustrated for 
nursing in 1975 (Goodwin & Edwards 1975) 
and was updated and made more general in 
1984 (Ozbolt et al. 1984). The flowchart reproduced
in Fig. 17.1 could apply equally well
to other patient-care disciplines. 
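
The cycle just described (assess, set goals, intervene, reassess, and terminate services when care is no longer needed) can be sketched as a simple loop. This is only an illustrative sketch, not a clinical system: the names `CareEpisode`, `run_care_process`, and the caller-supplied `assess` function are hypothetical, and the loop is deliberately linear.

```python
from dataclasses import dataclass, field


@dataclass
class CareEpisode:
    """Hypothetical record of one patient's passes through the care process."""
    patient_id: str
    problems: list = field(default_factory=list)
    log: list = field(default_factory=list)


def run_care_process(episode, assess, max_cycles=10):
    """Repeat assess -> plan -> intervene -> reassess until no problems remain.

    `assess(patient_id, cycle)` is supplied by the caller and returns the
    list of problems (diagnostic labels) still present at that cycle.
    """
    for cycle in range(max_cycles):
        # Collect and analyze data; apply diagnostic labels.
        episode.problems = assess(episode.patient_id, cycle)
        if not episode.problems:
            # Reassessment shows the patient no longer needs care.
            episode.log.append(f"cycle {cycle}: services terminated")
            return episode
        for problem in episode.problems:
            # Identify a goal and select an intervention for each problem.
            episode.log.append(
                f"cycle {cycle}: goal set and intervention selected for {problem}"
            )
    episode.log.append("review limit reached; care continues")
    return episode
```

A caller would pass an `assess` function that reflects each reassessment; the log then records when goals were set and when services ended.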

Although this linear flowchart helps to 
explain some aspects of the process of care, it 
is, like the solar-system model of the atom, a 
gross simplification. Frequently, for example, 
in the process of collecting data for an initial 
patient assessment, the nurse may recognize 
(diagnose) that the patient is anxious about 
her health condition. Simultaneously with 
continuing the data collection, the nurse sets 
a therapeutic goal that the patient’s anxiety 
will be reduced to a level that increases the 
patient’s comfort and ability to participate in 
care. The nurse selects and implements thera- 
peutic actions of modulating the tone of voice, 
limiting environmental stimuli, maintaining 
eye contact, using gentle touch, asking about 
the patient’s concerns, and providing infor- 
mation. All the while, the nurse observes the 
effects on the patient’s anxiety and adjusts his 
behavior accordingly. Thus, the complete care 
process can occur in a microcosm while one 
step of the care process—data collection—is 
underway. This simultaneous, nonlinear qual- 
ity of patient care poses challenges to infor- 
matics in the support of patient care and the 
capture of clinical data. 

Each caregiver’s simultaneous atten- 
tion to multiple aspects of the patient is not 
the only complicating factor. Just as atoms 
become molecules by sharing electrons, the
care provided by each discipline becomes part 
of a complex molecule of interdisciplinary, 
patient-centered care. Caregivers and devel- 
opers of informatics applications to support 
care must recognize that true patient-centered 
care is as different from the separate contribu- 
tions of the various disciplines as an organic 
molecule is from the elements that go into it. 
The contributions of the various disciplines 
are not merely additive; as a therapeutic force 
acting upon and with the patient, the work of 
each discipline is transformed by its interac- 
tion with the patient and the other disciplines 
in the larger unity of patient-centered care. 


17.1.2 Patient-Centered Care 
in Action 


A 75-year-old woman with osteoarthritis, 
high blood pressure, and urinary incontinence 
is receiving care from a physician, a home-care 
nurse, a nutritionist, a physical therapist, and 
an occupational therapist. From a clinician- 
centered, additive perspective, each discipline 
could be said to perform the following func- 
tions: 

1. Physician: diagnose diseases, prescribe 
appropriate medications, authorize other 
care services 

2. Nurse: assess patient’s understanding of 
her condition and treatment and her self- 
care abilities and practices; assess patient’s 
concerns, values, and preferences regard- 
ing the management of her health; teach 
and counsel as needed; help patient to per- 
form exercises at home; identify and help 
patient overcome barriers to self-care and 
participation in her recovery plan; report 
findings to physician and other members 
of care team 

3. Nutritionist: assess patient’s nutritional 
status and eating patterns; prescribe and 
teach appropriate diet to control blood 
pressure and build physical strength 

4. Physical therapist: prescribe and teach 
appropriate exercises to improve strength 
and flexibility and to enhance cardiovascu- 
lar health, within limitations of arthritis 

5. Occupational therapist: assess abilities and
limitations for performing activities of
daily living; prescribe exercises to improve
strength and flexibility of hands and arms;
teach adaptive techniques and provide
assistive devices as needed

Fig. 17.1 The provision of nursing care is an
iterative process that consists of steps to collect
and analyze data, to plan and implement
interventions, and to evaluate the results of
interventions (Source: Adapted with permission
from Ozbolt et al. (1985). © Springer Nature)


In a collaborative, interdisciplinary, patient- 
centered practice, the nurse discovers that the 
patient is not taking walks each day as pre- 
scribed because her urinary incontinence is 
exacerbated by the diuretic prescribed to treat 
hypertension, and the patient is embarrassed
to go out. The nurse reports this to the physician
and the other clinicians so that they can
understand why the patient is not carrying 
out the prescribed regime. The physician then 
changes the strategy for treating hypertension 
while initiating treatment for urinary inconti- 
nence. The nurse helps the patient to under- 
stand the interaction of the various treatment 
regimes, provides practical advice and assis- 
tance in dealing with incontinence, and helps 
the patient to find personally acceptable ways 
to follow the prescribed treatments. The nutri- 
tionist works with the patient on the timing 
of meals and fluid intake so that the patient
can exercise and sleep with less risk of urinary 
incontinence. The physical and occupational 
therapists adjust their recommendations to 
accommodate the patient’s personal needs 
and preferences while moving toward the ther- 
apeutic goals. Finally, the patient, rather than 
being assailed with the sometimes conflicting 
demands of multiple clinicians, is supported 
by an ensemble of services that meets shared 
therapeutic goals in ways consistent with her 
preferences and values. 

This kind of patient-centered collabora- 
tion requires exquisite communication and 
feedback. The potential for information sys- 
tems to support or sabotage patient-centered 
care is obvious. 


17.1.3 Coordination 
of Patient-Centered Care 


When patients receive services from mul- 
tiple clinicians, patient-centeredness requires 
coordinating those services. Coordination 
includes seeing that patients receive all the 
services they need in logical sequence with- 
out scheduling conflicts and ensuring that 
each clinician communicates as needed with 
the others. Sometimes, a case manager or care 
coordinator is designated to do this coordi- 
nation. In other situations, a physician or a 
nurse assumes the role by default. Sometimes, 
coordination is left to chance, and both the 
processes and the outcomes of care are put 
at risk. In recognition of this, the Institute of 
Medicine designated coordination of care as 
1 of 14 priorities for national action to trans- 
form health care quality (Adams & Corrigan 
2003). The Health Information Technology for
Economic and Clinical Health Act (HITECH
Programs 2009¹) calls for patients to have a
medical home, a primary care practice that
will maintain a comprehensive problem list
to make fully informed decisions in coordinating
their care. In addition, Accountable
Care Organizations seek to integrate providers
and services to generate value for defined
populations. Substantial federal investment in
health information technology (HIT) through
the HITECH and 21st Century Cures Acts
dramatically increased the adoption and use
of EHRs (Adler-Milstein & Jha 2017). While
HIT spending increased markedly, care coordination
technologies have not been a focus.
HIT tools dedicated to coordination processes
could improve care of complex patients
through increased access to data, facilitated
communication, timely shared decisions,
and greater engagement of patients and their
families as partners in their care plans. Well-designed
information systems with patient-facing
technologies (e.g., personal health
records and patient portals) enable care coordination
as they ensure that patients and
providers have immediate access to accurate
health information at home and across care
settings (Ahern et al. 2011; Collins, Bavuso
et al. 2017; Collins, Klinkenberg-Ramirez
et al. 2017; Collins, Rozenblum et al. 2017).

¹ http://healthit.hhs.gov/portal/server.pt/community/healthit_hhs_gov__hitech_programs/1487
(Accessed: 4/26/13).


17.1.4 Patient-Centered Care 
Across Multiple Patients 


Delivering and managing interdisciplinary 
patient-centered care for an individual is 
challenging enough, but patient care has yet 
another level of complexity. Each clinician is 
responsible for the care of multiple patients. In 
planning and executing the work of patient- 
centered care, each professional must consider 
the competing demands of all the patients for 
whom she is responsible, as well as the exigen- 
cies of all the other professionals involved in 
each patient’s care. Thus, the nurse on a post- 
operative unit must plan for scheduled treat- 
ments for each of her patients to occur near 
the optimal time for that patient. She must 
take into account that several patients may 
require treatments at nearly the same time and 
that some of them may be receiving other ser- 
vices, such as imaging or physician’s visits, at 
the time when it might be most convenient for 
the nurse to administer the treatment. When 
unexpected needs arise, as they often do (an
emergency, an unscheduled patient, observations
that could signal an incipient complication),
the nurse must set priorities, organize,
and delegate to be sure that at least the critical 
needs are met. Similarly, the physician must 
balance the needs of various patients who 
may be widely dispersed throughout an insti- 
tution. Decision-support systems have the 
potential to provide important assistance for 
both the care of individual patients and the 
organization of the clinician’s workload. 
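
As a rough illustration of how a system might help organize one clinician's workload across several patients, the sketch below orders tasks so that critical needs come first and the rest follow by due time, and it flags tasks that fall while the patient is away from the unit (for example, at imaging). The `Task` fields, the `away` mapping, and the ordering rule are all assumptions made for this example; a real decision-support system would weigh many more factors.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical unit of nursing work for one patient."""
    patient: str
    description: str
    due_minute: int          # minutes from the start of the shift
    critical: bool = False   # True for needs that cannot wait


def build_worklist(tasks, away):
    """Return (task, conflict) pairs ordered for one clinician.

    `away` maps a patient to a (start, end) interval, in minutes, during
    which the patient is off the unit. `conflict` is True when the task
    falls due inside that interval.
    """
    # Critical tasks sort first (False < True), then earliest due time.
    ordered = sorted(tasks, key=lambda t: (not t.critical, t.due_minute))
    worklist = []
    for task in ordered:
        interval = away.get(task.patient)
        conflict = (
            interval is not None
            and interval[0] <= task.due_minute < interval[1]
        )
        worklist.append((task, conflict))
    return worklist
```

A flagged conflict is only a prompt for the nurse to reschedule or delegate; the judgment about what to do remains with the clinician.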


17.1.5 Integrating Indirect-Care
Activities


Finally, clinicians not only deliver services 
to patients, with all the planning, document- 
ing, collaborating, referring, and consult- 
ing attendant on direct care; they are also 
responsible for indirect-care activities, such 
as teaching and supervising students, attend- 
ing staff meetings, participating in continuing 
education, and serving on committees. Each 
clinician’s plan of work must allow for both 
the direct-care and the indirect-care activities. 
Because the clinicians work in concert, these 
plans must be coordinated. 

In summary, patient care is an extremely 
complex undertaking with multiple levels. To 
achieve patient-centered care, each clinician’s 
contributions to the care of every patient 
must take into account not only that patient’s 
values, preferences, and concerns, but also the 
ensemble of contributions of all clinicians 
involved in the patient’s care and the inter- 
actions among them, and this entire suite of 
care must be coordinated to optimize effec- 
tiveness and efficiency. These very complex 
considerations are multiplied by the number 
of patients for whom each clinician is respon- 
sible. Patient care is further complicated by 
the indirect-care activities that caregivers must 
intersperse among the direct-care responsi- 
bilities and coordinate with other caregivers. 
The resulting cognitive workload frequently 
overwhelms human capacity. Systems that 
effectively assist clinicians to manage, process, 
and communicate the data, information, and
knowledge essential to patient-centered care
are critical to the quality and safety of that 
care. 


17.1.6 Information to Support
Patient-Centered Care


As complex as patient care is, the essential
information for direct, patient-centered care is
defined in the answers to the following questions:

• What are the patient’s needs, concerns,
  preferences, and values?

• Who is involved in the care of the patient?

• What information does each clinician
  require to make decisions in his or her
  professional domain?

• From where, when, and in what form does
  the information come?

• What information does each clinician generate?
  Where, when, and in what form is it
  needed?


The framework described by Zielstorff, 
Hudgings, and Grobe (1993) provides a use- 
ful heuristic for understanding the varied 
types of information required to answer each 
of these questions. As listed in Table 17.1,
this framework delineates three information 
categories: (1) patient-specific data about a 
particular patient acquired from a variety 
of data sources; (2) agency-specific data rel- 
evant to the specific organization under whose 
auspices the health care is provided; and (3) 
domain information and knowledge specific 
to the health care disciplines. 

The framework further identifies four types 
of information processes that information 
systems may apply to each of the three infor- 
mation categories. Data acquisition entails the 
methods by which data become available to 
the information system. It may include data 
entry by the care provider, patient, or family, 
or acquisition from a medical device or from 
another computer-based system. Data storage 
includes the methods, programs, and struc- 
tures used to organize data for subsequent 
use. Standardized coding and classification
systems useful in representing patient-centered
care concepts are discussed in greater
detail in Chap. 7. Data transformation (or
data processing) comprises the methods by
which stored data or information are acted
on according to the needs of the end-user, for
example, calculation of a pressure ulcer
risk-assessment score at admission or calculation
of critically ill patients’ acute physiology
and chronic health evaluation (APACHE III)
scores. Figure 17.2 illustrates the transformation
(abstraction, summarization, aggregation)
of patient-specific data for multiple uses.
Presentation encompasses the forms in which
information is delivered to the end-user after
processing.

Table 17.1 Framework for design characteristics of a patient-care information system with examples of
patient-specific data, agency-specific data, and domain information and knowledge for patient care

System processes (Acquiring, Storing, Transforming, Presenting) by type of data

Type of data: Domain-specific
  Acquiring: Downloading relevant scientific or clinical literature or practice guidelines
  Storing: Maintaining information in electronic journals or files, searchable by key words
  Transforming: Linking related literature or published findings; updating guidelines based on research
  Presenting: Displaying relevant literature or guidelines in response to queries

Type of data: Agency-specific
  Acquiring: Scanning, downloading, or keying in agency policies and procedures; keying in
    personnel, financial, and administrative records
  Storing: Maintaining information in electronic directories, files, and databases
  Transforming: Editing and updating information; linking related information in response to
    queries; analyzing information
  Presenting: Displaying on request continuously current policies and procedures; sharing relevant
    policies and procedures in response to queries; generating management reports

Type of data: Patient-specific
  Acquiring: Point-of-care entry of data about patient assessment, diagnoses, treatments planned
    and delivered, therapeutic goals, and patient outcomes
  Storing: Moving patient data into a current electronic record or an aggregate data repository
  Transforming: Combining relevant data on a single patient into a cue for action in a
    decision-support system; performing statistical analyses on data from many patients
  Presenting: Displaying reminders, alerts, probable diagnoses, or suggested treatments; displaying
    vital signs graphically; displaying statistical results

Source: Framework adapted with permission from Next Generation Nursing Information Systems, 1993,
American Nurses Association, Washington, DC. Reused with permission.
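
To make the idea of data transformation concrete, the following sketch aggregates atomic-level patient records into a per-diagnosis summary (number of distinct patients and mean cost of care). The record fields `patient_id`, `diagnosis`, and `cost` and the function name `summarize_by_diagnosis` are hypothetical, chosen only for illustration; they do not come from any particular EHR.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_diagnosis(records):
    """Aggregate atomic-level records into one summary row per diagnosis.

    Each record is a dict with the hypothetical keys
    'patient_id', 'diagnosis', and 'cost'.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record["diagnosis"]].append(record)

    summary = {}
    for diagnosis, rows in groups.items():
        summary[diagnosis] = {
            # Count distinct patients, not rows, in case of repeat encounters.
            "patients": len({row["patient_id"] for row in rows}),
            "mean_cost": mean([row["cost"] for row in rows]),
        }
    return summary
```

The same atomic records could feed other summaries (by agency, by region) without being collected again, which is the point of collecting data once and using them many times.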

Transformed patient-specific data can be 
presented in a variety of ways. Numeric data 
may be best presented in chart or graph form 
to allow the user to examine trends, whereas
the compilation of potential diagnoses generated
from patient-assessment data is better
presented in an alphanumeric list. Different
types of agency-specific data lend themselves 
to a variety of presentation formats. Common 
among all, however, is the need for presenta- 
tion at the point of patient care. For example, 
the integration of up-to-the-minute patient- 
specific data with agency-specific guidelines 
or parameters can produce alerts, reminders, 
or other types of notifications for immediate
action. See Chap. 21, on patient-monitoring
systems, for an overview of this topic.
Presentation of domain information and 
knowledge related to patient care is most fre- 
quently accomplished through interaction 
with databases and knowledge bases, such as 
Medline or Micromedex. Commercial appli- 
cations such as UpToDate™ are popular 
among clinicians because they provide easy
access to knowledge resources at the point of
care. The Infobutton, developed at
NewYork-Presbyterian Hospital, is an HL7 standard
for context-aware knowledge retrieval (Del
Fiol et al. 2012). Incorporated into EHRs,
Infobuttons, along with associated management
tools, can integrate data about the
patient and the clinical context to provide
immediate, point-of-care access to relevant
knowledge resources (Cimino et al. 2013).

Fig. 17.2 Examples of uses for atomic-level patient
data collected once but used many times (Source:
Reprinted with permission from Zielstorff et al. (1993).
© 1993 American Nurses Publishing, American Nurses
Foundation/American Nurses Association, Washington,
DC. Reproduced with permission.)

[Figure content: levels of data/information, each
abstracted, summarized, and aggregated from the level
below it. From top to bottom:
1. General health status and health-related needs of
   individual nations.
2. Trends in incidence, prevalence, outcomes, and costs
   by region, by diagnosis, by type of agency.
3. Comparisons of treatments, outcomes, and costs by
   locality and by agency. Incidence and prevalence of
   diagnosis by region.
4. Number of patients admitted with specific diagnosis.
   Outcomes for patients grouped by diagnosis. Costs of
   care by category of patient. Volume of tests,
   procedures, and interventions.
5. “Atomic-level” patient-specific data: e.g., assessments,
   diagnoses, interventions, diagnostic test results,
   procedures, treatments, hours of care, outcomes.
   Used to provide most appropriate care.]

To support patient-centered care, information
systems must be geared to the needs of
all the clinicians involved in care. The systems
should acquire, store, process, and present
each type of information (patient-, agency-,
and domain-specific) where, when, and how
the information is needed by each clinician in
the context of his or her professional domain.
Systems designed for patient-centered care
have the potential to go beyond supporting
the collaborative, interdisciplinary care of
individual patients. Through appropriate use
of patient-specific information (care requirements),
agency-specific information (clinicians
and their responsibilities and agency
policies and procedures), and domain information
(guidelines), such systems can greatly
aid the coordination of interdisciplinary ser- 
vices for individual patients and the planning 
and scheduling of each caregiver’s work activ- 
ities. Patient acuity is taken into account in 
scheduling nursing personnel, but historically 
has most often been entered into a separate 
system rather than derived directly from care 
requirements as recorded in the EHR. Fully 
integrated, patient-centered systems—still an 
ideal today—would enhance our understand- 
ing of each patient’s situation, needs, and 
values, improve decision-making, facilitate 
communications, aid coordination, and use 
clinical data to provide feedback for improv- 
ing clinical processes. 
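
One small piece of such integration, combining current patient-specific observations with agency-specific parameters to produce alerts, might look like the sketch below. The function name `check_alerts`, the observation names, and the limit values used in the usage note are invented for illustration and are not clinical guidance.

```python
def check_alerts(observations, limits):
    """Compare patient-specific observations with agency-specific limits.

    `observations` maps an observation name to its latest value.
    `limits` maps an observation name to a (low, high) acceptable range
    set by the agency. Observations without a configured range are skipped.
    """
    alerts = []
    for name, value in observations.items():
        if name not in limits:
            continue
        low, high = limits[name]
        if value < low:
            alerts.append(f"{name} {value} below agency limit {low}")
        elif value > high:
            alerts.append(f"{name} {value} above agency limit {high}")
    return alerts
```

For instance, with hypothetical limits of `{"systolic_bp": (90, 180)}` and an observed `systolic_bp` of 84, the function returns a single low-value alert that could be presented at the point of care.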

Clearly, when other clinical information 
systems designed to support patient-centered 
care fulfill their potential, they will not merely 
replace oral and paper-based methods of 
recording and communicating. They will be 
an integral and essential part of the transfor- 
mation of health care to apply evidence-based 
interventions in accordance with patient needs 
and values. How far have we come toward the
ideal? What must we do to continue our prog- 
ress? 


17.2 The Emergence of Patient- 
Centered Care Systems 


Events in the first decade of the Twenty-first 
Century planted the seeds of transformative 
change in patient care and clinical informat- 
ics. Over 10 years, the shared ideal of health 
care began to move from the Twentieth 
Century “doctor knows best” model toward a 
new vision of health care based on interdisci- 
plinary teams drawing on a variety of knowl- 
edge and information resources to collaborate 
with one another and with patients and fami- 
lies to resolve or alleviate health problems and 
to achieve health goals. Recursive and itera- 
tive developments grew from reports of the 
National Research Council and the Institute 
of Medicine (now known as the National 
Academy of Medicine); from government pol- 
icies and initiatives; from changes in organiza- 
tional and financial structures for health care 
delivery; and from advances in the informatics 
methods and technologies that have become 
integral to the provision, management, reim- 
bursement, and improvement of health care. 
Much remains to be done to nurture continu- 
ing development, but with care and patience 
we can begin to harvest the benefits of better 
health care and better health for individuals 
and populations. 


17.2.1 Publications of the National
Academy of Sciences


With its seminal publication, To Err is Human: 
Building a Safer Health System (Kohn et al. 
2000), the Institute of Medicine startled the 
world by estimating that clinical errors were 
killing up to 98,000 hospitalized Americans 
each year. The report called for a national 
focus to advance knowledge about safety, 
reporting efforts to identify and learn from 
errors, higher standards and expectations for 
safety, and implementation of safe practices
and systems within health care organizations. 

The follow-on report, Crossing the Quality 
Chasm: A New Health System for the 21st 
Century (Committee on Quality of Health 
Care in America 2001), addressed the need 
for fundamental change in the health care 
delivery system. Shortcomings included inad- 
equate focus on quality and clinical infrastruc- 
tures not sufficiently developed to provide the 
full range of services needed by persons with 
chronic conditions. Significantly, the report 
placed the blame not on individual health care 
professionals, but on inadequate and broken 
systems of care. 

Crossing the Quality Chasm outlined a call 
for action by government, payers, providers, 
and the public to embrace a statement of pur- 
pose for the health care system as a whole— 
to reduce illness and improve health and 
functioning—and to adopt a shared agenda 
to achieve health care that would be safe, 
effective, patient-centered, timely, efficient, 
and equitable. The report recommended the 
redesign of care processes to achieve conti- 
nuity in care relationships; customization in 
accordance with patient needs and values; the 
sharing of knowledge, information, and deci- 
sion-making with patients; evidence-based 
decision-making; safety as a system prop- 
erty; transparency of information to facilitate 
informed decision-making by patients and 
families; anticipation of patient needs; con- 
tinuous decrease in waste; and cooperation 
among clinicians. The report gave consid- 
erable attention to informatics as an essen- 
tial methodology to achieve these aims and 
called for a renewed national commitment to 
a national health information infrastructure, 
with the elimination of most hand-written 
clinical information by 2010. 

In Patient Safety: Achieving a New Standard 
for Care (Committee on Data Standards for 
Patient Safety 2004), the Committee high- 
lighted the fact that a national health infor- 
mation infrastructure—a foundation of 
systems, technology, applications, standards, 
and policies—is required for error preven- 
tion and capture of data that facilitate local 
and global learning from adverse events, near
misses, and hazards. The report emphasized 
the need for data interchange standards as an 
essential building block. 

The National Academies followed these 
three reports with a number of others that 
explored in greater depth aspects of the prob- 
lems and recommendations described within 
them and made further recommendations 
for public and private actions to improve 
health care and its costs and outcomes. The 
National Research Council (Stead & Lin 
2009) published Computational Technology 
for Effective Health Care: Immediate Steps 
and Strategic Directions. This report noted 
that many health information technologies in 
the current marketplace lacked the function- 
ality to achieve the goals of improving health 
care. The central finding was that computer 
scientists, experts in health and biomedical 
informatics, and clinicians would need to col- 
laborate to create technologies that would pro- 
vide cognitive support to clinicians, patients, 
and family members as they sought to under- 
stand, resolve, or alleviate health challenges. 
The report recommended that federal and 
state governments and clinicians join forces to 
require vendors to provide systems that offer 
such “meaningful” support. 

In its Learning Health System series, the 
National Academy of Medicine’s Leadership 
Consortium for a Value & Science-Driven 
Health System synthesized the insights of a 
broad array of experts and explored the need 
for transformational change in the funda- 
mental elements of health and health care. 
Multiple volumes reflect the fundamental role 
of informatics in a learning health system: 
Clinical Data as the Basic Staple of Health 
Learning: Creating and Protecting a Public 
Good (Institute of Medicine 2010), Digital 
Infrastructure for the Learning Health System: 
The Foundation for Continuous Improvement 
in Health and Health Care — Workshop Series 
Summary (Institute of Medicine 2011), 
Digital Data Improvement Priorities for 
Continuous Learning in Health and Health 
Care — Workshop Summary (Institute of 
Medicine 2013), and Optimizing Strategies 
for Clinical Decision Support: Summary of a
Meeting Series (Tcheng et al. 2017). Several
priorities for collaborative action from the 
last are of particular relevance to enable scal- 
able patient-centered care systems: create a 
national Clinical Decision Support (CDS) 
repository, develop tools to assess CDS effi- 
cacy, publish performance evaluations, pro- 
mote financing and measurement to accelerate 
CDS adoption, develop a multi-stakeholder 
CDS learning community to inform usability, 
and establish an investment program for CDS 
research. 


17.2.2 Federal Government 
Initiatives 


The Health Information Technology for
Economic and Clinical Health (HITECH)
Act provided an unprecedented federal
investment in HIT through a series of initiatives
aimed at ensuring that all Americans
benefit from EHR-supported patient-centered
care. Administered by the Office of the
National Coordinator for Health Information
Technology, the activities are designed:

• To support the health care workforce
  through Regional Extension Centers for
  technical assistance for implementation of
  EHRs and training initiatives to ensure
  meaningful use of EHRs

• To enable coordination and alignment
  within and among states (State Health
  Information Exchange Cooperative
  Agreement Program)

• To establish connectivity to the public
  health community in case of emergencies
  (Beacon Community Program)


In addition, two federal rules support mean- 
ingful use of EHRs. The Incentive Programs 
for EHRs rule from the Centers for Medicare 
& Medicaid Services (CMS) defines mini- 
mum requirements that hospitals and eligible 
professionals must meet through their use 
of certified technology to qualify for incen- 
tive payments. Criteria related to providing 
patients with an electronic copy of their own 
health information and ability to electronically 
exchange key clinical information are particularly
important to patient-centered care. The
complementary Standards and Certification 
Criteria for Electronic Health Records rule 
defines the criteria for certification of the tech- 
nology. Also relevant to patient-centered care 
are NHIN Direct and NHIN CONNECT, 
which support health information exchange 
to enable patient-centered care. 

The Agency for Healthcare Research 
and Quality (AHRQ) has also invested in 
advancing patient-centered care through 
investments in HIT. A particular focus is 
promoting systems engineering approaches 
to improve patient safety through evaluation 
of clinical processes resulting in improved 
work and information flows. This is reflected 
in the AHRQ Patient Safety Learning Labs 
grant portfolio. For example, the Brigham 
and Women’s Hospital project, Making 
Acute Care More Patient-centered, focused 
on developing tools and care processes to 
engage patient, family, and professional 
care team members in reliable identification, 
assessment, and reduction of patient safety 
threats in real time before they manifest into 
actual harm. Widespread adoption of EHR 
systems and web-based technologies makes it 
possible to reuse EHR data to create visual 
displays at the bedside to provide decision 
support to the care team, patients, and family.
Figure 17.3 is an example of a bedside
screensaver display used at Brigham and 
Women’s Hospital. The icons are driven by 
patient-specific information in the EHR. The 
display provides information to patients 
about their safety plan and provides decision 
support at the bedside related to patient-spe- 
cific needs to professional and paraprofes- 
sional care team members. 
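
The idea of display icons driven by patient-specific EHR data can be illustrated with a minimal sketch. The flag names and icon labels below are hypothetical placeholders chosen for illustration; they are not taken from the Brigham and Women's Hospital display.

```python
# Hypothetical mapping from EHR-derived flags to bedside display icons.
# Flag names and labels are illustrative assumptions only.
ICON_RULES = {
    "fall_risk": "Fall precautions",
    "npo": "Nothing by mouth",
    "needs_interpreter": "Interpreter needed",
}


def bedside_icons(ehr_flags):
    """Return the icon labels for the flags that are set for this patient."""
    return [label for flag, label in ICON_RULES.items() if ehr_flags.get(flag)]
```

For example, `bedside_icons({"fall_risk": True, "npo": False})` yields only the fall-precautions label, so the display changes as the underlying record changes.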

Given these major investments in pro- 
moting EHR adoption and use for patient- 
centered care and research, the vision of every 
American reaping the benefits of EHRs is 
moving closer to reality. However, this will 
continue to be heavily influenced by associ- 
ated changes in health care financial and orga- 
nizational structures. 

17.2.3 Financial and Organizational Structures in Health Care


The historical evolution of information sys- 
tems that support patient care, and even- 
tually patient-centered care, is not solely a 
reflection of the available technologies (e.g. 
Web 2.0, cloud computing). Societal forces— 
including delivery-system structure, practice 
model, payer model, and quality focus—have 
influenced the design and implementation of 
patient-care systems (Table 17.2).


= Delivery-System Structure 

Authors have noted the significant influence 
of the organization and its people on the 
success or failure of informatics innovations 
(Massaro 1993; Campbell et al. 2006; Ash 
et al. 2007). Others have documented unin- 
tended consequences of implementation of 
HIT and called for applications of models 
of processes, such as Iterative Sociotechnical 
Analysis, that take into account health care 
organizations’ workflows, social interactions, 
culture, etc. to further elucidate the relation- 
ship between organizations and technology 
(Harrison et al. 2007; Koppel et al. 2005; 
Ozbolt et al. 2012). As delivery systems shifted 
from the predominant single-institution struc- 
ture of the 1970s to the integrated delivery 
networks of the 1990s to the complex link- 
ages of the Twenty-First Century, the infor- 
mation needs changed, and the challenges of 
meeting those information needs increased 
in complexity. The patient centered medi- 
cal home (PCMH) (also known as primary 
care medical home, advanced primary care, 
and health care home) is a model of primary 
care that delivers care designed to be patient- 
centered, comprehensive, coordinated, acces- 
sible, and continuously improved through a 
systems-based approach to quality and safety 
(Patient Centered Medical Home Resource 
Center 2011). AHRQ and others (Bates and
Bitton 2010) have noted the seminal role of
HIT (e.g., health information exchange, disease
registries, alerts and reminders) in supporting
tasks related to National Committee
for Quality Assurance PCMH standards for
enhancing access and continuity, identifying
and managing patient populations, planning
and managing care, providing self-care and
community support, tracking and coordinating
care, and measuring and improving performance
(Patient Centered Medical Home
2011). Accountable Care Organizations
(ACOs) focus on care of designated populations.
A recent systematic review (Kaufman
et al. 2019) reveals that, across payer types,
ACO implementation is consistently associated
with reduced inpatient use,
reduced emergency department visits, and
improved measures of preventive care and
chronic disease management. See Chaps.
11, 15, and 18 for discussions of managing
clinical information in consumer-provider
partnerships in care, in the public health
information infrastructure, and in integrated
delivery systems.

2 http://www.ncqa.org/ (Accessed: 7/1/19).
3 https://pcmh.ahrq.gov/ (Accessed: 7/8/19).

Fig. 17.3 A personalized screen saver used at
Brigham and Women’s Hospital to display patient-specific
safety information to patients, family, and care
team. (© Brigham and Women’s Hospital, Center for
Patient Safety, Research and Practice. Reused with permission.)


= Professional Practice Models 

Professional practice models have also 
evolved for nurses and physicians. In the 
1970s, team nursing was the typical practice 
model for the hospital, and the nursing care 
plan—a document for communicating the 
plan of care among nursing team members— 
was most frequently the initial computer- 
based application designed for use by nurses. 
The 1990s were characterized by a shift to 
interdisciplinary-care approaches necessi- 
tating computer-based applications such as 
critical paths to support case management of 
aggregates of patients, usually with a com- 
mon medical diagnosis, across the contin- 
uum of care. The Twenty-First Century sees 
advanced practice nurses increasingly taking 
on functions previously provided by physi- 
cians while maintaining a nursing perspective 
on collaborative, interdisciplinary care. This 
trend is likely to accelerate given the recom- 
mendations for facilitating full scope of prac- 
tice for nurses and advanced practice nurses 
(e.g., certified nurse midwives, nurse prac- 
titioners) in the 2010 Institute of Medicine 
report on The Future of Nursing: Leading
Change, Advancing Health (Committee on the
Robert Wood Johnson Foundation Initiative
on the Future of Nursing at the Institute of
Medicine 2010). These changes broaden and
diversify the demands for decision support,
feedback about clinical effectiveness, and
quality improvement as a team effort.

Table 17.2 Societal forces that have influenced the design and implementation of patient-centered systems

Delivery-system structure, professional-practice model, payer model, and quality focus, by decade:

1970s: Single institution; team nursing; single or small group physician practice; fee for service; Professional Standards Review Organizations (PSROs); retrospective chart audit

1980s: Single organization; primary nursing; group models for physicians; fee for service; prospective payment; diagnosis-related groups; continuous quality improvement; Joint Commission on Accreditation of Healthcare Organizations (JCAHO)’s agenda for change

1990s: Integrated delivery systems; patient-focused care, multi-disciplinary care, case management; variety of constellations of physician group practice models; capitation; managed care; risk-adjusted outcomes; benchmarking; practice guidelines; critical paths/care maps; Health Employer Data and Information Set (HEDIS)

2000s: Patient-centered care; CMS P4P hospital initiative; patient safety; learning organizations; consumer-driven

2010s: Patient centered medical home; virtual care; expansion of nurse and advanced practice nurse roles to legal scope of practice; Affordable Care Act of 2010; Accountable Care Organizations; risk-bearing, coordinated care models; value-driven health care; patient/consumer-centered outcomes; promoting interoperability

General technology trends, 1990s to 2010s: World Wide Web (Web 1.0); Web 2.0; cloud computing; digitization of the health care system, ehealth/mhealth; social media; “smart” mobile devices

Physician practice models have shifted from
single physician or small group offices to complex
constellations of provider organizations.
The structure of the model (e.g., staff model
health-maintenance organization (HMO),
captive-group model health-maintenance
organization, or independent-practice association)
determines the types of relationships
among the physicians and the organizations.
These include issues, such as location of
medical records, control of practice patterns
of the physicians, and data-reporting requirements,
that have significant implications for
the design and implementation of patient-care
systems. In addition, the interdisciplinary and
distributed care approaches of the 1990s and
the 2000s have given impetus to system-design
strategies, such as the creation of a single
patient problem list, around which the patient-care
record is organized, in place of a separate
list for each provider group (e.g., nurses, physicians,
respiratory therapists). While a shared
single patient problem list remains a goal of
patient-centered care, few implementations
have achieved successful integration of the
medical problem list with the problem list
from other health professions, in particular 
nursing, due to the persistence of siloed EHR 
terminology management (Collins, Bavuso 
et al. 2017; Collins, Klinkenberg-Ramirez 
et al. 2017; Collins, Rozemblum et al. 2017). 

In the 2010s the use of shared clinical 
dashboards accelerated the operationaliza- 
tion of known patient-centered best practices, 
such as interdisciplinary clinical rounding and 
safety checklists, to drive increased situational 
awareness and promote common ground in 
relation to patient goals and preferences and 
prevention of harm (Collins, Gazarian et al. 
2014; Mlaver et al. 2017). These shared, inter- 
disciplinary, team-based dashboards are typi- 
cally configured to a particular practice and 
specialty area (e.g., cardiac ICU, emergency 
department) or a clinical quality improvement 
strategic initiative (e.g., optimizing length of 
stay, promoting effective and early discharge 
planning) (Collins, Hurley et al. 2014). In the 
ambulatory setting shared dashboards are 
also used and configured to drive population 
health initiatives. 


= Payer Models 

Changes in payer models have been a sig- 
nificant driving force for information-system 
implementation in many organizations. With 
the shift from fee for service to prospective 
payment in the 1980s, and then toward capi- 
tation in the 1990s, information about costs 
and quality of care has become an essential 
commodity for rational decision-making in 
the increasingly competitive health care marketplace.
Because private, third-party payers
often adopt federal standards for reporting 
and regulation, health care providers and 
institutions have struggled in the early 2000s 
to keep up with the movement toward data 
and information system standards acceler- 
ated by the Health Insurance Portability 
and Accountability Act (HIPAA)* and the 
initiatives to develop a National Health 
Information Network. With the advent of 
pay for performance (P4P), CMS has elimi- 
nated reimbursement for preventable condi- 
tions (e.g., catheter-associated urinary tract 
infections) that occur during hospitalizations 
(“CMS P4P,”). In this decade, there is no 
doubt that the implementation of the highly 
controversial Affordable Care Act of 2010 
and evolving ACOs will profoundly impact 
patient-centered care and the information sys- 
tems needed to support it. 


= Quality Focus 

Demands for information about quality of 
care have also influenced the design and 
implementation of patient-care systems. 
The quality-assurance techniques of the 
1970s were primarily based on retrospective 
chart audit. In the 1980s, continuous quality 
improvement techniques became the modus 
operandi of most health care organizations. 
The quality-management techniques of the 
1990s were much more focused on concur- 
rently influencing the care delivered than 
on retrospectively evaluating its quality. In 
the Twenty-first Century, patient-centered 
systems-based approaches—such as practice 
guidelines, alerts, and reminders tailored on 
patient clinical data and, in some instances, 
genomic data (i.e., personalized medicine)— 
are an essential component of quality man- 
agement. In addition, institutions must have 
the capacity to capture data for benchmark- 
ing purposes and to report process and out- 
comes data to regulatory and accreditation 
bodies, as well as to any voluntary reporting 
programs to which they belong. Increasingly, 
concurrent feedback about the effectiveness
of care guides clinical decisions in real time,
and “dashboards” are used to display indicators
related to different dimensions of quality.

4 http://hhs.gov/ocr/privacy/ (Accessed: 4/26/13).


Data science approaches using clinical 
data with a quality focus greatly expanded in 
the 2010s (Bates et al. 2014; Hruby et al. 2016). 
While much of this work has focused only on 
the subset of data within the EHR that are 
consistently structured and coded, such as 
medical diagnoses, allergies, and medication 
orders, there are increasing efforts to process 
and derive value from additional data sources, 
such as nursing assessment data and narrative 
notes (Klann et al. 2018) (see Chap. 8). The
use of high-volume clinical metadata, or data
about the clinical data, also provides value for
quality purposes, such as developing process
models of care delivery. These models allow
a richer understanding of patient-centered
workflows, and a fuller evaluation of best practices
and gaps in care, than was previously feasible
(Hripcsak & Albers 2013; Collins & Vawdrey
2012; Collins et al. 2013).
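
As a rough illustration of how clinical metadata might feed a process model of care delivery, the sketch below counts how often one documented activity directly follows another for the same patient. The event rows, field layout, and activity names are invented for this example and do not come from the cited studies; counting direct successions is only one simple way to summarize a process.

```python
from collections import Counter

# Hypothetical audit-log metadata: (patient_id, ISO timestamp, activity).
# These rows are invented for illustration only.
EVENTS = [
    ("p1", "2019-03-01T08:00", "admission assessment"),
    ("p1", "2019-03-01T09:30", "medication order"),
    ("p1", "2019-03-01T10:15", "medication administration"),
    ("p2", "2019-03-01T08:20", "admission assessment"),
    ("p2", "2019-03-01T11:00", "medication order"),
    ("p2", "2019-03-01T11:05", "discharge planning"),
]


def transition_counts(events):
    """Count how often one activity directly follows another per patient."""
    by_patient = {}
    for patient_id, timestamp, activity in events:
        by_patient.setdefault(patient_id, []).append((timestamp, activity))
    counts = Counter()
    for rows in by_patient.values():
        rows.sort()  # ISO 8601 strings sort chronologically
        activities = [activity for _, activity in rows]
        counts.update(zip(activities, activities[1:]))
    return counts


COUNTS = transition_counts(EVENTS)
```

Frequent transitions suggest the usual workflow, while rare or absent ones may point to variation or gaps in care that merit closer review.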

Of note, while advances in natural language 
processing continue, many of the comput- 
able processes outlined above, such as practice 
guidelines, alerts, and reminders, still require 
structured data capture, which typically requires 
manual data entry by clinicians (Collins, 
Couture et al. 2018). Evaluation of the value 
and downstream uses of manually captured 
structured data is an important consideration 
when configuring clinical systems to decrease 
burden on the clinician and promote ‘top of 
license’ practice. Data science approaches for 
continuous knowledge development are seen 
as an essential aspect of a Learning Health 
System. Innovations related to clinical data 
capture, such as increased capabilities and 
implementations of device-integrated data and 
voice-recognition tools, are expected to increase 
the volume, veracity, variety, and velocity of 
clinical data for data science and quality pur- 
poses. In turn, automated data capture promises 
to decrease clinician data entry burden and pro- 
vide greater opportunity for clinical data con- 
sumption, interpretation, and decision-making 
tailored to patient preferences by the care team 
as part of a patient-centered system. 

17.2.4 Advances in Patient-Centered Care Systems


The design and implementation of patient- 
care systems, for the most part, occurred 
separately for hospital and ambulatory-care 
settings. Early patient-care systems in the 
hospital settings included the University of 
Missouri-Columbia System (Lindberg 1965), 
the Problem-Oriented Medical Information 
System (PROMIS) (Weed 1975), the TriService 
Medical Information System (TRIMIS) 
(Bickel 1979), the Health Evaluation Logical 
Processing (HELP) System (Kuperman 
et al. 1991), and the Decentralized Hospital 
Computer Program (DHCP) (Ivers et al. 1983). 
Among the earliest ambulatory-care sys- 
tems were the Computer-Stored Ambulatory 
Record (COSTAR), the Regenstrief Medical 
Record System (McDonald 1976), and The 
Medical Record (TMR). For a comprehensive 
review, see Collen (1995). 

According to Collen (1995), the most 
commonly used patient-care systems in hos- 
pitals of the 1980s were those that supported 
nursing care planning and documentation. 
Systems to support capture of physicians’ 
orders, communications with the pharmacy, 
and reporting of laboratory results were also 
widely used. Some systems merged physician 
orders with the nursing care plan to provide a 
more comprehensive view of care to be given. 
This merging, such as allowing physicians and 
nurses to view information in the part of the 
record designated for each other’s discipline, 
was a step toward integration of information. 
It was still, however, a long way from sup- 
port for truly collaborative interdisciplinary 
practice. 

Early ambulatory-care systems most often 
included paper-based patient encounter forms. 
Some used a computer-scannable mark-sense 
format while others required clerical person- 
nel to type the data into the computer. Current 
desktop, laptop, or handheld systems use key- 
board, mouse, touchpad, or pen-based entry 
of structured information, with free text kept 
to a minimum. Current systems also provide 
for retrieval of reports and past records. Some 
systems provide decision support or alerts to
remind clinicians to provide needed care, such
as immunizations or screening examinations, 
and to avoid contraindicated orders for medi- 
cations or unnecessary laboratory analyses. 
The best provide good support for traditional 
medical care. Support for comprehensive, col- 
laborative care that gives as much attention to 
health promotion as to treatment of disease 
presents a challenge not only to the develop- 
ers of information systems but also to prac- 
titioners and health care administrators who 
must explicate the nature of this practice and 
the conditions under which agencies will pro- 
vide it. 
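
The reminders and alerts described above can be sketched as simple rules evaluated against the patient record. This is a minimal illustration under stated assumptions: the `Patient` fields, the vaccine name, and the age threshold are hypothetical placeholders, not clinical guidance or the logic of any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    age_years: int
    allergies: set = field(default_factory=set)
    immunizations: set = field(default_factory=set)


# Hypothetical rule table: vaccine name -> minimum age at which it is due.
# The entry is a placeholder, not clinical guidance.
REQUIRED_IMMUNIZATIONS = {"influenza": 65}


def reminders(patient):
    """Return reminders for needed care that has not been recorded."""
    return [
        f"Reminder: {vaccine} immunization due"
        for vaccine, min_age in sorted(REQUIRED_IMMUNIZATIONS.items())
        if patient.age_years >= min_age and vaccine not in patient.immunizations
    ]


def order_alerts(patient, ordered_drug):
    """Return an alert when an ordered drug matches a recorded allergy."""
    if ordered_drug.lower() in {a.lower() for a in patient.allergies}:
        return [f"Alert: {ordered_drug} is contraindicated (recorded allergy)"]
    return []
```

Both functions depend on structured data: an allergy or immunization recorded only in free text would not trigger either rule.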

Patient-care information systems in use 
today represent a broad range in the evolu- 
tion of the field. Versions of some of the 
earliest systems are still in use, although 
most organizations have migrated to com- 
mercially available EHR systems. Internally 
developed systems were generally designed to 
speed documentation and to increase legibil- 
ity and availability of the records of patients 
currently receiving care. As payment models 
shifted towards population health and value- 
based care, numerous organizations and pro- 
viders merged to form healthcare networks. 
After weighing the costs and benefits associ- 
ated with further developing the internally 
developed EHR system and expanding it to 
newly acquired healthcare providers and sites 
within their network, many organizations 
made the decision that the costs and resource 
requirements for long-term investment in the 
legacy EHR system were not sustainable. For 
some organizations this was a difficult deci- 
sion because the internally developed legacy 
EHR systems had demonstrated positive 
outcomes in terms of improved quality and 
safety. However, substantial resources were 
required to support continuous development 
and modification of those systems. 

The 2009 enactment of the HITECH Act 
and its requirements for meaningful use of 
EHR systems caused many of the commer- 
cially available EHR systems to include (or to 
plan to include) the core functionality needed 
to achieve the conditions of certification and 
to meet the population health and reporting 
requirements of the HITECH Act. Many 
legacy EHR systems had inadequate functionality
to enable compliance with the new
regulations, nor could they support emerging
value-based care models. Migration to a
single commercial enterprise EHR helped to 
merge and consolidate clinical data to a single 
instance (e.g., one patient, one record) regard- 
less of where a patient was seen within the 
expanding healthcare networks. 

More recently developed systems attempt 
with varying success to respond to the edict 
“collect once, use many times.” Selected items 
of data from patient records are abstracted 
manually or electronically to aggregate data- 
bases where they can be analyzed for admin- 
istrative reports, for quality improvement, 
for clinical or health-services research, and 
for required patient safety and public health 
reporting. Such functionality is a key aspect 
of the federal requirements for meaningful 
use and interoperability. See Chap. 19 for
a full discussion of public health informatics.

Some recently developed systems offer 
some degree of coordination of the informa- 
tion and services of the various clinical disci- 
plines into integrated records and care plans. 
Data collected by one caregiver can appear, 
possibly in a modified representation, in the 
“view” of the patient record designed for 
another discipline. When care-planning infor- 
mation has been entered by multiple caregiv- 
ers, it can be viewed as the care plan to be 
executed by a discipline, by an individual, or 
by the interdisciplinary team. Some patient- 
care systems offer the option to organize 
care temporally into clinical pathways and to 
have variances from the anticipated activities, 
sequence, or timing reported automatically. 
Others offer a patient “view” so that indi- 
viduals can view and contribute to their own 
records. For example, patients hospitalized for 
cardiac conditions can review selected aspects 
of their records and enter data about their 
symptoms such as pain ratings into CUPID 
(Computerized Unified Patient Interaction 
Device), an iPad-based application (Vawdrey 
et al. 2011), or about their goals of care (Dykes 
et al. 2017). Today, care plans are generally 
limited to one discipline, disease, or care set- 
ting. Even after a decade of HITECH fund- 
ing, a truly integrative electronic care plan 
that is comprehensive, cohesive, dynamic, and
oriented around actionable patient goals and 
preferences is still not a reality. The concept 
of longitudinal care plans (LCPs) to support 
care coordination is referenced in the ACA 
and HITECH Acts and the ensuing regula- 
tions of Meaningful Use Stages 1 and 2. This 
legislation contained financial incentives and 
penalties, describing implementing LCPs 
as necessary and dependent on technical 
interoperability (McDonald et al. 2012). CMS 
also uses plans of care as part of their rules 
for eligible providers and facilities that serve 
Medicare patients with chronic conditions, 
and for performance-based incentive models 
(Agency for Healthcare Research and Quality 
2014). However, there remains no consensus 
or definition of an LCP’s contents, nor are 
there best practices for collaborating, updat- 
ing, and reconciling the care plan across set- 
tings and with the patient and family (Dykes 
et al. 2014). 
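
The automatic reporting of variances from a clinical pathway, mentioned above, might look like the following sketch. The pathway steps, their time windows, and the function name are assumptions made for illustration; a real pathway would be defined by the care team and would also need to handle sequence and clinical exceptions.

```python
from datetime import datetime, timedelta

# Hypothetical pathway: each activity has a latest expected offset
# from admission. Steps and windows are illustrative only.
PATHWAY = [
    ("echocardiogram", timedelta(hours=24)),
    ("patient education", timedelta(hours=48)),
    ("discharge planning", timedelta(hours=72)),
]


def pathway_variances(admitted_at, completed, as_of):
    """Report timing variances from the pathway as of a given moment.

    `completed` maps an activity name to the datetime it was documented.
    Returns (activity, kind) pairs, where kind is "late" (documented after
    its window) or "overdue" (window passed and nothing documented).
    """
    variances = []
    for activity, window in PATHWAY:
        due_at = admitted_at + window
        done_at = completed.get(activity)
        if done_at is None:
            if as_of > due_at:
                variances.append((activity, "overdue"))
        elif done_at > due_at:
            variances.append((activity, "late"))
    return variances
```

Passing `as_of` explicitly keeps the check deterministic, so the same record can be evaluated during the stay or retrospectively.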

Electronic documentation of clinician 
progress notes has lagged behind other func- 
tions in EHRs (Doolan et al. 2003). The pro- 
cess of entering notes may occur through 
dictation, selecting words and phrases from 
structured lists, use of templates, and typ- 
ing free text. Amid concerns that salience 
may be lost in electronic notes (Siegler 2010), 
Johnson et al. (2008) advocated for a hybrid 
approach that combines semi-structured data 
entry and natural language processing within 
a standards-based and computer-processible 
document structure. Thus, ability for data 
re-use is preserved while maintaining clini- 
cian efficiency and expressivity. Some prog- 
ress has been made with sharing EHR notes 
with patients. OpenNotes is a national ini- 
tiative to share clinician notes with patients. 
Early research indicates that clinician note 
transparency supports patient-centered care, 
empowers informal caregivers, and engages 
less educated and diverse patients (Chimowitz 
et al. 2018). 

The publication of the Institute of 
Medicine’s reports To Err is Human (2000) and 
Crossing the Quality Chasm (2001) resulted in 
increasing demands from health care provid- 
ers for information systems that reduce errors 
in patient care. Information system vendors 
are responding by developing such systems
themselves and by purchasing the rights to 
patient care systems developed in academic 
medical centers that have demonstrated 
reductions in errors and gains in quality of 
care and cost control. Closed loop medica- 
tion systems use technologies such as bar 
codes and decision support to guard against 
errors throughout the process of prescribing, 
dispensing, administering, and recording and 
have been identified as a key intervention to 
improve medication safety. In a before-and- 
after evaluation of the closed loop electronic 
medication administration system at Brigham 
and Women’s Hospital, investigators found a 
significant reduction in the rates of transcrip- 
tion errors, medication errors, and potential 
adverse events (Poon et al. 2010). In other 
contexts, decision support systems offer clini- 
cal practice guidelines, protocols, and order 
sets as a starting point for planning indi- 
vidualized patient care; providing alerts and 
reminders; using knowledge bases and patient 
data bases to assess orders for potential con- 
traindications; and offering point-and-click 
access to knowledge summaries and full-text 
publications. See Chap. 26 for more information
about these systems.
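
The bedside verification step of a closed loop medication system can be sketched as a comparison between the active order and what is scanned at administration. The `Order` fields and identifiers below are hypothetical; this is a simplified illustration, not the design of the Brigham and Women's Hospital system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    patient_id: str
    drug_code: str
    dose_mg: float


def verify_administration(order, scanned_patient_id, scanned_drug_code, scanned_dose_mg):
    """Return the mismatches between the order and the bedside scans.

    An empty list means the scans agree with the order. Identifiers are
    hypothetical placeholders for bar-coded wristband and drug labels.
    """
    problems = []
    if scanned_patient_id != order.patient_id:
        problems.append("wrong patient")
    if scanned_drug_code != order.drug_code:
        problems.append("wrong drug")
    if scanned_dose_mg != order.dose_mg:
        problems.append("wrong dose")
    return problems


def record_administration(log, order, *scans):
    """Append to the administration record only when verification passes."""
    problems = verify_administration(order, *scans)
    if not problems:
        log.append((order.patient_id, order.drug_code, order.dose_mg))
    return problems
```

Recording the administration only after a clean verification is what closes the loop: the documented dose is, by construction, one that matched the order.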

Many health care organizations have sub- 
stantial investment in legacy systems and 
cannot simply switch to more modern tech- 
nology. Finding ways to phase the transition 
from older systems to newer and more func- 
tional ones is a major challenge to health 
informatics. To make the transition from a 
patchwork of systems with self-contained 
functions to truly integrated systems with 
the capacity to meet emerging information 
needs is even more challenging (see Chap.
16). Approaches to making this transition
are described in the Journal of the American 
Medical Informatics Association (Stead et al. 
1996). More recently, some institutions have 
applied Web 2.0 approaches to create con- 
figurable user interfaces to legacy systems. 
For example, MedWISE integrates a set of 
features that supports custom displays, plot- 
ting of selected clinical data, visualization of 
temporal trends, and self-updating templates 
as mechanisms for facilitating cognition during
the clinical decision making and documentation
process (Senathirajah, Bakken, &
Kaufman 2014; Senathirajah, Kaufman, & 
Bakken 2014). 

If patient-centered care systems are to 
be effective in supporting better care, health 
care professionals must possess the infor- 
matics competencies to use the systems. 
Consequently, many are integrating informat- 
ics competencies into health science education 
(see Chap. 25). For example, the Quality
and Safety Education for Nurses initiative has 
named and described necessary competencies 
and associated curriculum to support patient- 
centered care, including competencies related 
to quality, safety, team work, and collabora- 
tion (Cronenwett et al. 2009). Recently, infor- 
matics competencies for nursing leaders were 
validated (Yen et al. 2017). 

To what degree do patient-care disciplines 
need to prepare their practitioners for roles 
as informatics specialists? To the degree that 
members of the discipline use information in 
ways unique to the discipline, the field needs 
members prepared to translate the needs of cli- 
nicians to those who develop, implement, and 
make decisions about information systems. If 
the information needs are different from those 
of other disciplines, some practitioners should 
be prepared as system developers. 

The mere existence of information systems 
does not improve the quality of patient care. 
The adoption and use of advanced features 
(such as CDS) that are sensitive to both work- 
flow and human factors are needed to improve 
the quality of care (Stead & Lin 2009; Zhou 
et al. 2009). Recent safety reports, public pol- 
icy, and reimbursement incentives raise aware- 
ness of the need for patient-centered care 
systems. Because traditional requirements for 
EHRs were provider-centric, existing infor- 
mation systems rarely provide the compre- 
hensive suite of advanced features needed to 
support patient-centered care. However, the 
ability of systems to support patient-centered 
care is essential for achieving the vision of 
health care reform. What are the requirements 
for patient-centric information systems? How 
do these requirements drive the design of sys- 
tems that will support patient-centered care? 

17.3 Designing Systems for Patient-Centered Care


In the second decade of the Twenty-first 
Century, informaticians and clinicians increas- 
ingly share a vision for systems that support 
patient-centered care practices such as inter- 
disciplinary care planning, care coordination, 
quality reporting, and patient engagement. 
This evolution is fueled in part by meaningful 
use requirements that aim to engage patients 
and families in their health care and to improve 
care coordination and the overall quality of 
care provided. Traditional EHR functional- 
ity must be expanded to support new features, 
functions, and care practices including seam- 
less communication, interdisciplinary collabo- 
ration, and patient access to information. To 
optimize human and organizational factors 
and the integration of systems and workflow, 
these features must be built into information 
systems as core requirements, rather than as an 
afterthought. 

The Principles to Guide Successful Use of 
Health Care Information Technology described 
by the National Research Council (Stead & 
Lin 2009) provide a comprehensive frame- 
work for defining a set of core requirements 
that will support the design of systems for 
patient-centered care. This framework defines 
nine principles related to both evolutionary 
(i.e., iterative, long-term improvements) and 
radical (i.e., revolutionary, new-age improve- 
ments) changes occurring in the United States’ 
health care system. The principles and associ- 
ated system design prerequisites are included 
in Table 17.3.


17.4 Current Research Toward 
Patient-Centered Care 
Systems 


Friedman (1995) proposed a typology of the
science in medical informatics. His four categories
build from fundamental conceptualization
to evaluation as follows:

- Formulating models for acquisition, representation,
  processing, display, or transmission
  of biomedical information or knowledge
- Developing innovative computer-based
  systems, using these models, that deliver
  information or knowledge to health care
  providers
- Installing such systems and then making
  them work reliably in functioning health
  care environments
- Studying the effects of these systems on
  the reasoning and behavior of health care
  providers, as well as on the organization
  and delivery of health care


While the Friedman typology continues to 
be useful more than 20 years after its incep- 
tion, we propose extending the second and 
third categories in Sects. 17.4.2 and 17.4.3
to expand the focus from clinical informatics 
as a provider-centric discipline to a discipline 
that enables and supports patient-centric care. 
Following are examples of recent research 
with implications for patient-centered care. 


17.4.1 Formulation of Models 


For several decades, standards development 
organizations (SDOs) and professional groups 
alike have focused on the formulation of mod- 
els that describe the patient care process and 
the formal structures that support manage- 
ment and documentation of patient care. The 
efforts of SDOs are summarized in Chap. 8.
Early SDO efforts focused primarily on rep- 
resenting health care concepts such as pro- 
fessional diagnoses (e.g., medical diagnoses, 
nursing diagnoses) and actions (e.g., proce- 
dures, education, referrals). These efforts were 
complemented by professional efforts such as 
those of the Nursing Terminology Summit 
(Ozbolt 2000). As a result of multi-national 
efforts, SNOMED CT became an interna- 
tional standard that provides a formal model 
for concepts that describe clinical conditions 
and the actions of the multidisciplinary health 
care team (International Health Terminology 
Standards Development Organization 2011). 
In addition, SNOMED CT subsets have been 
developed for specific domains such as nursing
problems (Matney et al. 2012). Toward
the goal of patient-centered care, attention
has also been paid to approaches for formal
representation of terms that patients use to
describe their problems and actions (Doing-
Harris and Zeng-Treitler 2011).

Table 17.3 Principles to guide successful use of health care information technology

Evolutionary change

Principle 1. Focus on improvements in care; technology is secondary
System design prerequisites: Gaps in patient-centered care are clearly defined and
operationalized. Health care IT is employed to enable
the process changes needed to close gaps in
patient-centered care.

Principle 2. Seek incremental gain from incremental effort
System design prerequisites: An organization’s portfolio of health care IT projects
has varying degrees of investment. Each project is
linked to measurable process changes to provide
ongoing visible success with closing gaps in
patient-centered care.

Principle 3. Record available data so they can be used for care, process
improvement, and research
System design prerequisites: Health care IT systems support auto capture of data
about people, processes, and outcomes at the point of
care. Data are used in the short term to support
incremental improvements in patient-centered care
processes. An expandable data collection infrastructure
is employed that is responsive to future needs that
cannot be anticipated today.

Principle 4. Design for human and organizational factors
System design prerequisites: Clear consideration is given to sociological,
psychological, emotional, cultural, legal, economic, and
organizational factors that serve as barriers and
incentives to providing patient-centered care. Health
care IT should eliminate the barriers and enable the
incentives, making it easy to provide patient-centered
care.

Principle 5. Support the cognitive functions of all caregivers,
including health professionals, patients, and their families
System design prerequisites: Health care IT systems include advanced clinical
decision support for high-level decision-making that is
sensitive to both workflow and human factors.

Radical change

Principle 6. Architect information and workflow systems to
accommodate disruptive change
System design prerequisites: Health care IT systems are designed using standard
interconnection protocols that support the
patient-centered care processes and roles of today while
accommodating rapidly changing requirements dictated
by new knowledge, care venues, policy, and increasing
patient engagement.

Principle 7. Archive data for subsequent re-interpretation
System design prerequisites: Health care IT systems support archival of raw data to
enable ongoing review and analysis in the context of
advances in biomedical science and patient-centered
care practices.

Principle 8. Seek and develop technologies that identify and eliminate
ineffective work processes
System design prerequisites: Health care IT system design is preceded by a thorough
investigation of current and future state work processes
of all stakeholders (including patients and their
families). Health care IT systems support efficient
workflows that leverage ubiquitous access to
information and communication and are not
constrained by existing care venues or provider-centric
practice patterns.

Principle 9. Seek and develop technologies that clarify the context of data
System design prerequisites: Health care IT systems facilitate patient-centered care by
presenting information in context with patient values
and preferences and in a format that is understandable
and actionable.

In more recent years, the focus has turned 
to the development of information models 
(e.g., clinical elements model) (Oniki et al. 
2016) and formal document structures that 
support sharing of data across heterogeneous 
information systems and care coordination. In 
terms of formal document structures, Logical 
Observation Identifiers Names and Codes 
(LOINC) provides a formal naming conven- 
tion for document titles and sections (Hyun 
et al. 2009; Rajamani et al. 2015) and docu- 
ments are represented according to the HL7 
Consolidated Clinical Document Architecture 
(C-CDA) standard including the Release 2 Care 
Plan and the Continuity of Care Document 
(CCD). Matney and colleagues illustrated 
the application of the C-CDA to support the 
nursing process (Matney, Warren et al. 2016; 
Matney, Dolin et al. 2016). A CCD designed 
specifically for low socioeconomic status per- 
sons living with HIV/AIDS (PLWH) enrolled 
in a special needs plan was implemented 
for viewing by PLWH, their clinicians, and 
case managers to promote coordination and 
quality of care (Schnall, Cimino et al. 2011; 
Schnall, Gordon et al. 2011). More recently, 
a set of scalable, standards-based approaches 
has been developed to support interaction 
of external systems with the native functions 
of vendor-based EHRs. The Fast Healthcare
Interoperability Resources (FHIR), an HL7
standard, has gained traction as a mechanism 
for information exchange using a well-defined 
and limited set of resources. Of particular rel- 
evance to patient-centered care, Lee and col- 
leagues (2016) developed a FHIR profile for 
cross-system exchange of a full pedigree-based 
family health history for applications used by 
clinicians, patients, and researchers. Built upon 
FHIR, the Substitutable Medical Applications 
and Reusable Technologies (SMART) platform 
enables EHR systems to behave as ‘iPhone-like 
platforms’ through an application program- 
ming interface and a set of core services that 
support easy addition and deletion of third 
party apps, such that the core system is stable 
and the apps are substitutable (Mandel et al. 
2016). CDS Hooks is designed to invoke external
CDS services from within the EHR workflow
based upon a triggering event.⁵ Services
may return (a) information cards, which
provide text for the user to read; (b) suggestion
cards, which provide a specific suggestion for which
the EHR renders a button that the user can
click to accept, with subsequent population of
the change into the EHR user interface; and
(c) app link cards, which provide a link to an app.
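
The three card types can be illustrated with a minimal sketch of a CDS service response, written here in Python. The field names (summary, indicator, source, suggestions, links) follow our reading of the CDS Hooks specification and should be checked against the current version. The clinical content, the service label, the draft order, and the app URL are hypothetical placeholders.

```python
import json
import uuid

SOURCE = {"label": "Example CDS Service"}  # hypothetical service name


def information_card(summary, detail):
    """(a) Information card: text for the user to read."""
    return {"summary": summary, "detail": detail,
            "indicator": "info", "source": SOURCE}


def suggestion_card(summary, label, resource):
    """(b) Suggestion card: the EHR renders a button the user can accept."""
    return {
        "summary": summary,
        "indicator": "warning",
        "source": SOURCE,
        "selectionBehavior": "at-most-one",
        "suggestions": [{
            "label": label,
            "uuid": str(uuid.uuid4()),
            "actions": [{"type": "create", "description": label,
                         "resource": resource}],
        }],
    }


def app_link_card(summary, label, url):
    """(c) App link card: a link that launches an app."""
    return {
        "summary": summary,
        "indicator": "info",
        "source": SOURCE,
        "links": [{"label": label, "url": url, "type": "smart"}],
    }


def card_kind(card):
    """Classify a card by the optional fields it carries."""
    if "suggestions" in card:
        return "suggestion"
    if "links" in card:
        return "app link"
    return "information"


def build_response():
    # Hypothetical, deliberately incomplete FHIR resource for illustration.
    order = {"resourceType": "ServiceRequest", "status": "draft",
             "intent": "proposal",
             "code": {"text": "Fall risk reassessment"}}
    return {"cards": [
        information_card("Patient is due for a fall risk reassessment",
                         "Last assessment was more than 24 hours ago."),
        suggestion_card("Order a fall risk reassessment",
                        "Add reassessment order", order),
        app_link_card("Review the tailored fall prevention plan",
                      "Open fall prevention app",
                      "https://example.org/launch"),
    ]}


if __name__ == "__main__":
    print(json.dumps(build_response(), indent=2))
```

In this sketch a card's type is implied by which optional fields it carries, so the EHR can decide how to render each card without a separate type flag.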


17.4.2 Development of Innovative Systems


For the purposes of developing innovative 
patient-centered care systems, the second category
of the Friedman typology described in
Sect. 17.4 is expanded to address the use of
models that deliver information or knowledge 
to both health care providers and patients. 
Consumers regularly use information and 
communication technology to support deci- 
sion making in all aspects of their lives. 
However, access to tools to support health 
care decision making is suboptimal (Krist and 
Woolf 2011). Krist et al. (2010) proposed five 
levels of functionality for patient-centered 
health information systems. 
• Level 1: Collects patient information
  related to health status, behaviors, medications,
  symptoms, and diagnoses (e.g., electronic
  version of traditional paper records
  maintained by patients)
• Level 2: Integrates patient information
  with clinical information (e.g., personal
  health record tethered to an EHR)
• Level 3: Interprets information to provide
  context in an appropriate level of health
  literacy
• Level 4: Provides tailored recommendations
  based on patient information, clinical
  information, and evidence-based
  guidelines
• Level 5: Facilitates patient decision-making,
  ownership, and action


The levels of functionality needed to support
patient-centered health information
systems relate directly to several of the
Principles to Guide Successful Use of Health
Care Information Technology described by
the National Research Council (Stead & Lin
2009) and outlined in Sect. 17.3, specifically
principles 5, 8 and 9 (see Table 17.3).

5 https://cds-hooks.org/ (Accessed 5/31/18).

Partners Health Care System in Boston, 
MA, developed Patient Gateway, a secure 
patient portal serving over 65,000 patient 
users from primary and specialty care practices
affiliated with the Dana-Farber Cancer
Institute, Brigham and Women’s Hospital, 
and Massachusetts General Hospital. Patient 
Gateway is a tethered personal health record 
(see Chap. 11) that provides functionality
in line with the five levels described by Krist 
and Woolf. For example, tools for manage- 
ment of chronic illness are used by patients 
and providers to promote adherence with 
evidence-based health maintenance guidelines 
and to improve collaboration on diabetes self- 
management plans (Grant et al. 2006; Wald 
et al. 2009). Research on patient response and 
satisfaction with the Patient Gateway suggests 
that patients appreciate the ability to com- 
municate electronically with providers, they 
welcome greater access to their health infor- 
mation including test results, and they believe 
that Patient Gateway enables them to better 
prepare for visits (Grant et al. 2006; Schnipper 
et al. 2008; Wald et al. 2009). Evaluations 
of patient satisfaction with personal health 
records with similar levels of functional- 
ity at other sites, including Geisinger Health 
System (Hassol et al. 2004), Group Health 
Cooperative (Ralston et al. 2007), and Virginia 
Commonwealth University (Krist et al. 2010), 
are consistent with the results reported at 
Partners Health Care System. 

While traditionally patient portals have 
been associated with ambulatory care, some 
health care systems are providing modules 
within their patient portals to inform and 
activate patients and family during an acute 
hospitalization and associated transitions 
(Grossman et al. 2018). Early research indi- 
cates that in addition to the features com- 
monly found in ambulatory patient portals 
(e.g., medications, labs, educational content, 
scheduling features), the acute care modules 
provide patient access to the plan of care,
daily schedules, and direct care team communication.
Some inpatient portals support
patient-generated content such as notes, 
patient-provider messaging, and patient 
feedback related to their care plan or dis- 
charge plan. Research is needed on the use 
of acute care portals to explore their impact 
on patients’ abilities to successfully navigate 
information-rich acute care hospitalizations 
and to examine the effects of portal use on 
patient activation, engagement, and the over- 
all quality of care (Grossman et al. 2018). 
The involvement of users has been identi- 
fied as fundamental to well-designed systems 
that are usable and useful in the context of 
busy patient care workflows (Rahimi et al. 
2009). Some examples of development activi- 
ties where user involvement is needed are con- 
tent standardization, workflow modeling, and 
usability testing. 
• Content standardization: Content standardization
includes identifying EHR content needed
to support documentation of
care provided and identification of data 
needed for reuse (e.g., decision support, 
quality reporting, and research). Content 
that is shared across disciplines and 
patients is identified. Content is modeled 
using standards to ensure data reuse and 
interoperability (Principle 3, Table 17.3)
(Chen et al. 2008; Dykes et al. 2010; Kim 
et al. 2011). 
• Workflow modeling: Sound modeling of
the clinical workflow that underlies an 
electronic system is essential to designing 
systems that are usable by care team mem- 
bers (Peute et al. 2009). Workflow models 
are based on observations of current state 
clinical workflows including interactions 
with patients, staff, equipment, and sup- 
plies. Understanding of workflow interac- 
tions, including current state inefficiencies, 
informs effective and efficient future state 
workflows, use-case development, and sys- 
tem prototypes (Rausch & Jackson 2007; 
Mlaver et al. 2017). Workflow modeling of 
patient-centered systems includes clear 
evaluation of ways to use technology to 
identify and eliminate ineffective work 
processes (Principle 8, Table 17.3).
Design of new systems is an opportunity
to provide ubiquitous access to information
and communication by all care team
members including patients. Applying 
these principles in workflow modeling 
assures that future state workflows are not 
constrained by existing care venues or pro- 
vider-centric practice patterns. 

• Usability testing: A key lesson learned
from Computerized Physician Order Entry 
implementations is that electronic systems 
with poor usability interfere with clinical 
workflow. The unintended consequences 
of poorly designed systems are well 
known, and some widely disseminated 
papers (Ash et al. 2004, 2009; Koppel et al. 
2005) have called into question the safety 
of using such systems with patients. 
Examples of common usability problems 
include overly cluttered screen design, 
poor use of available screen space, and 
inconsistencies in design. Involving end- 
users in design and enforcing usability 
design standards when building clinical 
systems prevents implementing systems 
that are difficult to use and interfere with, 
rather than support, patient-centered care 
(Principles 4, 5 and 8, Table 17.3).


Innovative systems to support patient care 
often take advantage of information entered 
in one context for use in other contexts. 

For example, the Brigham and Women’s 
Hospital Patient Safety Learning Lab in 
Boston developed a provider checklist and a 
patient-centered toolkit that used informa- 
tion from the order entry, scheduling, flow- 
sheets (nursing documentation), and other 
systems to auto-populate a suite of tools used 
by clinicians and patients to improve team 
communication and patient safety. The itera- 
tive, participatory development process led to 
tools that are used every day in the medical 
intensive care units and that demonstrated 
significant reductions in adverse events and 
improvement in patient and family satisfac- 
tion (Mlaver et al. 2017). 

The principle of entering information 
once for multiple uses also drove development 
of the bedside displays for inpatients and the 
care team at Brigham and Women’s Hospital 
(Duckworth et al. 2017). A patient safety 
plan dashboard was developed that captures
disparate data from the EHR and presents 
a personalized display as the screensaver on 
the bedside computer workstation. The dash- 
board aligns all care team members, including 
patients and families, in the safety plan. The 
screensaver content includes icons that pro- 
vide actionable alerts related to patient-spe- 
cific safety concerns. These bedside displays 
combine data from many sources to support 
the integrated care of physicians, nurses, and 
family members. 

At Partners Health Care System, system 
developers are working with clinical teams 
to identify system requirements, to iteratively 
develop, and to test patient-centric systems 
that integrate decision support into the clini- 
cal workflow. For example, Dykes et al. (2010) 
developed a fall prevention toolkit that reuses 
fall risk assessment data entered into the 
clinical documentation system by nurses and 
automatically generates a tailored set of tools 
that provide decision support to all care team 
members, including patients and their family 
members at the bedside (Dykes et al. 2009). 
The fall prevention toolkit logic was developed 
from focus groups of professional and para- 
professional caregivers (Dykes et al. 2009), 
and of patients and family members (Carroll 
et al. 2010). As nurses complete and file the 
routine fall risk assessment scale, the docu- 
mentation system automatically generates a 
tailored bed poster that alerts all team mem- 
bers about each patient’s fall risk status and 
patient-appropriate interventions to mitigate 
risk. In addition, a patient education handout 
is generated that identifies why each individual 
patient is at risk for falls and what the patient 
and family members can do while in the hos- 
pital to prevent a fall. The icons used in the 
Fall TIPS poster and patient education hand- 
out have been developed and validated using 
a participatory design process with clinicians 
and patients (Leung et al. 2017; Hurley et al. 
2009). In a randomized controlled trial of over
10,000 patients, the toolkit was associated with 
a 25% reduction in falls (Dykes et al. 2010). 
The Fall TIPS toolkit reduced falls by leverag- 
ing HIT to complete the three-step fall preven- 
tion process: (1) conduct fall risk assessments, 
(2) develop tailored fall prevention plans with 
evidence-based interventions, and (3) consistently
implement the plan. We learned that
Fall TIPS was most effective at reducing falls 
and related injuries when patients and family 
were engaged in all three steps of the fall pre- 
vention process (Dykes et al. 2017). 
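
The reuse pattern described above, in which a single filed risk assessment drives both the bed poster and the patient handout, can be sketched as follows. This is a minimal illustration in Python. The risk-factor names, intervention wording, and function names are hypothetical placeholders and do not reproduce the validated Fall TIPS logic.

```python
# Hypothetical rule table: risk factor -> (intervention for staff, advice for patient).
RULES = {
    "history_of_falls": ("Review fall history at handoff",
                         "Tell your nurse if you have fallen recently."),
    "walking_aid": ("Keep walking aid within reach",
                    "Use your walking aid every time you get up."),
    "unsteady_gait": ("Assist with walking and toileting",
                      "Call for help before getting out of bed."),
}


def tailored_items(risk_factors):
    """Return the rule entries that match the documented risk factors."""
    return [RULES[f] for f in sorted(risk_factors) if f in RULES]


def bed_poster(risk_factors):
    """Text for the care team: risk status plus tailored interventions."""
    items = tailored_items(risk_factors)
    status = "AT RISK FOR FALLS" if items else "No fall risk factors documented"
    return "\n".join([status] + ["- " + staff for staff, _ in items])


def patient_handout(risk_factors):
    """Plain-language text for the patient and family."""
    return "\n".join("- " + advice for _, advice in tailored_items(risk_factors))


if __name__ == "__main__":
    assessment = {"walking_aid", "unsteady_gait"}  # entered once by the nurse
    print(bed_poster(assessment))
    print(patient_handout(assessment))
```

Both outputs are derived from the same assessment, so the poster and the handout cannot drift apart, and no one has to re-enter the risk factors to produce either one.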


17.4.3 Implementation of Systems


Much has been written about HIT failures and 
associated costs and consequences (Bloxham 
2008; Booth 2000; McManus & Wood-Harper 
2007; Ornstein 2003; Rosencrance 2006). 
Higgins and associates (see Rotman et al. 
1996) described the lessons learned from a 
failed implementation of a computer-based 
physician workstation that had been designed 
to facilitate and improve ordering of medica- 
tions. Those lessons are not identical to, but 
are consistent with, the recommendations of 
Leiner and Haux (1996) in their protocol for 
systematic planning and execution of projects 
to develop and implement patient-care systems. 
While long term follow-up of a vendor EHR 
implementation with advanced CDS identified 
lower prescribing error rates, achieving prior 
levels of perceived prescribing efficiency took 
nearly 2 years (Abramson et al. 2013, 2016). 
In response to evidence of unintended 
consequences and clinicians voicing concerns 
after system implementations, the American 
Medical Informatics Association (AMIA) 
EHR 2020 Task Force on the Status and 
Future Directions of EHRs published a report 
in 2015 that outlined 10 recommendations 
that span five areas. These recommendations 
are in response to current barriers to quality 
care delivery experienced at many organiza- 
tions after EHR system implementations. The 
five areas addressed were: simplify and speed 
documentation, refocus regulation, increase 
transparency and streamline certification, fos- 
ter innovation, and support person-centered 
care delivery. Specific recommendations dis- 
cussed decreasing documentation burden, 
improving the designs of interfaces so that 
they support and build upon how people 
think (i.e., cognitive-support design), and pro- 
moting the integration of EHRs into the full 
social context of care, moving beyond acute
care and clinic settings to include home health,
specialist care, laboratory, pharmacy, popula- 
tion health, long-term care, and physical and 
behavioral therapies (Payne et al. 2015). 
Several studies have quantified documen- 
tation burden. In one setting, resident phy- 
sicians spent 85 minutes per day authoring 
and viewing notes (Hripcsak et al. 2011). In 
another setting, on average, nurses perform 
631-662 manual flowsheet data entries per 
12-hour shift (excluding device integrated 
data), averaging to one data point every 0.82- 
1.14 minutes in acute care (Collins, Couture 
et al. 2018). Further, EHR log file analyses 
indicate nurses spend 21.4-38.2 minutes per 
day authoring notes, on average (Collins, 
Couture et al. 2018), yet fewer than 20% of 
nursing notes were read by physicians, and 
only 38% were read by other nurses (Hripcsak 
et al. 2011). There is an overall lack of stan- 
dardization and consistency of data defini- 
tions within EHRs, leading to a proliferation 
of data elements that contribute to EHR 
burden and inhibit interoperability and auto- 
mated reporting (Zhou et al. 2016; Collins, 
Bavuso et al. 2017; Collins, Klinkenberg- 
Ramirez et al. 2017; Collins, Rozemblum et 
al. 2017). Methods to increase consistency of 
data definitions have been published, but are 
often minimally implemented due to project 
timelines and limited resources (Collins et al.
2015; Collins, Bavuso et al. 2017; Collins,
Klinkenberg-Ramirez et al. 2017; Collins,
Rozemblum et al. 2017). Efforts by national
organizations to improve consistency of data
definitions include the American Medical
Association’s Integrated Health Model
Initiative⁶ and collaborations between the
Office of the National Coordinator for Health
Information Technology (ONC) and CMS.⁷
As these experiences demonstrate, the
implementation of patient-care systems is
far more complex than the replacement of
one technology with another. Such systems
transform work and organizational relationships.
If the implementation is to succeed,
attention must be given to these transformations
and to the disruptions that they entail.
Southon et al. (1997) provided an excellent
case study of the role of organizational
factors in the failed implementation of a
patient-care system that had been successful
in another site.

6 AMA. Integrated Health Model Initiative, 2018.
https://ama-ihmi.org (Accessed 9/25/18).

7 ONC/CMS Reducing Clinician Burden Meeting.
February 22, 2018.
https://www.healthit.gov/news/events/onccms-reducing-clinician-burden-meeting
(Accessed 9/25/18).

To realize the promise of informatics for 
health and clinical management, people who 
develop and promote the use of applications 
must anticipate, evaluate, and accommodate 
the full range of consequences. In early 2003, 
these issues came to the attention of the public 
when a large academic medical center decided 
to temporarily halt implementation of its 
CPOE system due to mixed acceptance by the 
physician staff (Chin 2003; Ornstein 2003). 
A case series study by Doolan et al. (2003) 
identified five key factors associated with suc- 
cessful implementation: (1) having organiza- 
tional leadership, commitment, and vision; 
(2) improving clinical processes and patient 
care; (3) involving clinicians in the design and 
modification of the system; (4) maintaining 
or improving clinical productivity; and (5) 
building momentum and support amongst 
clinicians. Ten AMIA working groups and the
International Medical Informatics Association
Working Group on Organizational and Social
Issues cosponsored
a workshop to review factors that lead
to implementation failure. These include poor 
communication, complex workflows, and fail- 
ure to engage end-users in clearly defining 
system requirements. Recognizing that the 
problems encountered in failed implementa- 
tions tend to be more administrative than 
technical, they recommended the following set 
of managerial strategies to overcome implementation
barriers: (1) provide incentives for
adoption and remove disincentives; (2) iden- 
tify and mitigate social, IT, and leadership 
risks; (3) allow adequate resources and time 
for training before and after implementation, 
including ongoing support; and (4) learn from 
the past and from others about implementa- 
tion successes and failures and about how 
failing situations were turned around (Kaplan 
and Harris-Salamone 2009). 
In 2014, the SAFER (Safety Assurance
Factors for Electronic Health Record 
Resilience) Guides were first published to 
facilitate proactive risk assessments of EHR 
safety and usability related policies, processes, 
procedures, and configurations at healthcare 
organizations (Ash et al. 2016). These guides 
are endorsed by the ONC and available at 
HealthIT.gov. The SAFER Guides include
nine guides organized into three broad groups: 
foundational guides, infrastructure guides, 
and clinical process guides. Recent evalua- 
tions indicate that health organizations would 
benefit from broader implementation of these 
guides and principles of safety and usability 
(Ash et al. 2016; Sittig et al. 2018). 

For the purposes of promoting successful 
implementation of patient-centered systems, 
the third category of the Friedman typology is 
expanded to provide access to information to 
all team members including patients and their 
families or caregivers outside of traditional 
health care settings as follows: Installing such 
systems and then making them work reliably in 
functioning health care environments and other 
settings where information is needed to promote 
health and wellbeing. The majority of self- 
management occurs outside traditional health 
care settings. As noted in Table 17.3, a prerequisite
for patient-centered systems is that
they support efficient workflows with ubiqui- 
tous access to information and communication 
and that the systems are not constrained by 
existing care venues or provider-centric practice 
patterns (Principle #8). Clinical workflows are 
highly complex and data-rich, requiring formal 
analysis and evaluation before and after system 
implementation. For example, a time-motion 
study found that, on average, nurses engage in 
31 communications and 52 hands-on tasks per 
hour, and multi-task 18.63% of the time (Yen 
et al. 2016). Strategies to involve users in sys- 
tem design or selection and customization will 
support successful implementation of systems 
that meet user expectations (Burley et al. 2009; 
Rahimi et al. 2009; Saleem, Russ, Justice et al. 
2009; Saleem, Russ, Sanderson et al. 2009). 
User involvement in defining future workflows 
contributes to a shared understanding about 
the impact of information systems on clinical 
tasks and workflows (Leu et al. 2008). 
Careful attention to the Principles to Guide
Successful Use of Health Care Information 
Technology during system design will support 
successful implementation. For example, the 
principles related to evolutionary health care 
changes keep the focus on designing and imple- 
menting usable systems that enable patient- 
centered care practices. Principles related to 
radical change focus on development of flex- 
ible, adaptable systems that are architected to 
accommodate disruptive change and iterative 
development based on end-user feedback. 


17.4.4 Effects of Clinical Information Systems on the Potential for Patient-Centered Care


Electronic health records and CPOE systems 
are intended to support safe, evidence-based, 
patient-centered care by examining patient- 
specific information, agency-specific infor- 
mation, and domain-specific information in 
the clinical context and proposing appropri- 
ate courses of action or alerting clinicians 
to potential dangers. Many current systems, 
however, fail to follow design principles that 
take into account the real contingency-driven, 
non-linear, highly interrupted, collaborative, 
cognitive, and operational workflow of clini- 
cal practice (Ash et al. 2004). These flaws can 
lead to errors in entering and retrieving infor- 
mation, cognitive overload, fragmentation of 
the clinical overview of the patient’s situation, 
lack of essential operational flexibility, and 
breakdown of communication. Physicians, in 
particular, have found themselves chagrined 
by changes in the power structure as they have 
devoted more time to entering information 
and orders while other members of the health 
team have gained greater access to information 
and the concomitant capacity to make certain 
decisions without consulting the physician. 
Clinicians across the range of professions 
have expressed concern about the decrease 
in face-to-face communication with its ver- 
bal and non-verbal richness, negotiation, and 
redundant safety checks as more and more 
clinical information is exchanged via the computer
(Campbell et al. 2006; Ash et al. 2007).
These and other unintended consequences of 
EHRs and CPOE systems are the subject of 
ongoing research. Detecting and finding ways 
to prevent or mitigate the adverse, unintended 
consequences of these systems will be critical 
for supporting patient-centered care. 

A number of unintended consequences 
stem from the incompatibility of system 
design with the clinician’s cognitive workflow. 
For example, systems that make it difficult 
to find and retrieve information can inter- 
fere with patient-centered care. In a hospital 
preparing to implement a commercial CPOE 
system, investigators compared the efficiency, 
usability, and safety of information retrieval 
using the vendor’s system, the current paper 
form, and a prototype CPOE developed on 
principles of User Centered Design. They 
found the prototype system to be similar to 
the paper form and both to be significantly 
superior to the vendor’s system in efficiency, 
usability, and safety (Chan et al. 2011). 

Other unintended consequences arise 
from over-reliance on the information system 
because of limited understanding of its design 
and capacities. To date, many CDSS are some- 
what limited in their ability to incorporate 
patient-specific data into their decision algo- 
rithms. A synthesis of 17 systematic reviews 
conducted with sound methodology found 
that CDSS often improved providers’ perfor- 
mance, especially in medication orders and 
preventive care. The reviewers noted, “These 
outcomes may be explained by the fact that 
these types of CDSS require a minimum of 
patient data that are largely available before 
the advice is (to be) generated: at the time 
clinicians make the decisions” (Jaspers et al. 
2011, p. 327). 

On the other hand, many systems offer 
functionalities that support patient-centered 
care. An important component of patient- 
centered care is the application of evidence in 
a plan of care tailored to the patient’s needs. 
To increase the use of evidence-based order 
sets, investigators at Sinai-Grace Hospital in 
Detroit, Michigan embedded into the general 
admission order set specific, evidence-based 
order sets for the most common primary and 
secondary diagnoses for patients admitted to 
their medical service. The result was a fivefold
increase in the use of evidence-based order 
sets in the 16-month period following imple- 
mentation (Munasinghe et al. 2011). 

As in the examples above, most systems 
to support the cognitive workload have been 
directed toward physicians. Patient-centered 
care requires a broader perspective. A study 
of the information needs of case managers 
for PLWH found that the most frequent needs 
were for patient education resources (33%), 
patient data (23%), and referral resources 
(22%) (Schnall, Cimino et al. 2011). The inves- 
tigators recommended that targeted resources 
to meet these information needs be provided in 
EHRs and continuity of care records through 
mechanisms such as the Infobutton Manager. 

Key to patient-centered care is communi- 
cation among all the health care professionals 
on the patient’s team. A study at one academic 
medical center (Hripcsak et al. 2011) reviewed 
EHRs of hospitalized patients, along with 
usage logs, to make inferences about time 
spent writing and viewing clinical notes and 
patterns of communication among team mem- 
bers. In this setting, the core team for each 
patient consisted of one or more attending 
physicians, residents, and nurses, with social 
workers, dieticians, and various therapists 
joining the team later. Results showed that 
clinical notes were more likely to be reviewed 
within the same professional group, with 
attending physicians and residents viewing 
notes from nurses or social workers less than 
a third of the time. The investigators proposed 
that it might be useful to develop ways for 
EHRs “to summarize information and make 
it readily available, perhaps with the ability 
of the author to highlight information that 
may be critical and that has a high priority for 
communication” (p. 116) (Hirsch et al. 2014). 
They also noted that their study was limited to 
communications within the EHR and did not 
take into account face-to-face or telephone 
communications that might have occurred, 
especially in urgent situations. They suggested 
further research involving direct observation 
of clinicians, time-motion analyses, and think- 
aloud methods to develop deeper knowledge 
of how clinicians communicate about patient 
care, especially across the professions. 

The quality of documentation tools can
have a profound effect on whether informa- 
tion, even if communicated face-to-face, is 
acted upon in clinical care (Collins et al. 2011). 
In a neurovascular intensive care unit, thera- 
peutic goals for patients were stated during 
daily interdisciplinary rounds. In this setting, 
the interdisciplinary team treated the attend- 
ing physician’s note as a common patient- 
focused source of information. Although the 
attending physician’s note contained 81% of 
the stated ventilator weaning goals, it included 
only 49% of the stated sedation weaning 
goals. Overall, nearly a quarter of stated goals 
were not documented in the note. If a goal 
was not documented, it was 60% less likely 
to have a related action documented. Nurses’ 
documentation rarely mentioned the goals, 
even if actions recorded were consistent with 
the goals as stated during rounds. Notably, 
the nurses’ structured documentation sys- 
tem did not support sedation-related goals, 
even though sedation weaning was a nurs- 
ing responsibility in this setting. The authors 
noted that the frequent omission of sedation 
goals from the attending physician’s note 
might be because this nursing function was 
not a billable goal or act. They also expressed 
concern that the omission from the EHR of 
evidence of important clinical judgments 
nurses make could impair patient safety, qual- 
ity improvement, and development of nursing 
knowledge. Thus, in this example, although 
the interdisciplinary team was collaborating 
in setting and reaching therapeutic goals, defi- 
ciencies in their processes and in the nurses’ 
documentation system limited their achieve- 
ment of patient-centered care. 

A study at Vanderbilt University Medical 
Center also demonstrated both strengths and 
shortcomings in the ability of a clinical infor- 
mation system to support patient-centered 
care. Attending physicians in the Trauma 
and Surgical ICUs established protocols for 
Intensive Insulin Therapy that were built into a 
CDSS to advise nurses on insulin doses based 
on a patient’s blood glucose and insulin resis- 
tance trends. In 94.4% of studied instances, 
nurses administered the recommended dose. 
When nurses overrode the recommended dose, 
they overwhelmingly administered less insulin
than the recommended dose, leading to a
higher incidence of hyperglycemia than when 
the recommended dose was administered. 
Nurses appeared to be more concerned about hypoglycemia
than hyperglycemia and to consider
the patient’s blood glucose but not the insulin
resistance trend. They also noted that their
workflow was impeded by the need to record 
information about the blood glucose, insu- 
lin dose, and primary dextrose source in two 
places—the CPOE that included the CDSS and 
the separate nurse charting system. The inves- 
tigators’ recommendations included display- 
ing information about insulin resistance trends 
on screen (provided that this did not produce 
clutter and confusion) and developing clinical 
information systems that do not require dou- 
ble documentation. Strengths of this example 
for supporting patient-centered care include 
the collaboration of physicians and nurses in 
maintaining blood glucose in the desired range 
based on patient data. Shortcomings include 
the failure to present nurses with information 
about the patient’s insulin resistance trend to 
aid their decision-making and the requirement 
that they record the same data in two places, 
thereby reducing time for direct patient care 
(Campion Jr et al. 2011). 

Patient-centered care systems not only 
support the cognitive work and communica- 
tions of clinicians; they also take into account 
the resources patients and families use to 
manage their health concerns. Increasingly, 
patients and family members engaged in pro- 
moting their own health turn to social net- 
working sites on the Internet. While research 
on the impact of social networking sites on 
patient-related outcomes is in its infancy, early 
research indicates that social networking sites 
are used by patients and family members to 
get informational and social support for day- 
to-day management of chronic illnesses such 
as diabetes and heart failure (Mogi et al. 2017; 
Partridge et al. 2018). Several studies indicate 
that the primary objectives for using social 
networking sites were to request information, 
to provide information to others with similar 
conditions, to express emotion about one’s 
own condition, to provide emotional support 
to others, and to promote a specific product 
(Greene et al. 2011; Mogi et al. 2017). Many 
health care professionals recognize the poten- 
tial of social networking sites for the peer sup- 
port that patients gain from participation, but 
they have concerns about the accuracy of the 
content that is offered on these sites. Available 
research indicates great variability in the accu- 
racy and effectiveness of the clinical content 
offered on social networking sites (Greene 
et al. 2011; Mogi et al. 2017). One study found 
that approximately one-third of the posts on 
health-related social networking sites could be 
classified as an advertisement of a non-FDA- 
approved remedy or cure (Greene et al. 2011). 
It was also noted that requests for personal 
information were not uncommon, potentially 
making participants vulnerable to solicita- 
tions from product manufacturers or vendors. 
Clinicians should be aware of the social net- 
working sites related to their area of expertise. 
Discussing social networking options with 
patients, including the advantages and disad- 
vantages and the potential benefits and prob- 
lems, can help patients to be more judicious 
consumers of social networking sites. 

In patient-centered care, personal health 
records (see Chap. 13) are often viewed as a
means of communication between patients and
providers and as a method of engaging patients
in understanding and acting in the interests of 
their own health. A 2016 international study
of perspectives on sharing data with patients
categorized countries that encourage patient
access to clinical data into three stages of
maturity: Established, Emerging, and Limited.
Countries with “Established”
levels of maturity for sharing patient data 
had been focused on this work since the early 
2000s and included Israel, England, Canada, 
Australia, and the United States (Prey et al. 
2016). Expanded efforts to share patient 
data led to “Emerging” status for Austria, 
Argentina, Brazil, the Netherlands, Portugal, 
South Korea, Switzerland, and Uruguay. 
Iran, Japan, and Kenya were described as 
focused on EHR implementation with “lim- 
ited” patient engagement and the potential 
for increased focus on patient engagement in 
the future. In 2013, ONC announced the Blue 
Button initiative, promoting patients’ legal 
rights to receive their personal health information
and recognizing sites that enable consumers
to download their health records. A 2017
ONC Data Brief reported that 52 percent of 
individuals have been offered online access to 
their medical record in the United States and 
over half of those individuals viewed their 
record within the past year, the equivalent of 
28 percent of individuals nationwide (Patel & 
Johnson 2018). 
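
The relationship among these figures can be checked with simple arithmetic. The short Python sketch below uses only the two percentages quoted above; the variable names are illustrative, and the result is approximate because the published percentages are rounded.

```python
# Figures quoted in the text (Patel & Johnson 2018); both are rounded.
offered_share = 0.52      # share of individuals offered online access
viewed_nationwide = 0.28  # share of all individuals who viewed their record

# Implied share of those offered access who went on to view their record.
viewed_given_offered = viewed_nationwide / offered_share

print(f"Viewed, among those offered access: {viewed_given_offered:.0%}")
```

The implied rate of roughly 54 percent is consistent with the statement that over half of the individuals offered access viewed their record.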

Most of the clinical data being shared 
through portals is structured, coded informa- 
tion such as problem lists, allergies, medication 
lists, appointments, and laboratory results. 
Sharing of narrative notes is not widespread, 
but, as noted previously, the OpenNotes proj- 
ect has demonstrated the successful sharing 
of notes with patients. To date, over 28 mil- 
lion patients have online access to notes, and 
the network of providers participating in this 
project continues to grow.⁸

Patient portals, which evolved as a mecha- 
nism to access personal health data and popu- 
late personal health record platforms, began 
to take hold in the 2000s. Several pilots during 
this time, such as the Military Health System’s 
MiCare portal in 2008, resulted in critical 
learnings and informed optimization of por- 
tal functionality over the following decade. 
Lessons learned include the following: (1) 
transfer of data upon specific patient request 
is more efficient than automatic transfer; (2) 
patient representatives prefer instant access 
to all their data, while many providers prefer 
an embargo time, particularly for release of 
sensitive results data; (3) inefficient provider 
access to personal health records and siloed 
repositories with incomplete information 
might pose the danger of ill-informed clinical 
decisions; and (4) giving patients the power 
to determine what medical information to 
share with the provider could similarly lead to 
clinical decisions made in the absence of vital 
information, with resulting harm to patients. 
The MiCare pilot concluded that while there 
is broad agreement on desired functional- 
ities for portals, challenging tensions remain 
between patients’ desire for access to and 
control of health information and providers’ 


8 https://www.opennotes.org/ (Accessed 9/25/18).

needs for full information about the patient
and for appropriate opportunities for ethical 
disclosure of information to patients. Throughout
the 2010s, core
functionality for portals continued to expand 
and mature. Common features now include 
secure messaging, prescription renewals and 
refills, appointment requests online, online bill 
pay, referral requests, medication reminders, 
links to reference materials, use of low health 
literacy terms, and increasing access to several 
types of clinical data. Clinical data released in 
portals include routine and sensitive labora- 
tory results, genetic test results, medications, 
encounter information, allergies, immuniza- 
tions, radiology and other diagnostic reports, 
problem lists, discharge summaries, and clini- 
cal notes. In many organizations embargo 
periods for releasing results have lessened or 
been eliminated, although variability remains. 
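
One way to reason about embargo periods is as a release rule keyed to the type of result. The following Python sketch is illustrative only: the result categories, the embargo durations, and the function names are assumptions made for this example and do not describe any particular portal or organizational policy.

```python
from datetime import datetime, timedelta

# Hypothetical embargo periods per result type (illustrative values only).
EMBARGO = {
    "routine_lab": timedelta(0),        # released immediately
    "sensitive_lab": timedelta(days=3),
    "genetic_test": timedelta(days=7),
}
# Assumed fallback for result types not listed above.
DEFAULT_EMBARGO = timedelta(days=1)


def release_time(result_type: str, resulted_at: datetime) -> datetime:
    """Return the earliest time a result may appear in the portal."""
    return resulted_at + EMBARGO.get(result_type, DEFAULT_EMBARGO)


def visible_to_patient(result_type: str, resulted_at: datetime, now: datetime) -> bool:
    """True once the embargo period for this result type has elapsed."""
    return now >= release_time(result_type, resulted_at)


resulted = datetime(2020, 1, 1, 8, 0)
print(release_time("genetic_test", resulted))
print(visible_to_patient("sensitive_lab", resulted, resulted + timedelta(days=2)))
```

Setting every duration to zero models an organization that has eliminated embargo periods, which makes the variability among organizations a matter of configuration rather than of program logic.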

As portal functionalities expanded and 
were optimized, acute care patient portals 
(portal access tailored to the hospital setting) 
began to emerge. Although some clinicians 
feared that patient portal access to data dur- 
ing a hospitalization would burden them with 
excessive patient inquiries, their fears were 
no more realized than were those of clini- 
cians wary of the OpenNotes project (Collins, 
Bavuso et al. 2017; Collins, Klinkenberg- 
Ramirez et al. 2017; Collins, Rozemblum et al. 
2017). User-centered design with patients and 
families and other key clinician and adminis- 
trative stakeholders specified that acute care 
portals could provide value by humanizing 
the patient-clinician connection, facilitating 
the maintenance and sharing of verbal com- 
munication, and promoting ubiquitous and 
equitable access to information. Key features 
specified for portals in the acute care setting 
are the provision of clinical data, messaging 
with clinicians, glossary of clinical/hospital 
terms, patient education resources, patient 
diary, patient notepad for reminders, resources 
that support family involvement, and tiered 
displays for information-dense clinical data. 
New clinical workflows are required to inte- 
grate portals within the acute care setting, and 
these are facilitated by active clinician engage- 
ment and demonstration of improved patient 
outcomes and satisfaction (Collins, Bavuso et 
al. 2017; Collins, Klinkenberg-Ramirez et al.
2017; Collins, Rozemblum et al. 2017). Use of
patient portals in the acute and post-acute settings
is increasing but is not yet routine.

In 2018, seven key focus areas were identified
for sociotechnical and evaluation research 
related to patient engagement and portal 
use in the acute and post-acute care settings. 
These identified research areas were (1) 
standards for interoperability, functional- 
ity, and patient-driven use case models; (2) 
appropriate access and policies for privacy 
and security; (3) user-centered design; (4) 
implementations that integrate with work- 
flows for sustainable adoption; (5) data and 
content management and visualizations 
with inclusion of novel data sources (e.g., 
multimedia tools, safety reporting plat- 
forms, social determinants of health [SDOH] 
data); (6) CDS for patients and care part- 
ners; and (7) systematic evaluation of pro- 
cess, balance, and outcomes measurement 
(Collins, Dykes et al. 2018). 

Importantly, patients expect that patient 
portals across ambulatory and acute care set- 
tings are seamless and integrated for ease of 
access within the patient’s control. Seamless 
data access within the patient’s control is not 
yet broadly realized, but several technical 
and policy initiatives offer promising paths 
that were not possible previously. The federal 
21st Century Cures Act (Cures Act) includes 
provisions to improve patients’ access to their
health data and to simplify patients’ ability
to share their information electronically.
Aligned with those provisions, there are
ongoing interoperability efforts with a pri- 
mary focus on connecting mobile health apps 
and devices to EHRs using open application 
programming interfaces (APIs) to allow indi- 
viduals to collect, manage, and share their 
health information. ONC, aligned with the 
Cures Act, promotes policy choices that will 
give consumers, clinicians, and innovators 
more options for getting to and using health 
information with specific HIT certification 
criteria that call for the development of mod- 
ern APIs that do not require “special effort” 
to access and use (Rucker 2018). ONC pro- 
motes the use of the FHIR standard for rep- 
resenting clinical data with APIs. 
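
To make the idea of an open API concrete, the following Python sketch shows how a patient-facing app might form a FHIR search request and read the response. It is a minimal illustration under stated assumptions: the server address, the patient identifier, and the sample response are invented for this example, and no network call is made. The search pattern (`Observation?patient=...&category=...`) and the Bundle and Observation field names follow the FHIR specification.

```python
from urllib.parse import urlencode

# Hypothetical FHIR server base URL (an assumption, not from the text).
FHIR_BASE = "https://fhir.example.org/r4"


def observation_search_url(base: str, patient_id: str, category: str) -> str:
    """Build a FHIR search URL for one patient's observations."""
    query = urlencode({"patient": patient_id, "category": category})
    return f"{base}/Observation?{query}"


def summarize_bundle(bundle: dict) -> list:
    """Return one readable line per Observation in a FHIR searchset Bundle."""
    lines = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Observation":
            continue  # skip any other resource types in the Bundle
        name = resource.get("code", {}).get("text", "unknown")
        quantity = resource.get("valueQuantity", {})
        lines.append(f"{name}: {quantity.get('value')} {quantity.get('unit', '')}".strip())
    return lines


# Invented sample response, shaped like a FHIR searchset Bundle.
SAMPLE_BUNDLE = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {"resourceType": "Observation",
                      "code": {"text": "Glucose"},
                      "valueQuantity": {"value": 95, "unit": "mg/dL"}}},
        {"resource": {"resourceType": "Observation",
                      "code": {"text": "Hemoglobin A1c"},
                      "valueQuantity": {"value": 6.1, "unit": "%"}}},
    ],
}

url = observation_search_url(FHIR_BASE, "example-patient", "laboratory")
summary = summarize_bundle(SAMPLE_BUNDLE)
print(url)
print(summary)
```

In practice, an app would also need to obtain the patient's authorization before the server would return data; that step is omitted here.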


Electronic health records and other 
computer-based information resources can 
influence the provision of patient-centered 
care even when the patient and the provider 
are in the same room. A study of computer 
use during acute pediatric outpatient vis- 
its found that female physicians were more 
likely than males to be communicating with 
patients and families while using the com- 
puter (Fiks et al. 2011). A recent study in 
ambulatory care revealed that although pro- 
viders reported improvements attributable 
to EHRs (e.g., communication between pro- 
viders, review of results with patients, and 
review of follow-up to testing results with 
patients), they perceived a negative effect 
on patient-provider connection (Sandoval 
et al. 2016). An observational study involv- 
ing 20 primary care physicians and 141 of 
their adult patients showed how the inclu- 
sion of the computer in the clinical consul- 
tation can help patients shift the balance of 
power and authority toward shared decision- 
making and patient-centered care (Pearce 
et al. 2011). This Australian study found 
that about one-third of the patients actively 
included the computer as a party to the con- 
sultation, drawing the physician’s attention 
to it as a source of information or authority.
They concluded, “In the future, computers
will have greater agency, not less, and patients 
will involve themselves in the three-way con- 
sultation in more creative ways—for example, 
through online communication, or through the 
plugging into computers of their own electronic 
records, creating a situation where they co-own 
the information in the computer .... By democ- 
ratizing and commoditizing information flows 
and authority in the consultation, we may in 
fact create truly patient-centered medicine, 
with the patient directing the action” (p. 142). 

As these examples illustrate, the com- 
plexity of collaborative, interdisciplinary, 
patient-centered care poses serious chal- 
lenges to the design of clinical informa- 
tion systems. Many systems fall short in 
supporting cognitive work, even from a 
clinician-centric perspective. Supporting 
communications among clinicians, between 
clinicians and patients, and among patient 
and family support groups presents myriad
technical and ethical problems. Still,
researchers and clinicians increasingly share 
a vision of patient-centered care that drives 
them to push the frontiers and develop sup- 
port for this emerging model of care. 


17.5 Outlook for the Future 


Social and political forces have begun to 
transform health care in the United States, 
and HIT is advancing to support the 
changes. The transformation is rapid, dis- 
ruptive, and not always smooth, but man- 
dates and incentives are aligning with social 
and economic imperatives to maintain prog- 
ress. 

To meet demands for patient-centered 
care, changes must occur in clinician prac- 
tice patterns and processes, in the organiza- 
tion and management of health services, and 
in the education of health care professionals 
and the public. To support patient-centered 
care, clinicians, informatics professionals, and 
computer scientists must develop health infor- 
mation and communication technologies that 
support collaboration; cognitive processes and 
operational workflow; communication and 
shared decision making between and among 
clinicians, patients, and family members; and 
trustworthy tools for the management of per- 
sonal and family health. 

Transformational change is daunting, and 
resistance is inevitable. Still, the chances for 
success have never been better. The vision 
of health care articulated by the National 
Academies of Sciences is guiding policy, 
research, and practical action by govern- 
ment agencies, health care providers, and the 
public. 

Questions for Discussion

1. What is the utility of a linear model of 
patient care as the basis for a decision- 
support system? What are two primary 
limitations? Discuss two challenges 
that a nonlinear model poses for 
representing and supporting the care 
process in an information system. 

2. Compare and contrast additive, 
clinician-centered versus coordinated, 
patient-centered models of interdisci- 
plinary patient care. What are the advan- 
tages and disadvantages of each model 
as a mode of care delivery? What are the 
broad implications for design of infor- 
mation systems to support clinician- 
centered versus patient-centered models 
of care? 

3. Imagine a patient-care information sys- 
tem that assists in planning the care of 
each patient independently of all the 
other patients in a service center or 
patient-care unit. What are three advan- 
tages to the developer in choosing such 
an information architecture? What 
would be the likely result in the real 
world of practice? Does it make a differ- 
ence whether the practice setting is hos- 
pital, ambulatory care, or home care? 
What would be the simplest information 
architecture that would be sufficiently 
complex to handle real-world demands? 
Explain. 

4. Zielstorff et al. (1993) proposed that 
data routinely recorded during the pro- 
cess of patient care could be abstracted, 
aggregated, and analyzed for manage- 
ment reports, policy decisions, and 
knowledge development. What are three 
advantages of using patient care data in 
this way? What are three significant limi- 
tations? 


5. Over the past decade, many of the 
patient-care information systems 
designed in the 1970s have been replaced 
by vendor-based systems. What role 
have public policy and payment models 
had in driving this change? How do the 
practice models, payer models, and 
quality focus of today differ from those 
of the past? What differences do these 
changes require in information systems? 
What are two advantages and two disadvantages
of the older, internally developed
legacy systems versus vendor-provided systems?

6. What challenges exist in modeling
information for patient-centered care?
What considerations are important in
designing patient-facing health
information and communication
technologies?

Learning Objectives

After reading this chapter you should know
the answers to these questions:

• What are the three core functions of
public health, and how do they influence
informatics requirements to
achieve public health goals?

• What is the difference and relationship
between public health and population
health?

• What are the differences between public
health informatics and other informatics
specialty areas?

• What are the ways populations can be
defined and how does that impact public
health informatics?

• What are the categories and specific
characteristics of informatics systems
that are typically deployed in public
health?

• What are the variations in the types of
public health information systems
needed at a local, regional, state, or
national level?

• What factors influence the use of immunization
information systems (IIS) and
how can this model apply to other areas
of the health system?

• What are some of the characteristics
and factors that allow a public health
information system to work in other
countries but not in the U.S.?


18.1 Chapter Overview 


The science and practice of Biomedical 
Informatics supports public health in its 
efforts to promote the health of populations, 
prevent disease and unhealthy exposures and 
behaviors, and protect populations exposed 
to human-caused or natural disasters. To 
optimize population health, one must address 
factors beyond the genetic and biologic make- 
up of individuals, such as the environment, 
behaviors, socio-economic status, occupation, 
access to care, and other influencers of health 
status. Although much of the variation in
health status can be attributed to the zip code
of one’s residence (Dwyer-Lindgren et al. 2017),
behaviors (e.g., smoking and physical
activity) are root determinants for the most 
common causes of death (Schroeder 2007). 
Public health measures leading to improved 
access to safe water and sanitation, nutrition, 
immunizations, and preventive care (particu- 
larly for pregnant women and children) are 
responsible for 25 of the 30 years gained in life 
expectancy in the US during the 20th Century 
(Bunker et al. 1994). Thus, effective improve- 
ment of the health status of populations 
requires the effective application of informat- 
ics strategies beyond the clinical care setting. 

In this chapter, we first briefly describe
public health science and the differences between
“population health” and “public health,” and then
explain key differences between clinical and
public health practice that influence the needs 
and requirements for informatics-related 
interventions. Next, we define “population and 
public health” informatics as the systematic 
application of informatics methods and tools 
to support public health goals and outcomes, 
regardless of the setting. Finally, we describe 
specific example systems and applications that 
illustrate key challenges and opportunities. 


18.2 What Is Public Health? 


Public health is a complex discipline focused 
on promoting and protecting the health of 
people and communities where they live, 
learn, work and play (American Public 
Health Association (APHA) 2018¹). Public
health practice is guided by social justice and 
the needs of all persons within a population, 
not simply those accessing healthcare deliv- 
ery systems. While medical care focuses on 
the detection, treatment, and management 
of injury and disease, public health practice 
and research involves a broad array of dis- 
ciplines and diverse activities with an over- 
arching emphasis on primary prevention, 
intervening at the earliest possible place in the 


1 American Public Health Association, What is public
health; Retrieval 10/12/2018: https://www.apha.org/what-is-public-health

causal chain leading to disease or disability.
Prevention activities span improved access to 
safe food, clean water, air, and sanitation, vac- 
cines, safe roadways and workplaces, and so 
forth - all in an effort to improve the health of 
communities, however defined. Public health 
achievements have been associated with major 
gains in life expectancy (CDC 1999²), and
investment in disease prevention can yield sig- 
nificant cost savings and a healthier and less 
costly life (Trust for America’s Health 2020³).
Despite these achievements, global public 
health is challenged by the increasing mobility 
of populations and ongoing threats to secu- 
rity and safe environments which can result in 
regional outbreaks becoming pandemics (e.g. 
COVID-19). 

It is useful to conceptualize public health 
in terms of three core functions: assessment, 
policy development, and assurance (Institute 
of Medicine (IOM) 1988). Assessment 
involves monitoring and tracking the health 
status of populations including identifying 
and controlling disease outbreaks. By relat- 
ing health status to a variety of demographic, 
geographic, environmental, and other factors, 
it is possible to develop and test hypotheses 
about the etiology, transmission, and risk fac- 
tors that contribute to health problems in a 
population and to develop and implement 
control strategies that contribute to improve- 
ments in population health. 

Policy development, the second core func- 
tion of public health, uses the results of 
assessment activities and etiologic research 
in concert with local resources, values and 
culture (as reflected via citizen input) to rec- 
ommend public policies and interventions 
that improve health status. For example, the 


2 Centers for Disease Control and Prevention. (1999).
Ten great public health achievements -- United
States, 1900-1999. MMWR; 48(12): 241-243.
Retrieval 10/02/2018: https://www.cdc.gov/mmwr/preview/mmwrhtml/00056796.htm

3 Trust for America’s Health. (2020). Prevention for a
healthier America: investments in disease prevention
yield significant savings, stronger communities.
Retrieval 1/25/2020: http://www.tfah.org/wp-content/uploads/2020/04/TFAH2020PublicHealthFunding.pdf

relationship between fatalities in automobile
crashes and ejection of passengers from vehi- 
cles led to recommendations, and eventually 
laws, mandating seat belt use which contrib- 
uted to a subsequent decrease in morbidity 
and mortality from automobile crashes. 

Advances in information technology 
and widespread use of the internet, includ- 
ing social media sites and on-line discussion 
forums, as well as use of mobile apps, provide 
new opportunities for public health policy 
development. Given that public health is pri- 
marily a governmental activity, it depends 
upon and is informed by the consent of those 
governed. Policy development in public health 
is (or should be) based on science, but it is also 
guided by the values, beliefs, and opinions of 
each society it serves. Public health officials 
who wish to promote certain healthy behav- 
iors, or to promulgate regulations (e.g., con- 
cerning fluoridated water, e-cigarettes, bicycle 
helmets, social distancing and so forth) would 
do well to tap into the online marketplace of 
ideas—both to understand the opinions and 
beliefs of their citizenry, and to inform and 
influence citizens to engage in those healthy 
behaviors. 

Assurance, the third core function of pub- 
lic health, refers to the duty of public health 
agencies to assure their constituents that ser- 
vices necessary to achieve agreed upon goals 
are available. The services in question (includ- 
ing medical care) may be provided directly by 
the public health agency or by encouraging or 
requiring (through regulation) other public 
or private entities to deliver the services. For 
example, in some communities, local public 
health agencies provide direct clinical care 
to underserved or at-risk populations. The 
health department in Multnomah County, 
Oregon follows this model and offers health 
care services in multiple primary care clin- 
ics, schools, community sites and in people’s 
homes. In other communities (e.g., Tacoma- 
Pierce County, Washington), local public 
health agencies have sought to minimize or 
eliminate direct clinical care services, instead 
working with and relying on community part- 
ners to provide such care. While there is great 
variation across jurisdictions, the fundamen- 
tal function is unchanged: to assure that all 
members of the community have adequate
access to needed services, especially preven- 
tive care services and testing and diagnostic 
services in the context of an outbreak such as 
COVID-19. 

The assurance function is frequently asso- 
ciated with clinical care, but also refers to 
assurance of the conditions that allow people 
to be healthy and free from avoidable threats 
to health—which include access to clean 
water, a safe food supply, responsive and 
effective public safety entities, and so forth. 

This “core functions” framework is useful 
for describing the fundamental, overarching 
responsibilities of public health. The three 
core functions are operationalized through 
a set of ten essential public health services 
(Table 18.1) (Department of Health and
Human Services (DHHS) 1994⁴). Although
there is great variation in capacity to imple- 
ment the ten services, they represent types of 
activities that public health agencies use to 
achieve their mission to assure conditions in 
which people can be healthy. 

Whether one views public health through 
the lens of the three core functions or the ten 
essential services, managing and using infor- 
mation is a fundamental activity for public 
health effectiveness. For example, assessment, 
and several of the essential public health 
services, rely heavily on public health sur- 
veillance, the ongoing collection, analysis, 
interpretation, and dissemination of data to 
guide public health actions. The data may 
concern health conditions (e.g., breast cancer, 
communicable diseases, obesity), threats to 
health (e.g., smoking prevalence, drug over- 
dose), healthcare delivery and quality (e.g., 
immunization rates or reports of health sys- 
tem quality monitoring), healthcare capacity 
(e.g., availability of immunization or medi- 
cations, emergency or intensive care services, 
or other critical needs for delivering required 
care for a population), or other events (e.g., 
births) to guide public health action. 
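
The analysis step of surveillance can be illustrated with a small Python sketch that counts case reports by condition and week and flags weeks that reach a chosen count. This is a simplified illustration, not a description of any operational system: the case reports are invented, the threshold is arbitrary, and the function names are assumptions made for this example. Real surveillance systems typically compare counts against historical baselines rather than a fixed number.

```python
from collections import Counter
from datetime import date

# Invented case reports: (report date, condition). Not real surveillance data.
REPORTS = [
    (date(2020, 3, 2), "influenza"),
    (date(2020, 3, 3), "influenza"),
    (date(2020, 3, 4), "pertussis"),
    (date(2020, 3, 9), "influenza"),
    (date(2020, 3, 10), "influenza"),
    (date(2020, 3, 11), "influenza"),
]


def weekly_counts(reports):
    """Count reports per (condition, ISO year, ISO week)."""
    counts = Counter()
    for report_date, condition in reports:
        iso = report_date.isocalendar()  # (ISO year, ISO week, weekday)
        counts[(condition, iso[0], iso[1])] += 1
    return counts


def flag_weeks(counts, threshold):
    """Return the condition-weeks whose count meets or exceeds the threshold."""
    return sorted(key for key, n in counts.items() if n >= threshold)


counts = weekly_counts(REPORTS)
flagged = flag_weeks(counts, threshold=3)  # arbitrary illustrative threshold
print(dict(counts))
print(flagged)
```

The flagged condition-weeks correspond to the dissemination step: they are the items that would be passed on to epidemiologists for interpretation and possible action.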


4 Department of Health and Human Services. (1994). 
Essential public health functions. Public Health in 
America. Retrieval 08/12/2018: http://www.health.gov/phfunctions/public.htm


Table 18.1 Ten essential services of public
health (DHHS 1994)


1. Monitor the health status of individuals in the 
community to identify community health 
problems 


2. Diagnose and investigate community health 
problems and community health hazards 


3. Inform, educate, and empower the community 
with respect to health issues 


4. Mobilize community partnerships in identifying 
and solving community health problems 


5. Develop policies and plans that support 
individual and community efforts to improve 
health 


6. Enforce laws and rules that protect the public 
health and ensure safety in accordance with those 
laws and rules 


7. Link individuals who have a need for community 
and personal health services to appropriate 
community and private providers 


8. Ensure a competent workforce for the provision 
of essential public health services 


9. Research new insights and innovate solutions to 
community health problems 


10. Evaluate the effectiveness, accessibility, and 
quality of personal and population-based 
health services in a community 


Source. Department of Health and Human Services.
(1994). Essential public health functions.
Public Health in America. Retrieved 08/12/2018:
http://www.health.gov/phfunctions/public.htm


Public health surveillance data are often 
used to define priorities for public health 
actions, either to guide a public health 
response or policy development. Surveillance 
data may serve short-term needs (e.g., to 
respond to an acute infectious disease out- 
break or pandemic such as COVID-19) or 
longer-term needs (e.g., to determine leading 
causes of premature death, injury, or disability),
and are increasingly available for
querying and visualization through state and 
federal public health web sites (e.g., data.gov). 
Surveillance data are used by epidemiologists 
and researchers and can impact public
understanding of health threats. For example, data
used to manage the COVID-19 pandemic or
data used to visualize the increasing preva- 
lence of obesity in the U.S. over time both 
contributed to the tremendous energy and 
public focus brought to bear on these problems.
Similarly, mortality data have been critical
for understanding the evolving drug overdose
epidemic in the U.S. (Seth et al. 2018). As is
often the case, though, no single data system
provides all the information required to
appropriately tailor the public health response,
particularly at a local level. For example, in addition
to mortality data, more timely and compre- 
hensive nonfatal and fatal overdose data are 
needed; therefore, other systems (such as bio- 
surveillance, syndromic surveillance systems 
or an unintentional drug overdose reporting 
system) can be used to identify overdoses 
and emerging threats in local communities or 
improve collection of toxicology data to iden- 
tify specific drugs involved (Seth et al. 2018). 

While the core functions of public health 
described in the IOM framework have not 
changed for many years, rapid advances in 
technology and sources of data are changing 
the practice of public health. For example, 
there are (a) new data sources and methods 
to assess and understand the prevalence of 
disease in communities, the impact of public 
health response actions (e.g., contact tracing or
stay at home orders associated with COVID- 
19), and the health status and determinants 
of disease in populations, (b) improved ana- 
lytical and visualization software (e.g., geo- 
graphic information systems (GIS)), and (c) 
improved ability to integrate and/or share 
health data across systems (Overhage et al. 
2008). Informatics is therefore a foundational 
science for public health practice. 


18.2.1 Public Health Versus
Population Health


The phrase “population health” is increas- 
ingly used by researchers, practitioners, and 
policymakers in health care, public health, 
and other fields (Stoto 2013). For the purpose 
of conceptualizing population versus public 
health informatics, a helpful working definition
for population health was proposed by
Kharrazi et al. (2017): 


» Population health comprises organized activi- 
ties for assessing and improving the health 
and well-being of a defined population. 
Population health is practiced by both private 
and public organizations. The target “popula- 
tion” can be a specific geographic community 
or region, or it may represent some other 
“denominator,” such as enrollees of a health 
plan, persons residing in a provider's catch- 
ment area, or an aggregation of individuals 
with special needs. The difference between 
population health and public health is subtle, 
and there is not always a full consensus on 
these definitions. That said, public health ser- 
vices are typically provided by government 
agencies and include the “core” public health 
functions of health assessment, assurance, 
and policy setting. In the United States, most 
actions of public health agencies represent 
population health, but a considerable propor- 
tion, if not the majority, of population health 
services are provided by private organiza- 
tions. (Kharrazi et al. 2017). 


Public health is typically focused on populations
defined by a specific geographic community
or region (Fig. 18.1). It may also
leverage healthcare systems to implement 
strategies that meet public health goals such 
as the reporting and management of infec- 
tious diseases or administration of vaccine. 

While “population health” means dif- 
ferent things to different groups, it is always 
based on the underlying assumption that mul- 
tiple common factors impact the health and 
well-being of specific populations, and that 
focused interventions early in the causal chain 
of disease will prevent morbidity and mortal- 
ity and may also save resources. 

In the context of health care reform, a pop- 
ulation perspective has led to increased efforts 
to incorporate social and other determinants 
of health into medical care practice. This may 
include documenting this information in the 
electronic health record (EHR) in order to 
improve clinical decision making, and to better
understand the health status of the community
served. This has also led healthcare
organizations to implement innovative strategies
to improve health and reduce costs, such
as providing housing, air conditioners, and/or
transportation.

Fig. 18.1 The definition of a
“population” varies according to
need. (Courtesy of Catherine
Staes, PhD, MPH, RN)

In the context of public health, a popula- 
tion perspective has always existed. However, 
now there are new opportunities to leverage 
data sources not traditionally used by pub- 
lic health (e.g., social media) and to use the 
information in EHRs to meet public health 
goals. For example, an EHR-based clinical 
decision support system (CDSS) and qual- 
ity monitoring can be used to promote and 
monitor public health goals (e.g., improved 
cancer screening and immunization coverage, 
or early detection of health threats such as 
lead in water or an individual exposed to an 
infectious disease) among persons accessing 
clinical care. 


18.3 Public Health Informatics 


Public health informatics was first defined as 
the “systematic application of information 
science, computer science, and technology to 
public health practice, research, and learn- 
ing” (Friede et al. 1995; Yasnoff et al. 2000). 
It is distinguished by its focus on populations 
(versus the individual), its orientation to pre- 
vention (rather than diagnosis and treatment), 
and its governmental context because it nearly 
always involves government agencies. It is a 
complex domain that is the focus of another
entire textbook in this series (Magnuson and
Dixon 2020).

[Fig. 18.1 panel text] A population can be defined as:

A specific geographic community
or region (e.g., State of Utah)

Enrollees of a health plan (e.g.,
members are a subset of residents)

Persons residing in a health
system’s catchment area (e.g.,
location of targeted population)

An aggregation of individuals with
specific conditions (e.g., cancer)

The Centers for Disease Control and 
Prevention (CDC) has characterized public 
health informatics as developing and deploy- 
ing methods for achieving a public health 
goal faster, better, or at lower cost by lever- 
aging computer science, information science, 
or technology (Savel and Foldy 2012). Public
health systems are implemented across a wide 
range of settings (large and small, urban 
and rural) with variable infrastructure and 
capabilities, and for a workforce with a wide 
range of informatics experience and skills and 
access to technical resources and support. 
Given this complex context, we define public 
health informatics as: 


» The systematic application of informatics 
methods and tools to support public health 
goals and outcomes, regardless of the setting. 


The differences between public health 
informatics and other informatics specialty 
areas parallel the contrast between public 
health and medical care itself. Public health 
focuses on the health of the community as 
opposed to that of the individual patient. In 
the medical care system, individuals with spe- 
cific diseases or conditions are the primary 
concern. In public health, the information 
and unit of analysis often relate to the community
and may, for example, include sharing
of information (such as disclosure of the disease
status of an individual) to prevent further
spread of illness or isolating individuals to
protect others. In addition, information about 
environmental and other factors (e.g., air 
and water quality, animal health, etc.) is also 
part of the public health domain. Finally, the 
focus on prevention and assessing health sta- 
tus across a population, rather than respond- 
ing to diagnosis and treatment of individuals, 
necessitates the use of standards for health 
information exchange and large-scale analysis 
of data across multiple health systems. 


18.4 Examples of Public Health 
Informatics Challenges 
and Opportunities 


This section provides a high-level overview of 
the scope and function of information systems 
that support public health practice to illus- 
trate the value (and challenges) of informat- 
ics methods and tools for optimizing public 
health outcomes. Public health practice and 
epidemiology have always been data-intensive 
endeavors; however, new opportunities to 
apply informatics-based strategies have arisen 
with the advances in computer technologies, 
increased use of EHRs and social media, and 
new techniques for mining data, delivering 
decision support, processing natural language, 
and designing system architectures that use standards to
support interoperability. When considering 
how to apply informatics to address a pub- 
lic health need, it is important to understand 
the current “business” of public health and 
the history of systems that have successfully 
achieved their goals. 

We first present an overview of the array 
of heterogeneous systems and applications 
used in the U.S. across the approximately 
3000 local and 56 state and territorial health 
departments, and at the federal public health 
level, and describe informatics opportunities 
and challenges. Second, we focus on a robust, 
nationwide public health system built on 
informatics principles: immunization regis- 
tries. Third, we describe a global public health 
informatics challenge and illustrate how successful
informatics solutions can be applied in
the context of the community/country where
they are implemented. 


18.4.1 Overview of Public Health 
Information Systems 
in the U.S. 

Context


The fundamental science of public health is 
epidemiology, which is “the study of the dis- 
tribution and determinants of health-related 
states or events in specified populations, and the 
application of this study to the control of health 
problems” (Last 2001). As a consequence, pub- 
lic health information systems may collect, use 
and report data at the level of an event, a per- 
son, or a population, and may address a broad 
spectrum of topics along the causal chain 
of factors that impact health status. Public 
health efforts focus early in the causal chain 
to affect outcomes (Stiefel and Nolan 2012). 
For example, public health information sys- 
tems may monitor exposures and risk factors, 
health events (e.g., vaccinations), persons with 
injuries or infectious or chronic disease, and 
other topics relevant for (a) understanding 
public health threats, and (b) designing and 
evaluating interventions. These information 
systems must support prevention and control 
efforts targeted at individuals and populations 
and, typically, also allow aggregate analysis to 
describe populations. For example, during the 
COVID-19 pandemic, public health authori- 
ties monitored laboratory tests and reported 
positivity rates and total tests performed, per- 
sons infected and hospitalized, and indicators 
of healthcare system capacity to respond to 
medical needs in a community. In contrast, 
nearly all medical information systems focus 
exclusively on supporting the processes of 
care for individuals. For example, EHRs and 
clinical laboratory systems are optimized so 
clinicians can quickly identify lab results for 
a specific individual, whereas public health 
practitioners want information about the 
patterns of antibiotic resistance over time 
and across multiple clinics relevant for the 
population in their jurisdiction. Public health
information systems should be optimized to
collect relevant data, visualize and recognize 
epidemiology-relevant patterns, and identify 
emerging threats. 
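
The aggregate indicators mentioned above (total tests performed and positivity rates) can be sketched in a few lines of Python. This is a minimal illustration rather than a description of any real surveillance system: the record layout, the jurisdiction names, and the function name `positivity_by_jurisdiction` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical daily laboratory reports:
# (jurisdiction, tests performed, positive results).
lab_reports = [
    ("County A", 200, 18),
    ("County A", 250, 30),
    ("County B", 120, 6),
]


def positivity_by_jurisdiction(reports):
    """Return {jurisdiction: (total_tests, positives, positivity_rate)}."""
    totals = defaultdict(lambda: [0, 0])
    for jurisdiction, tests, positives in reports:
        totals[jurisdiction][0] += tests
        totals[jurisdiction][1] += positives
    return {
        name: (tests, positives, positives / tests if tests else 0.0)
        for name, (tests, positives) in totals.items()
    }


summary = positivity_by_jurisdiction(lab_reports)
for name, (tests, positives, rate) in sorted(summary.items()):
    print(f"{name}: {positives}/{tests} positive ({rate:.1%})")
```

The point of the sketch is the shift in the unit of analysis: a clinical system would look up one person's result, whereas this calculation only makes sense across all reports for a jurisdiction.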

The “business” of public health involves a 
heterogeneous range of activities performed
in a broad range of settings, mostly by
government-based but also by private organizations.
In the U.S., public health is organized
across different levels of government, with 
each level having its own unique role. In a 
very general sense, information at the local 
level is required to take action in response 
to individual events (thus requiring personal 
identifiers) and to implement policies that 
impact populations, while information at the 
state level shifts towards monitoring events 
and supporting public health programs, and 
finally, information at the federal level focuses 
on national-level monitoring, policy develop- 
ment, funding of public health activities, and 
regulatory compliance (thus not requiring 
personal identifiers). In contrast, most other 
countries around the world have public health 
systems that are more centralized, which 
reduces barriers to information sharing and 
population-level analytics. 


Systems at different levels of government
in the U.S.

Local (City/County/Tribal) Public Health
Practice

In the U.S., there are approximately 3000 
city and county local health departments ded- 
icated to promoting and protecting the health 
of people and animals in their community, 
and additional tribal jurisdictions each with 
their own public health organizations. At the 
local level, the work of public health often 
involves direct interaction with individuals 
or businesses in the community. Local health 
departments (LHDs) collect data to serve 
the needs of individuals, e.g., to support cli- 
ent follow up in LHD clinics. It is typical for 
local health department staff (e.g., a public 
health nurse, epidemiologist, or environmen- 
tal health specialist) to investigate and gather 
identifiable person-specific information. This 
information is needed to ensure that infected 
persons are properly treated and that exposed 


individuals receive prophylaxis or follow- 
up, or that control measures are properly 
implemented particularly in settings where 
transmission may occur (e.g., in day care, long 
term care or food service settings). Unlike 
clinical systems, the systems used by LHD 
staff must support contact tracing, identifi- 
cation of exposed individuals, and manage- 
ment of isolation, follow-up testing and other 
control measures. During outbreaks of highly 
contagious infections such as measles or 
COVID-19, these systems are critical for sup- 
porting LHD activities. In addition, LHDs 
need to summarize data across their jurisdic- 
tion. Each LHD operates under the legal and 
policy framework of its respective state or
territorial health department. 

Information systems at the local public 
health level vary tremendously, depending on 
the size of the jurisdiction, funding levels, and 
the activities the local agency is required to per- 
form. For example, the scope of practice for a 
rural health department with four staff mem- 
bers is very different from the multi-thousand 
employee New York City Health Department. 
In view of this variation, it is not surprising 
that information systems also range from sim- 
ple spreadsheets to complex electronic record 
systems. The effective application of informat- 
ics principles to functions and data is currently 
often limited, but the ecosystem is advancing. 
For example, it is becoming more common for 
state-based surveillance systems to be web-based 
and support local information needs as well. For
instance, the systems to manage persons with
sexually transmitted infections, tuberculosis, 
or COVID-19 may be state-wide systems that 
allow access for local public health staff to per- 
form case management and investigations. 


State Public Health Practice 

In the U.S., there are 56 state and territo- 
rial health departments charged to carry out 
the responsibility of the laws and policies of 
their respective jurisdictions. Each state and 
territory has a health officer who usually 
reports to the governor and is tasked with 
leading the health agenda. The business of 
public health is data intensive and the systems 
currently used have evolved one by one over 
time based on needs and funding availability.
State health department information systems
are often dedicated to a particular disease
or condition such as infectious disease,
cancer, or injury. In Minnesota, for example, 
there are at least 21 such information systems 
that maintain individual level information and 
exchange information with hospitals, clinics 
and other health settings in the community 
(Table 18.2). The systems presented in the
table are common but not representative of 
all states. For example, some states have inte- 
grated communicable disease surveillance 
systems, while others continue to operate sep- 
arate systems for HIV/AIDS or TB surveil- 
lance, depending on state laws and funding. 
In addition, depending on the relationship 
between the state and its local (city/county/ 
tribal) public health agencies, the systems may 
be the same, integrated, or separate. Finally, 
public health agencies have other systems as 
well (e.g., for electronic health information 
exchange (HIE)), so this list is not compre- 
hensive, but rather illustrates the diversity 
of systems and the kinds of data that can be 
made available for national aggregation. 

The systems listed in Table 18.2 are
categorized as monitoring, workflow man- 
agement, or case/care management systems, 
based on their primary function and key fea- 
tures. Monitoring systems typically rely on 
clinical or laboratory data as their source of 
information and attempt to aggregate data 
across health systems. They are often depen- 
dent on providers to identify the records that 
need to be shared and on evolving health data 
and interoperability standards to efficiently 
and accurately share the information used to 
generate a population-level view. The events 
may differ (e.g., a birth, a ‘case’ of measles 
or COVID-19, a birth defect, any blood lead 
level result, a health claim, etc.) but the event 
detection, information summarization, and 
reporting processes are more similar than 
different. In contrast, the workflow management
systems are internally focused on scheduling
and collecting information, but may
require interactions with external systems to 
process requests or reports. While such sys- 
tems are common in many industries, the 
health and personal data managed in public 
health impose additional security and privacy
requirements. Finally, the case and care management
systems are similar in functionality
to outpatient EHRs, in which persons (either
affected by a condition or exposed to a health
threat) must receive diagnostic testing, pre- 
ventive services, treatment, and management 
to ensure ongoing monitoring and/or treat- 
ment to prevent further spread in the com- 
munity. 

Before planning to develop a new public 
health information system, one should first 
determine which of these three fundamental 
system categories is required as a prerequisite 
to deciding whether modifying an existing sys- 
tem or building/buying an entirely new one can 
most effectively address the need. Establishing 
the system type also enables the development 
team to more effectively seek out and learn 
from the experience of those who have previ- 
ously built or managed similar systems. 

The systems are often designed with special 
features for population-level analysis and context,
with multiple variables indexed, to support
sophisticated statistical and Geographic
Information System (GIS) capabilities.
The system may be optimized for retrieval
from very large (multi-million) record data- 
bases, and to quickly cross-tabulate data to 
study seasonal and secular trends and look for 
patterns by person, place, and time. 
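
As a rough illustration of cross-tabulating events by place and time, the following Python sketch counts hypothetical event records by jurisdiction and calendar month. The data, the field layout, and the function name `crosstab_place_by_month` are assumptions for the example; a production system would typically do this inside an indexed database rather than in memory.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event records: (place, date the event occurred).
events = [
    ("County A", date(2020, 1, 15)),
    ("County A", date(2020, 1, 28)),
    ("County A", date(2020, 2, 3)),
    ("County B", date(2020, 1, 9)),
]


def crosstab_place_by_month(records):
    """Count events for each (place, YYYY-MM) cell."""
    table = defaultdict(lambda: defaultdict(int))
    for place, event_date in records:
        month = event_date.strftime("%Y-%m")
        table[place][month] += 1
    # Convert nested defaultdicts to plain dicts for predictable output.
    return {place: dict(months) for place, months in table.items()}


table = crosstab_place_by_month(events)
for place in sorted(table):
    print(place, table[place])
```

Adding a "person" dimension (for example, an age group) would follow the same pattern with one more level of grouping.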


National Public Health Practice 

There are numerous federal-level public 
health agencies that directly and indirectly 
support and fund public health activities at 
the local and state level, provide direct clini- 
cal services (e.g., through the Indian Health 
Service or other federal agencies), provide 
specific services in response to critical events 
(e.g., in response to bioterrorism threats or 
events that occur offshore), perform regula- 
tory oversight, or aggregate information from 
across the U.S. to provide nationwide infor- 
mation and guidance for policy development. 
The agencies supporting the public health 
mission include the CDC, Food and
Drug Administration (FDA), Environmental 
Protection Agency (EPA), Department 
of Agriculture, Consumer Product Safety 
Commission, and Occupational Safety and 
Health Administration, among others. These
agencies may or may not have a regulatory
role, but they illustrate the diversity and complexity
of public health at the federal level.
They all require some level of cross-state
aggregation of information and typically do
not gather person-specific data.

Table 18.2 Sample of information systems used by the Minnesota State Department of Health,
classified by primary function and key features

Monitoring systems for surveillance of knowledge, attitudes, or health events that may impact health

  Sample of systems:
    Vital Statistics System tracking births and deathsᵃ
    Immunization Information Systemᵃ
    Cancer Surveillance Systemᵃ
    Birth Defects Information Systemᵃ
    Sudden Unexplained Infant Death/Sudden Death in Youth Programᵃ
    Traumatic Brain and Spinal Cord Injury Systemᵃ
    Communicable disease surveillance systems:
      Infectious Disease Surveillance
      Sexually Transmitted Infections (STD/STI) Surveillance
      AIDS/HIV Surveillance
    Drug Overdose and Substance Abuse Surveillance Systemᵃ
    Blood Lead Information Systemᵃ
    All-payer claims databaseᵃ
    Injury surveillanceᵃ
    Antibiotic Resistance and Stewardshipᵃ
    Animal health surveillanceᵃ
    Syndromic surveillanceᵃ
    Breast/Cervical Cancer Screening quality monitoringᵃ
    Environmental monitoring of air & waterᵃ
    Surveysᵇ:
      Behavioral Risk Factor Surveillance System (BRFSS)
      National Health and Nutrition Examination Survey (NHANES)

  Sample key features:
    Data collection may occur (a) in response to a triggered event,ᵃ or (b) using a
    pre-planned sampling strategyᵇ
    After a triggering event occurs, a set of data is transferred to a public health
    agency or an organization acting on the behalf of public health
    Triggered data:
      Often originates in clinical systems
      May be individual events or summarized reports
      May be reported at intervals ranging from near real-time to hourly, daily, monthly, etc.
    Often, systems must be able to associate the data with other potentially-related events.

Process Control systems to manage workflow

  Sample of systems:
    Public Health Laboratory Information System
    Women, Infants and Children (WIC)
    Food-service inspection system
    Vector control operations management system
    Vaccine distribution system
    Medical Cannabis Information System

  Sample key features:
    Internal systems with ‘industry-standard’ functionality
    May interact with external systems to receive requests and report out information

Clinical care and case management systems

  Sample of systems:
    Public Health Clinics for Targeted Services:
      Sexually transmitted disease clinic
      Immunization clinic
    Public Health Contact Management Systems:
      Sexually transmitted disease contacts
      Tuberculosis contacts
      COVID-19 contacts
    Children with Special Health Needs System
    Tuberculosis Control System
    Newborn Hearing Screening/Early Hearing Detection and Intervention
    Refugee/Immigrant Health Information System

  Sample key features:
    Person-based electronic health records to manage care of persons with specific needs
    Manage screening and follow-up of identified populations to ensure appropriate care,
    either directly provided or referred

ᵃ Data collection occurs in response to a triggered event
ᵇ Data collection occurs using a pre-planned sampling strategy

In the U.S., public health data are processed
via distributed information systems with min- 
imal aggregation at the federal level. In fact, 
it is only reportable conditions such as infec- 
tious diseases, births, and deaths that are uni- 
formly and relatively completely reported on 
a national basis by the CDC. The U.S. lacks a 
unifying public health information infrastruc- 
ture. Thus, the U.S. relies on state and local 
health departments to develop and use inde- 
pendent information systems to support their 
public health needs. This presents significant 
challenges, as described in the next section. 
In contrast, France, Great Britain, Denmark, 
Norway and Sweden have comprehensive 
systems in selected areas, such as occupa- 
tional injuries, infectious diseases, and cancer. 
No country, however, has complete reporting 
for every problem, but many are often able 
to deliver timely answers to important public 
health questions by having information on the 
entire population. 


Informatics Challenges and Opportunities

U.S. Public Health Information Infrastructure

The inadequacies of the national infrastructure
to detect and respond to public health threats
and to support the management of local, state
and national prevention and control programs
were amplified by the COVID-19 pandemic,
with its devastating morbidity and
financial impact. The current infrastructure
in the U.S. is a patchwork of diverse information
systems due, in part, to siloed and
short-term funding and to the U.S. Constitution,
which is silent on the responsibility for protecting
the public’s health, leaving state law
to govern public health actions. Federal
leadership can provide overarching structure 
and resources, consolidate information, and 
present the rationale for a unified approach. 
However, each state functions independently 
to deal with the multiplicity of response, man- 
agement and recovery decisions presented by 
health threats. 

The grand challenge for public health 
is to create and maintain a unified public 
health information infrastructure based on 
informatics principles that supports multi-jurisdictional
(local, state, federal) and cross-jurisdictional
needs, and seamlessly interacts
with other relevant information systems such
as those containing clinical, laboratory or
environmental data (see Chap. 15).

High level informatics characteristics and 
considerations for such a public health infor- 
mation infrastructure include the ability to: 
- Support both routine and emergent public
  health functions using the same
  infrastructure, avoiding the development
  of systems that compete for resources in
  the context of an outbreak.
- Rapidly scale existing surveillance systems
  to make local, statewide, and national data
  actionable in a timely manner to meet
  demands at each level in a crisis.
- Provide data and information in sufficient
  granularity to meet local, state and
  national public health needs.
- Rapidly add new information system
  capability (e.g., contact tracing for
  COVID-19 control) in the context of an outbreak.
- Routinely onboard, validate, and use new
  electronic data sources.
- Ensure high quality, timely, complete and
  accurate data.
- Incubate innovations in technology and
  foster workforce capacity.
- Operate with leadership and governance
  that is independent from jurisdictional
  boundaries and with authority to define
  requirements and engage public and
  private partners.
- Educate leaders and the public regarding
  the purpose of the infrastructure and the
  essential role of science to inform public
  policy.

Emerging Innovations and Opportunities

Advances in informatics methods and
tools, CDSS and health interoperability 
standards, and the increasing availability 
of clinical and other novel sources of data 
(e.g., individual genomic sequences, patient- 
generated health data from wearable devices), 
are all impacting the way business is done at 
public health agencies, leading to new oppor- 
tunities. Public health data collection systems 
are a potential gold mine for applying novel 
analytic or visualization tools, and advances 
in health data standards are improving oppor- 
tunities to exchange data and knowledge, 


allowing for 2-way communication between 
public health and clinical systems. Closing the 
information loop with providers by enabling 
data aggregated by public health to be used 
for clinical decision making is a long-time 
goal that is becoming more achievable over 
time. 

The development of sophisticated knowl- 
edge management and decision support meth- 
ods represents a growing opportunity to more 
effectively use public health resources and 
EHR data. For example, there are an increas- 
ing number of public health guidelines that, 
if structured and encoded as standardized 
digital algorithms, can be widely distributed 
to provide CDS at the point of care through 
EHRs (e.g., laboratory testing criteria for 
coronavirus). Similarly, there are advances in 
harmonizing and standardizing case defini- 
tions, reporting logic, and patient summaries 
that are relevant for identifying conditions of 
interest to be reported from an EHR to a public
health agency (i.e., to support electronic
case reporting or death certificate records 
reporting). These knowledge resources are a 
necessary component for automating event 
detection and the generation of electronic case 
reports but are not sufficient. Ongoing efforts 
are focused on improving data and exchange 
standards, testing and evaluating various 
implementation strategies, and increasing col- 
laboration between the multiple stakeholders 
involved (e.g., the vendor, clinical, and public 
health communities). Ultimately, successful
implementation requires 2-way communica- 
tion with providers, which is particularly rel- 
evant for case reporting where it is important 
for providers to be promptly alerted about 
unusual disease or risk factors in their local 
area, such as disease clusters, environmental 
hazards, and antibiotic resistance patterns. 
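
The idea of encoding reporting logic so that an EHR can detect a reportable event automatically can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the trigger table, the placeholder code values, and the shape of the encounter record do not come from any real terminology or standard.

```python
# Hypothetical trigger table: condition -> codes that should prompt a report.
# The code values are placeholders, not entries from a real terminology.
REPORTABLE_TRIGGERS = {
    "measles": {
        "diagnosis": {"DX-MEASLES"},
        "positive_lab": {"LAB-MEASLES-IGM"},
    },
    "pertussis": {
        "diagnosis": {"DX-PERTUSSIS"},
        "positive_lab": {"LAB-PERTUSSIS-PCR"},
    },
}


def conditions_to_report(encounter):
    """Return the reportable conditions triggered by one encounter record.

    `encounter` is an assumed dict with optional keys
    "diagnosis_codes" and "positive_lab_codes".
    """
    diagnoses = set(encounter.get("diagnosis_codes", []))
    labs = set(encounter.get("positive_lab_codes", []))
    hits = []
    for condition, triggers in REPORTABLE_TRIGGERS.items():
        # A condition is triggered by a matching diagnosis or a positive lab.
        if triggers["diagnosis"] & diagnoses or triggers["positive_lab"] & labs:
            hits.append(condition)
    return sorted(hits)
```

Keeping the trigger table as data rather than code reflects the point made in the text: the knowledge resource can be maintained and distributed centrally, while each EHR only needs the small matching routine. As the text notes, such a table is necessary but not sufficient for a working electronic case reporting pipeline.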


18.4.2 Immunization Information Systems:
A Public Health Informatics Exemplar


Immunization Information Systems (IIS), also 
known as immunization registries, are confi- 
dential, computerized, population-based sys- 
tems that collect and consolidate vaccination 
data from vaccine providers and offer tools
for designing and sustaining effective immu- 
nization strategies at the provider and pro- 
gram levels (CDC 2013). In the U.S., an IIS is 
operating in almost every state and can share 
data with other IIS using national standards. 
IIS are a critical resource for rapidly identify- 
ing and reporting gaps in immunization cov- 
erage, such as the precipitous decline in child 
vaccinations observed as the COVID-19 pan- 
demic emerged in spring 2020 and well-child 
visits transitioned to telemedicine video con-
ferences. Administration of immunizations
requires an in-person visit. Using IIS data from
Michigan, public health authorities were able 
to report that “vaccination coverage declined 
in all milestone age cohorts, except for birth- 
dose hepatitis B coverage, which is typically 
administered in the hospital setting. Among 
children aged 5 months, up-to-date status 
for all recommended vaccines declined from 
approximately two thirds of children dur- 
ing 2016-2019 (66.6%, 67.4%, 67.3%, 67.9%, 
respectively) to fewer than half (49.7%) in 
May 2020” (Bramer et al. 2020). 

The success of IIS in the U.S. is a com- 
pelling story that illustrates the principles 
of public health informatics and describes 
the multi-decade efforts that were needed to 
implement large-scale population-level IIS 
in each state and link them into a reliable 
national network. The story highlights how 
informatics capabilities in leadership, collab- 
oration, and collective problem solving have 
been critical to overall success. In addition to 
their orientation to prevention, IIS can func- 
tion properly only through continuing inter- 
action with the health care system; in fact, 
they were designed to optimize connections 
for use in the clinical setting. Although IIS 
are among the largest and most complex pub- 
lic health information systems, the successful 
implementations and extensive interoperabil- 
ity in 49 states show conclusively that it has 
been possible over time to overcome the chal- 
lenging informatics problems they present. 

The major functions of IIS include (a) 
the ability to accept immunization records 
electronically using a variety of file formats 
and messaging standards, with online secure
access to patient immunization records 24/7, 
(b) providing vaccine forecasting/decision 
support based on a patient’s consolidated 
and de-duplicated immunization history, (c) 
supporting vaccine inventory management 
and vaccine ordering, producing official 
immunization records for school and other 
institutional enrollment, (d) generating immu- 
nization coverage reports for an individual 
provider, clinical practice or jurisdiction, and 
(e) supporting the national Vaccine Adverse 
Event Reporting System (VAERS). 


History, Context and Success of IIS
Childhood immunizations are among the most 
successful public health interventions, result- 
ing in the near elimination of nine vaccine pre- 
ventable diseases that historically extracted a 
major human toll in terms of both morbidity 
and mortality (IOM 2000). The need for IIS
stems from the challenge of assuring complete 
immunization protection for the approximately 
10,388 children born each day in the U.S. in the 
context of three complicating factors: the scat- 
tering of immunization records among mul- 
tiple providers given that a complete history is 
essential to providing an accurate forecast of 
vaccines needed at a visit; new vaccines, vac-
cine combinations, and antigen formulations
regularly made available leading to a schedule 
that is increasingly complex as the number of 
vaccines has increased to over 25 doses rec- 
ommended by age 6; and the conundrum that 
the very success of mass immunization has 
reduced the incidence of disease, lulling par- 
ents and providers into a sense of complacency 
where potential vaccine side effects often get 
more attention than the diseases. 

The IIS history from the 1990s is remark-
able for a number of key success factors, includ-
ing committed leadership and a shared
vision. It is a story of a long, slow discovery
of challenges and the successful collaboration 
of many stakeholders to overcome those chal- 
lenges over several decades. Understanding 
the lessons learned is foundational to success- 
ful implementation of similar large-scale pub- 
lic health information systems in the future. 

During 1989-91, a large-scale measles epi-
demic occurred in the U.S. that highlighted the
dangers of inadequate vaccination coverage 
among preschool aged children and resulted 
in approximately 165 deaths and 60,000 cases 
of disease. In response, a scientific community 
was established in the early 1990s to focus 
on testing, building and sharing knowledge 
to advance the impact of IIS. The complex 
and challenging issues confronted included 
policy, organizational concerns, funding, and 
a rapidly changing technical landscape. The 
collaborative nature of the scientific learn-
ing community led to open sharing of suc-
cesses and failures, which in turn drove
consistent IIS advancement.

Leading the way towards IIS, the Robert
Wood Johnson Foundation (RWJF) was a 
key funder of the pioneering All Kids Count 
(AKC) program established in the 1990s. 
Under this program, an IIS learning commu-
nity was created and led by the Task Force for
Global Health based in Atlanta, Georgia. The
AKC program provided critical national lead- 
ership early in the IIS movement by distribut- 
ing and managing competitive grants to over 
a dozen cities, counties and states that funded 
IIS implementations. The AKC, CDC and oth- 
ers were also key disseminators of the rapidly 
growing body of IIS knowledge, including 
the sponsorship of multiple meetings of the 
IIS learning community. As private funding
ended, the American Immunization Registry
Association (AIRA) was founded and became a
key national focal point for the IIS community
for coordination, collaboration, and policy
development. AIRA engaged stakeholders,
vendors, and others in the private sector and
expanded the community of interest
nationwide. These efforts were funded
by CDC as well as states and local jurisdictions.

An example of successful collaboration
came from the need to exchange a high
volume of immunization information with
health providers in a timely, accurate, and
consistent manner. This led to the first public
health HL7 version 2 messaging standards and
guides for immunization information (see
Chap. 7), beginning in 1995.

As knowledge was gained from early
implementation successes and failures, the
IIS community also developed Essential
Infrastructure Functional Standards, codi-
fying years of experience in refining system
requirements (Table 18.3) (CDC 2018).
Version 4.0 of the standards identifies eight
critical functions needed for IIS implementa-
tion that are supported by 44 detailed stan-
dard requirements, which has led to greater
IIS functional uniformity nationwide.


Table 18.3 Essential IIS Infrastructure
Functional Standards v4.0 2017 (CDC 2018)

1.0 The IIS contains complete and timely
demographic and immunization data for children,
adolescents, and adults residing or immunized
within its jurisdiction.

2.0 The IIS identifies, prevents, and resolves
duplicated and fragmented patient records using an
automated process.

3.0 The IIS identifies, prevents, and resolves
duplicate vaccination events using an automated
process.

4.0 The IIS implements written and approved
confidentiality policies that protect the privacy of
individuals whose data are contained in the system.

5.0 The IIS implements comprehensive account
management policies consistent with industry
security standards.

6.0 The IIS is physically and digitally secured in
accordance with industry standards for protected
health information, security, encryption, uptime,
and disaster recovery.

7.0 The IIS supports IIS users who access and use
the IIS functions and submit or access IIS data.

8.0 The IIS exchanges data with health information
systems in accordance with current interoperability
standards endorsed by CDC for message content,
format, and transport.

Source: Centers for Disease Control and Preven-
tion. (2018). IIS functional standards, v4.0.
Atlanta, GA. Retrieved 08/31/2018:
https://www.cdc.gov/vaccines/programs/iis/functional-standards/func-stds-v4-0.html


5 Centers for Disease Control and Prevention. (2018).
IIS functional standards, v4.0. Atlanta,
GA. Retrieved 08/31/2018:
https://www.cdc.gov/vaccines/programs/iis/functional-standards/func-stds-v4-0.html

Key Informatics Issues in Immunization
Information Systems

The implementation, upgrading and manage- 
ment of IIS present challenging informatics 
issues in at least six areas: (1) Stakeholder 
collaboration and interdisciplinary commu- 
nications; (2) Legislative and policy issues 
including privacy; (3) Funding, sustainability 
and governance; (4) Data quality and moni- 
toring; (5) System design and interoperability; 
and (6) Limited prior experience with similar 
types of systems. While the specific manifesta- 
tions of these issues are unique to IIS, these six 
areas represent the typical domains that must 
be addressed and overcome in most public 
health informatics projects. Over nearly three 
decades, many factors have contributed to the 
success of an individual IIS and also to the 
network of state IIS operating as a national
system. Table 18.4 shows key informatics
factors that are important contributors to IIS 
success. 


Stakeholder Collaboration and
Interdisciplinary Communications
The organizational and collaborative issues 
involved in operating and upgrading an IIS
are challenging because of the large number 
and wide variety of users, most of whom are 
outside the IIS organization. Each of the user 
groups has distinct needs, including clinicians 
(to ensure age-appropriate vaccination), clinic 
managers (for vaccine management and order- 
ing), schools (to ensure student adherence to 
state school immunization laws), health plans 
(to measure immunization coverage among 
beneficiaries and perhaps by provider), local 
health departments (to assess immunization 
coverage in their jurisdiction and identify 
children who have fallen behind and require 
outreach), and CDC (for accountability for 
federally-funded vaccines and IIS funding).
Ensuring these diverse needs are understood, 
balanced and effectively met can be daunting 
on the typically slim governmental budgets 
on which IIS have been developed and must 
operate. 

Interdisciplinary communication is a
key challenge in any biomedical informatics
project; it is certainly not specific to pub-
lic health informatics. To be useful, a public
health information system must accurately
represent and enable the complex concepts
and processes that underlie the specific busi-
ness functions required. Information systems
represent a highly abstract and complex set of
data, processes, and interactions. This com-
plexity needs to be discussed, specified, and
understood in detail by a variety of person-
nel with little or no expertise in the terminol-
ogy and concepts of information technology.
Therefore, successful IIS implementation and
enhancements require clear communication
among public health specialists, immuniza-
tion specialists, providers, IT specialists, and
related disciplines, an effort complicated by
gaps in a shared vocabulary and differences in
the usage of common terms from the various
domains.


Table 18.4 Key informatics factors
contributing to the success of IIS nationwide

Establish a shared vision and goals and ensure
effective leadership that includes:

  Understanding the complexity of establishing a
  population-based information system and that it
  operates within a multidimensional health
  information ecosystem.

  Planning for change and reassessing how to
  accomplish goals.

  Developing an effective sustainability/business
  plan from the beginning.

  Developing a robust communications plan and
  comprehensive communications strategy.

  Focusing on immunization information as the
  primary asset; recognizing that technology is a
  means to that end.

Implement with workforce, stakeholders, and end
users in mind and include:

  Supporting a learning community to create and
  share knowledge and address common problems
  collaboratively.

  Involving stakeholders, particularly end users,
  from the beginning.

  Designating key roles for informaticians.

  Training all staff in informatics principles.

Create well-designed and effectively used
information systems and include:

  Defining the system requirements to support
  users’ needs.

  Developing and implementing according to
  standards (and creating standards if none are
  available).

  Using the registry information as early in the
  implementation process as possible (even if it’s
  not perfect).

  Establishing and using metrics to evaluate
  progress and quantify impact.

To deal with the communications chal- 
lenges, particularly between IT and public 
health specialists, it is essential to identify and 
engage a public health informatician who has 
familiarity with both information technology 
and public health, can perform system analy- 
ses, and understands business processes. The 
primary role of this informatician is to be 
able to effectively translate concepts between 
domains that use vocabularies that are unfa- 
miliar to others (e.g., between IT specialists 
and clinicians). Also, the informatician should 
have a deep understanding of the informa- 
tion processing context of both the current 
and proposed systems. It is also important 
for individuals from all the user communities 
related to the project to have representation in 
the decision-making processes. A clear under- 
standing and set of working definitions for 
common terms and terminology is essential 
for effective communication. 


Legislative and Policy Issues Including
Privacy

Legislative and policy issues are impor- 
tant aspects of the informatics challenges 
of IIS. Federal laws including the Health 
Insurance Portability and Accountability Act 
(HIPAA) are important, but state laws typi-
cally govern who has access to IIS data for
what purposes, so system design must accom- 
modate multiple levels of role-based access 
to functionality. A major issue is whether 
patient/parent consent is required before sub- 
mission of immunization data to IIS, and, if 
so, how that consent is communicated and 
managed in the system. IIS must also be able 
to record that the patient has declined to 
receive vaccines for religious or other reasons 
as defined in state law and use that informa-
tion to suppress vaccine forecasting/decision
support messages and reminder-recall notices. 
Some jurisdictions have enacted regulations 
requiring providers to submit immuniza- 
tion data to IIS. Such a regulatory approach 
to ensuring information completeness is less 
burdensome as automated electronic file sub- 
missions have largely replaced manual data 
entry. Negotiating and implementing policies 
for interstate access and data exchange with 
providers and other IIS is another example of
a key issue that can be problematic. 


Funding, Sustainability, and Governance
Funding and sustainability are continuing 
challenges for all IIS. Naturally, an important 
tool for securing funding is development of a 
persuasive business case that shows the antici- 
pated costs, benefits and value for the IIS 
investment. A substantial body of evidence 
now shows benefits, effectiveness and costs of 
IIS (Guide to Community Preventive Services 
2015). However, many of the currently opera- 
tional IIS had to develop and effectively jus- 
tify their value before such information was 
readily available. 

Specific benefits associated with IIS include 
preventing duplicate immunizations, eliminat- 
ing the necessity to review vaccination records 
for school and day care admission, efficien- 
cies in provider offices from the immediate 
availability of complete immunization history 
information, patient-specific vaccine sched- 
ule recommendations, and managing records 
during disasters (Boom et al. 2007). The care- 
ful review of the evidence on effectiveness, 
costs and benefits of specific
IIS functions may also be helpful in prioritiz-
ing system enhancement requirements.

Governance issues are also critical to suc- 
cess of implementation and enhancements to 
IIS. All the key stakeholders need to be rep- 
resented in the ongoing, open and transpar- 
ent decision-making processes, guided by a 
mutually acceptable governance mechanism. 
IIS require established rules for identify- 
ing needed enhancements, prioritizing them 
across the often-disparate needs of diverse 
user groups, and effectively managing and 
communicating the changes as they are devel- 
oped and implemented. Governance can also 
be used for establishing metrics for progress,
such as number of provider sites enrolled and 
trained, setting other priorities, and for ongo- 
ing review of confidentiality policies. 


Data Quality and Monitoring

Ensuring high quality data (including com- 
pleteness, accuracy and timeliness) is vital to 
the success of an IIS. Provider use of an IIS 
increases with the confidence that they can 
trust that an IIS query will find the patient 
needing an immunization, the immunization 
history is complete and accurately consoli- 
dated, and the forecast of the immunizations 
that are due is reliable. High quality data is 
essential to support these core functionalities 
and thereby provide benefit to all the stake- 
holders. A variety of methods are used to 
maintain high quality data, including qual- 
ity checks at the time of data acquisition and 
maintaining robust data feedback programs 
so all stakeholders can contribute to data 
quality and integrity. 

Monitoring levels of vaccine use also 
relies on high quality data. Examples include 
calculations of immunization rates across 
jurisdictions such as Medicaid, HEDIS, 
and supporting the National Immunization 
Survey (NIS). Quality data is also needed for 
outbreak support to provide rapid informa- 
tion on vulnerable populations and ongoing 
assessment of the response. In addition, local 
clinical community and small area analysis 
can help identify groups in need of commu- 


629 


nity outreach for assistance. Special studies 
on intervention trends in immunizations and 
effectiveness studies are other key uses of the 
data that require high quality data. 

Examples of challenges for data qual- 
ity include unmerged records on individuals, 
or wrongly combined records; unreported 
mobility where address and other loca-
tion information is not recorded or is wrongly
recorded; errors in data arriving from the pro-
viders or the birth records; duplicate records
and inadequately de-duplicated records on 
individuals and on vaccines reported from 
several sources; and partial reporting of vac- 
cines administered. Each of these data quality 
problems needs strong mitigation and moni- 
toring strategies. 

High data quality is the cornerstone of 
successfully reaching immunization program- 
related goals. IIS Functional Standards 
related to data quality are woven into the CDC
AIRA IIS Essential Infrastructure Functional
Standards (Table 18.3) and are reflected in
multiple goals in that document (CDC 2018). 
This underscores the importance of planning 
for and ensuring data quality in all aspects of 
access and use of IIS data and functionality. 

CDC and state IIS Directors have estab-
lished a detailed monitoring and measure-
ment system that uses about 100 measures for
tracking progress to annually assess adher-
ence to desired capabilities and standards.
For example, Fig. 18.2 shows the percent-
age point differences between NIS and IIS for
combined 7-vaccine series completion. This
type of monitoring and analysis demonstrates
that more universal use of IIS information
has the potential to replace existing, expen-
sive surveys and enable more timely data to
support users and community needs. In addi-
tion, AIRA has facilitated the development
of numerous best practice guidelines to help
ensure ongoing quality improvement, effi-
ciency and increased standardization.


Fig. 18.2 Percentage point differences between National Immunization Survey (NIS) and IIS for combined
7-vaccine series completion. IIS Annual Report, United States, 2017 (Source: CDC 2018)
[Map legend: difference > 10 percentage points, IIS > NIS (1 awardee); difference within 10 percentage
points (29 awardees); difference > 10 percentage points, NIS > IIS (24 awardees); no IIS (1 awardee);
no data (1 awardee). City awardees labeled on the map include Philadelphia, San Antonio, and Houston.]
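
The comparison summarized in Fig. 18.2 is simple arithmetic on two coverage estimates per awardee. A minimal sketch follows; the function name, the inclusive treatment of the 10-point boundary, and the coverage values used in the usage note are assumptions for illustration, not figures from the CDC report.

```python
def compare_coverage(nis_pct, iis_pct, threshold=10.0):
    """Classify an awardee by the IIS-minus-NIS coverage difference.

    Both inputs are percentages (0-100). `iis_pct` may be None when
    the awardee has no IIS estimate. A difference exactly equal to
    the threshold is treated as "within threshold" (an assumption).
    """
    if iis_pct is None:
        return "no IIS data"
    diff = iis_pct - nis_pct  # percentage points, not percent change
    if abs(diff) <= threshold:
        return "within threshold"
    return "IIS > NIS" if diff > 0 else "NIS > IIS"
```

For instance, with made-up values, `compare_coverage(70.0, 58.5)` falls in the "NIS > IIS" group because the IIS estimate is 11.5 percentage points lower, which is the pattern one would expect when an IIS is missing doses that the survey captured.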


System Design and Information
Architecture
System design and information archi- 
tecture are important factors in the suc- 
cess of IIS. Difficult design issues include 
data acquisition methods, database orga- 
nization, identification and matching of 
individuals, generating immunization rec- 
ommendations, access to data, protocols for 
electronic exchange and interoperability, and 
reports related to clinical practice and com- 
munity rates of immunization. Acquiring 
immunization data has been a challenging 
system design issue and an area of consider- 
able IIS change as EHR use has become more 
common, new adolescent and adult immuni- 
zations are added to IIS, and information is 
submitted from a broader range of settings 
like pharmacies. Within the context of busy 
pediatric and primary care practices (where 
the majority of immunizations are given), the 
data acquisition strategy must by necessity be 
extremely efficient and result in minimal addi- 
tional work for participating providers. Use 
of EHR systems can effectively support this 
strategy. Although most physician practices 
are using EHRs, only a minority have enabled 
bidirectional exchange with IIS. 

Database design must support the desired 
IIS functions and allow efficient implemen- 
tation of these capabilities. The design must 
consider operational needs for data access 
for an individual record and calculating indi- 
vidual forecasts of needed immunizations, 
and the requirements for population-based 
immunization assessment, management of 
vaccine inventory, and generating recall and 
reminder notices. One example of a partic- 
ularly important database design decision 
for IIS is whether to represent immuniza-
tion information by vaccine or by antigen.
Vaccine-based representations map each 
available preparation, including those with 
multiple antigens, into its own specific data 
element. Antigen-based representations 
translate multi-component vaccines into 
their individual antigens prior to storage. In 
some cases, it may be desirable to represent 
the immunization information both ways. 
Specific consideration of required response 
times for specific queries must also be fac- 
tored into these key design decisions. 
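
The difference between the two representations can be made concrete with a small sketch. The product-to-antigen table below is a hypothetical, simplified mapping written for this example (it is not a coded vaccine terminology), and the class and function names are likewise assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date

# Hypothetical mapping from vaccine product to its component antigens.
# Product labels and groupings are illustrative only.
VACCINE_ANTIGENS = {
    "HepB": ["hepatitis B"],
    "MMR": ["measles", "mumps", "rubella"],
    "DTaP": ["diphtheria", "tetanus", "pertussis"],
    "DTaP-HepB-IPV": ["diphtheria", "tetanus", "pertussis", "hepatitis B", "polio"],
}


@dataclass(frozen=True)
class VaccineEvent:
    """Vaccine-based record: one row per product administered."""
    patient_id: int
    vaccine: str
    given_on: date


@dataclass(frozen=True)
class AntigenEvent:
    """Antigen-based record: one row per antigen received."""
    patient_id: int
    antigen: str
    given_on: date
    source_vaccine: str  # retained so the original product is not lost


def to_antigen_events(event: VaccineEvent) -> list[AntigenEvent]:
    """Translate one vaccine-based record into antigen-based records."""
    return [
        AntigenEvent(event.patient_id, antigen, event.given_on, event.vaccine)
        for antigen in VACCINE_ANTIGENS[event.vaccine]
    ]


def doses_by_antigen(events: list[VaccineEvent]) -> dict[str, int]:
    """Count doses per antigen across a patient's vaccine-based history."""
    counts: dict[str, int] = {}
    for event in events:
        for antigen_event in to_antigen_events(event):
            counts[antigen_event.antigen] = counts.get(antigen_event.antigen, 0) + 1
    return counts
```

Storing the vaccine-based rows and deriving antigen rows on demand is one way to "represent the information both ways"; the trade-off is that forecasting queries then pay the translation cost at read time, which is where the response-time consideration in the text comes in.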

Identification and matching of individuals 
within IIS is another critical issue. Because it
is very common for an individual to receive 
immunizations from multiple providers, any 
system must be able to match information 
from multiple sources to assemble a complete 
unduplicated record of immunizations and 
retain the sources of that information. In the 
absence of a national unique patient identi- 
fier, most IIS assign an arbitrary number to 
each individual and use a matching algorithm, 
which utilizes multiple items of demographic 
information to assess the probability that two 
records are really from the same person and 
can detect duplicate reports of an immuniza- 
tion. Development of such algorithms and 
optimization of their parameters has been 
the subject of extensive investigation in the 
context of IIS, particularly with respect to de-
duplication (PHII 2006).
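
A minimal sketch of such a matching algorithm is shown below, using agreement and disagreement weights in the style of classical probabilistic record linkage. This is a swapped-in textbook technique, not the algorithm of any particular IIS; the field list, the m- and u-probabilities, and the decision thresholds are all assumed values chosen for illustration.

```python
import math

# Hypothetical per-field probabilities:
#   m = P(field agrees | records are the same person)
#   u = P(field agrees | records are different people)
FIELD_PROBS = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.02),
    "birth_date": (0.98, 0.003),
    "mother_maiden_name": (0.85, 0.01),
}


def normalize(value):
    """Case-fold and trim strings so trivial differences do not count."""
    return value.strip().lower() if isinstance(value, str) else value


def match_weight(rec_a, rec_b):
    """Sum log2 likelihood-ratio weights over the compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if normalize(a) == normalize(b):
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, upper=10.0, lower=0.0):
    """Map a weight to match / non-match / clerical review (assumed cutoffs)."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "review"
```

Tuning the thresholds trades wrongly merged records against unmerged duplicates, which is the parameter optimization problem the text refers to. Production systems typically add approximate string comparison and blocking, neither of which is shown here.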

Another critical design issue is generating 
vaccine recommendations from an individual’s 
prior immunization history, based on guid- 
ance from the CDC’s Advisory Committee on 
Immunization Practices (ACIP). As more vac- 
cines have become available, both individually 
and in various combinations, the immuniza- 
tion schedule has become increasingly com- 
plex, especially if any delays occur in receiving 
doses, an individual has a contraindication, or 
local issues require special consideration. The 
language used in the written guidelines can be
ambiguous with respect to definitions, e.g., for
ages and intervals, making implementation
of CDSS problematic. Considering that the
recommendations are updated relatively fre-
quently, maintaining software that produces
accurate immunization recommendations
is a continuing challenge. Accordingly, the
implementation, testing, and maintenance of
decision support systems to produce vaccine
recommendations has also been the subject of
extensive study (Yasnoff 2014).


6 Public Health Informatics Institute. (2006). The
Unique Records Portfolio. Decatur, GA: Public
Health Informatics Institute. Retrieved 08/29/18:
https://phii.org/resources/view/4380/unique-records-portfolio-guide-resolving-duplicate-records-health-information
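
To show why ages and intervals matter for forecasting, the following sketch evaluates a history against a three-dose series and returns the earliest date for the next dose. The series definition is hypothetical: the minimum ages and intervals are placeholders and are NOT taken from the ACIP schedule, and real decision support must also handle contraindications, maximum ages, grace periods, and product-specific rules.

```python
from datetime import date, timedelta

# Hypothetical three-dose series. Values are placeholders for illustration
# and do not reproduce any ACIP recommendation.
SERIES = [
    {"dose": 1, "min_age_days": 42, "min_interval_days": 0},
    {"dose": 2, "min_age_days": 70, "min_interval_days": 28},
    {"dose": 3, "min_age_days": 98, "min_interval_days": 28},
]


def valid_doses(birth_date, given_dates):
    """Return the administration dates that count as valid doses."""
    valid = []
    for given in sorted(given_dates):
        if len(valid) == len(SERIES):
            break  # series already complete; extra doses are ignored
        rule = SERIES[len(valid)]
        old_enough = given >= birth_date + timedelta(days=rule["min_age_days"])
        spaced = (not valid) or given >= valid[-1] + timedelta(
            days=rule["min_interval_days"]
        )
        if old_enough and spaced:
            valid.append(given)
    return valid


def forecast_next_dose(birth_date, given_dates):
    """Return (dose_number, earliest_date), or None if the series is complete."""
    valid = valid_doses(birth_date, given_dates)
    if len(valid) == len(SERIES):
        return None
    rule = SERIES[len(valid)]
    earliest = birth_date + timedelta(days=rule["min_age_days"])
    if valid:
        earliest = max(
            earliest, valid[-1] + timedelta(days=rule["min_interval_days"])
        )
    return rule["dose"], earliest
```

Note that a dose given too early is not counted, so the forecast still asks for that dose number. Keeping the schedule in a data table rather than in code is one common way to cope with frequent guideline updates, although the ambiguity of the written guidance still has to be resolved when the table is authored.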

Finally, easy access to the information in 
IIS is essential. While independent web-based 
interfaces are common, the ideal is to provide 
a seamless query launched within the context 
of the provider’s EHR workflow that returns 
IIS information and forecast to be incorpo- 
rated into the EHR. Similarly, the design 
should support efficient access to summary 
reports on immunization rates for a clinic 
or community, reports on children who are 
behind schedule and support delivery of elec- 
tronic reminder or recall notices to support 
prevention. Consumers’ direct access to their 
own immunization record is desirable; how- 
ever, there are design considerations regard- 
ing security, allowable data views and editing 
rights, so this is currently the subject of con- 
siderable experimentation and testing. 


18.4.3 Global Health Perspective 
and Opportunities 


Global health has been defined as “an area 
for study, research, and practice that places 
a priority on improving health and achiev-
ing equity in health for all people worldwide”
(Koplan et al. 2016). Improving health glob-
ally requires accurate and timely data that can
be effectively applied to education, research, 
innovation and service. Thus, global health 
work provides a vision of a future of infor- 
matics closely integrated with public health 
practice. 


Current state
Today, we live in a globally interconnected 
world where no person is more than 36 hours 
away from any other person (Friedman 2007).
Almost every corner of the world has access 
to the Internet and in most countries, even 
the poorest, many segments of those soci- 
eties have cellular and even smart phone 
access to the Internet. The rapid diffusion of
these information technologies has sparked 
increased informatics activities to promote 
health progress. 

For example, the U.S. Global AIDS
Initiative, launched in 2003 and now known
as the President’s Emergency Plan for AIDS
Relief (PEPFAR), has relied
heavily on informatics infrastructure. Today,
PEPFAR is working in more than 120 coun- 
tries and has resulted in massive investment 
in health system infrastructure: laborato- 
ries, primary care services, medical supplies, 
drug supply chains, and information sys- 
tems. These same health systems have ben- 
efitted further from funding from the Bill 
and Melinda Gates Foundation (BMGF), 
which has sparked marked progress across 
vaccine preventable diseases, neglected tropi- 
cal diseases and many other conditions that 
impact the poorest throughout the world. 
These social entrepreneurs also bring strong 
accountability to global health — they expect 
their funds to produce a specific health out- 
come. Tracking measured outcomes requires 
informatics. 

Recognizing these trends, in 2013 the 
U.S. government funded the Global Health
Security Agenda (GHSA), which promotes
a range of health development activities
across the lower income tier of countries.
The agenda is designed to improve dis-
ease surveillance, laboratory and diagnos-
tic capacity, workforce capacity, and basic
health services. The GHSA has set the stage
for widespread recognition that informatics 
is an essential tool towards health progress 
in these countries. 

It is clear from these efforts that informat- 
ics will drive health improvement in low and 
middle income countries. Consequently, most 
countries now acknowledge, as does the World 
Health Organization, that their health efforts 
must be data informed and data driven. 

Example of Informatics Innovation
Leading to Health Impact: Global
Trachoma Mapping

Trachoma is a bacterial disease that spreads
through contact with the eyes or nose of an
infected individual, shared towels or clothing,
and vectors such as flies. Repeated infections
cause eyelashes to turn inward, scratching
and scarring the cornea, leading to blind-
ness. About 1.9 million people in 42 countries
are currently blind or visually impaired as a
result. Treating the infection along with access
to clean water and sanitation leads to elimina-
tion throughout entire populations.

In 2009, the International Trachoma 
Initiative (ITI), a program of the Task Force 
for Global Health (TFGH), and their partners
recognized the need for better information 
about the prevalence of trachoma to inform 
decisions about where to implement specific 
intervention strategies such as eye surgery, 
antibiotic treatment, facial hygiene, and envi- 
ronmental sanitation (ITI 2012). In 2012, in 
partnership with ITI, the UK’s Department
for International Development announced the
three-year Global Trachoma Mapping Project 
(GTMP) to complete The Global Atlas of 
Trachoma to map global trachoma preva-
lence. Field teams used mobile devices to
record findings of trachoma infection. Data 
management software converted the house- 
hold-level findings to prevalence maps at the 
village and district levels. An electronic data 
capture and management tool called LINKS 
was adapted for GTMP. The interface was
designed to be basic enough for field staff 
to understand and secure enough to prevent 
accidental data loss or breaches. 
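
The conversion from household-level findings to village- and district-level prevalence can be illustrated with a minimal sketch. The record layout, the place names, and the function `prevalence_by` below are hypothetical and are not taken from LINKS or GTMP. The sketch computes only a crude proportion (cases divided by people examined); a real survey analysis would normally also account for the sampling design, which is omitted here.

```python
from collections import defaultdict

# Hypothetical household-level records:
# (district, village, people examined, people with signs of trachoma)
records = [
    ("District A", "Village 1", 5, 1),
    ("District A", "Village 1", 4, 0),
    ("District A", "Village 2", 6, 3),
    ("District B", "Village 3", 5, 0),
]


def prevalence_by(records, level):
    """Aggregate household counts to 'village' or 'district' level.

    Returns a dict mapping each area to its crude prevalence
    (cases / people examined).
    """
    totals = defaultdict(lambda: [0, 0])  # area -> [examined, positive]
    for district, village, examined, positive in records:
        key = (district, village) if level == "village" else district
        totals[key][0] += examined
        totals[key][1] += positive
    return {
        area: positive / examined
        for area, (examined, positive) in totals.items()
        if examined > 0
    }


village_prevalence = prevalence_by(records, "village")
district_prevalence = prevalence_by(records, "district")

for area, value in sorted(district_prevalence.items()):
    print(f"{area}: {value:.1%}")
```

Joining such per-area values to geographic boundaries is what would turn the table into a map; that step depends on the mapping software in use and is not shown.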

By 2016, more than 2.6 million people
had been surveyed for trachoma to create
prevalence maps of 29 countries, the largest
infectious disease map ever created. The
availability of high-quality data on trachoma
had widespread impact. The Atlas helped
inform the scale-up of the global trachoma
elimination program. With the availability of
prevalence data, ITI was able to make better
decisions about where to allocate antibiotic
supplies and coordinate production. That
same year, ITI reached scale when Pfizer
donated 120 million antibiotic treatments to
32 countries, bringing the cumulative program
total to 627 million treatments (Taskforce for
Global Health 2018).

7 Taskforce for Global Health. (2018). The global trachoma
mapping project: determining prevalence to
help eliminate trachoma by 2020. Retrieval 10/20/18:
https://www.taskforce.org/case-study/global-trachoma-mapping-project-determining-prevalence-help-eliminate-trachoma-2020


Emerging Opportunities and Directions
The African Union formed Africa CDC in 
2017. Rather than being disease focused, it 
is built around the key disciplines that pro- 
tect and promote the health of populations, 
i.e., informatics, surveillance, drug and medical
supply chain, diagnostics/laboratory, and
policy. As a result, a key goal is to educate a 
new cadre of informaticians throughout the 
African continent to assure that timely, accu- 
rate, and relevant data inform action across 
the spectrum of health needs. 

Ubiquitous and reliable high-speed band- 
width also now readily enables worldwide 
collaborations. The notion of global to local 
shared learning is now a reality, with com- 
munities of practice forming that involve 
members from all continents. Learning is now 
multidirectional. 

In the global health community, the lack 
of legacy institutions combined with very 
limited resources has created strong incen- 
tives for innovation. When combined with the 
insistence on specific, real-time quantitative 
results and outcomes from funding sources, 
the extensive use of informatics has been an 
inevitable consequence to ensure efficient and 
effective progress in health improvement. 


18.5 Public and Population Health 
Informatics Conclusion 


Public health informatics is the systematic 
application of informatics methods and tools 
to support public health goals and outcomes, 
regardless of the setting. Effective public 
health information systems and communication
between clinical and other systems
can help to assure prevention actions, timely 
monitoring of disease patterns, and rapid 
responses to epidemics, thereby saving lives 
and money. 

Public health informatics and the development
of health information infrastructure
(HII) (see Chap. 15) are closely related.
Public health informatics supports the popu- 
lation assessment, assurance and policy devel- 
opment roles of public health. In contrast, 
HIIs focus on medical care to individuals while
also connecting providers and patients within 
a population. Ideally, these two areas work 
together supporting both community health 
assessment and individual care. Historically,
public health and health care have not
interacted as closely as they should.
Both domains focus on the health of commu- 
nities—public health does this directly, while 
the medical care system does it one patient 
at a time. However, it is now clear that medi- 
cal care must also focus on the community 
to integrate the effective delivery of services 
across all care settings for all individuals (IOM
2011; Sittig and Singh 2020). An effective HII 
could allow many public health information 
needs currently met through independently 
operated and maintained systems to be more 
efficiently addressed via periodic HII queries 
(e.g., to assess relationships between various 
diseases, conditions, treatments, and possible 
risk factors) or through automatic real-time 
reporting of relevant information from the 
HII to public health (e.g., to support surveil- 
lance and control of COVID-19). 

Successful public health and population 
health informatics requires an informatics- 
savvy organization that has a clear vision, 
strategy, and governance for information 
management and use; a workforce skilled in 
using information and information technolo- 
gies; and well-designed and effectively used 
information systems (Baker et al. 2016). The 
information imperative is urgently driven by 
the increasing digitization of data coming 
into health departments from an increasing 
number of sources, the need for timely information
to inform increasingly complex public
health decisions, and the growing costs of
operating aging public health information sys- 
tems (Brand et al. 2018a). Information inno- 
vation to address growing needs requires an 
agency-wide informatics-savvy organizational 
approach (LaVenture et al. 2014c, 2014d; 
2017c, 2017d) and an appreciation that there 
are key stages of innovation to ensure the suc- 
cessful integration of research in the public 
health practice space (Xu et al. 2011). 

Public health systems frequently 
involve non-health organizations such as 
law enforcement and parks and recreation 
departments. Thus, public health informati- 
cians must adopt methodologies that bridge 
professional and organizational divides, such 
as the Public Health Informatics Institute’s 
Collaborative Requirements Methodology 
(PHII 2011). 

Despite the focus of many current public 
health informatics activities on population- 
based extensions of the medical care system 
(leading to the orientation of this chapter), 
applications beyond this scope are both possible
and desirable, and many innovative strategies and
applications are under development or in
use. Indeed, the phenomenal contributions
to health made by the hygienic movement of 
the nineteenth and early twentieth centuries 
suggest the power of large-scale environmen- 
tal, legislative, and social changes to promote 
human health (Rosen 1993). The effective 
application of informatics to populations 
through public health is a key challenge of the 
twenty-first century. It is a challenge we must 
accept, understand, and overcome if we want 
to create an efficient and effective health care 
system as well as truly healthy communities 
for all. 


Suggested Readings

American Immunization Registry Association
(AIRA). (2018). Fundamentals of
IIS. Retrieval Oct 24, 2018:
http://repository.immregistries.org/resource/fundamentals-of-iis/from/major-iis-topics/best-practices-2/.
A summary of fundamental understanding of
three essential IIS topics: staff, data quality,
and interoperability and HL7 basics.

Brand, B., LaVenture, M., Lipshutz, J., Stevens, 
W. F., & Baker, E. L. (2018). The information 
imperative for public health: A call to action 
to become informatics-savvy. Journal of Public 
Health Management and Practice: JPHMP, 24 
(6), 586-589. A summary of reasons for the 
urgent need for informatics and some strate- 
gies for moving organizations forward. 

Centers for Disease Control and Prevention 
(CDC). (2013). Progress in immunization 
information systems — United States, 2011. 
MMWR Morb Mortal Wkly Rep. 62(3), 48-51. 

Centers for Disease Control and Prevention.
(1999). Ten great public health 
achievements — United States, 1900-1999. 
MMWR, 48(12), 241-243.

Centers for Disease Control and Prevention. 
(2018). IIS functional standards, v4.0 Atlanta, 
GA: Centers for Disease Control and 
Prevention. 

Centers for Disease Control and Prevention. 
(2012). CDC’s Vision for Public Health 
Surveillance in the 21st Century. MMWR 
2012;61(Suppl; July 27, 2012). 

Department of Health and Human Services. 
(1994). Essential public health functions. 
Public Health in America. Washington DC: 
Department of Health and Human Services. 

Hinman, A. R., & Ross, D. A. (2010). Immunization 
registries can be building blocks for national 
health information systems. Health Affairs, 
29(4), 676-682. Describes the components of 
IIS and how they are also key elements in the 
national health infrastructure. 

International Society for Disease Surveillance (ISDS).
Stories of Surveillance in Action. Retrieval
October 25, 2018:
https://www.surveillancerepository.org/search/success-stories
and https://www.healthsurveillance.org/default.aspx.
Contains many practical examples
of how public health surveillance is used
to support clinical care and community health.

International Trachoma Initiative (ITI). (2012). 
Global Scientific Meeting on Trachomatous 
Trichiasis (TT). Decatur, GA: The Task Force
for Global Health. 


Koo, D., O’Carroll, P. W., & LaVenture, M. (2001). 
Public health 101 for informaticians. Journal of 
the American Medical Informatics Association, 
8(6), 585-597. An accessible document that 
introduces public health thinking and 
approach. 

LaVenture, M., Brand, B., & Baker, E. L. (2014a). 
Developing an informatics-savvy health 
department: I: Vision and core strategies. 
Journal of Public Health Management and
Practice: JPHMP, 20 (6), 667-669. 

LaVenture, M., Brand, B., & Baker, E. L. (2014b). 
Developing an informatics-savvy health 
department: II: Operations and tactics. 
Journal of Public Health Management and 
Practice: JPHMP, 21(1), 96-99. 

LaVenture, M., Brand, B., & Baker, E. L. (2017a). 
Developing an informatics-savvy health
department: from discrete projects to a coordi- 
nating program. Part I: Assessment and gover- 
nance. Journal of Public Health Management 
and Practice: JPHMP, 23(6), 325-327. 

LaVenture, M., Brand, B., Ross, D. A., & Baker, 
E. L. (2017b). Developing an informatics- 
savvy health department: from discrete proj- 
ects to a coordinating program. Part II: 
Creating a skilled workforce. Journal of Public 
Health Management and Practice: JPHMP, 
23(6), 638-640. 

Magnuson, J. A., & Dixon, B. E. (Eds.). (2020). 
Public health informatics and informatics sys- 
tems. New York: Springer. A comprehensive 
textbook. 

Overhage, M., Grannis, S., & McDonald, C. J. 
(2008). A comparison of the completeness 
and timeliness of automated electronic labo- 
ratory reporting and spontaneous reporting 
of notifiable conditions. American Journal of 
Public Health, 98(2), 344-350. This study documents
the improved quality and timeliness of
electronic lab reporting of diseases to public 
health. 

Public Health Informatics Institute. (2006). The 
Unique Records Portfolio. Decatur, GA: 
Public Health Informatics Institute. 

Taskforce for Global Health. (2018). The global 
trachoma mapping project: determining prev- 
alence to help eliminate trachoma by 2020. 
Decatur, GA: Taskforce for Global Health.

Trust for America’s Health. (2020). Impact of
Chronic Underfunding on America’s Public
Health System: Trends, Risks, and
Recommendations. Trust for America’s
Health.


US Department of Health and Human Services. 
(2018). Healthy people 2030 framework. 
Washington, DC: US Department of Health 
and Human Services. Retrieved at
https://www.healthypeople.gov/2020/About-Healthy-People/Development-Healthy-People-2030/Framework
(Accessed 23 Oct 2018). This is an
essential guide to the national public health 
goals. 

Yasnoff, W. A., O’Carroll, P. W., Koo, D., Linkins, 
R. W., & Kilbourne, E. M. (2000). Public 
health informatics: Improving and transform- 
ing public health in the information age. 
Journal of Public Health Management and 
Practice, 6(6), 67-75. A concise yet compre- 
hensive introduction to the field. 


Questions for Discussion

1. How might the trend of widespread 
adoption of electronic health records 
and interest in population health affect 
public health informatics? 

2. Compare and contrast the types of data 
needed and functions required in an 
information system for clinical versus 
public health information systems. 
Explain it from non-technical and tech- 
nical perspectives. 

3. How can the successful model of immu- 
nization registries be used in other 
domains of public health (be specific 
about those domains)? How might it fail 
in others? Why? 

4. A significant and increasing percentage 
of the U.S. GDP is spent on medical 
care. How could population and public 
health informatics help to use those 
monies more efficiently? Or lower the 
figure absolutely? 

5. If public health informatics (PHI) 
involves the application of information 
technology in any manner that improves 
or promotes human health, does this 
necessarily involve a human “user” that
interacts with the PHI application? For
example, could the information technology
underlying anti-lock braking systems
be considered a public health
informatics application? Provide other
examples.

6. How might cloud computing (shared
configurable computing resources
including networks, servers, storage,
applications, and services) and mobile
technology transform public health
informatics?


Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• What is mobile health (mHealth), and
how has it evolved over time?

• What are the current and potential
values and benefits of mHealth for (a)
patients and caregivers, (b) the general
public, (c) clinicians, and (d) researchers?

• What are the key features of mHealth
in supporting personal health management?

• What ethical considerations and issues
do mHealth technologies raise regarding
health disparities?


19.1 Introduction 


The field of mobile health (mHealth) focuses
on the uses of mobile technologies, such as
mobile phones and wearables, to support
both the delivery of healthcare services and
individuals’ efforts to manage their health
in everyday life. mHealth applications are
highly varied, ranging from clinical tools for
remote patient monitoring and shared decision
making to patient-centered tools intended
to help individuals better manage their health
in daily life, such as increasing physical activity
or controlling one’s blood glucose levels.
Although applications of mHealth overlap
with other areas of informatics research and
practice, such as personal health informatics
(Chap. 11) and telehealth (Chap. 20), in
the context of mHealth these applications
have a distinct flavor, largely due to the
omnipresence of mobile technology and the unique
features that this omnipresence enables. In
this chapter, we trace the history of mHealth, 
highlight key features of these technologies 
that make them uniquely suited for delivery of 
health interventions, and provide an overview 
of health services and health promotion for 
which they are used. We conclude with a brief
discussion of ethical issues raised by the rapid
growth of mHealth.


19.1.1 The Omnipresence of Mobile 


The rapid advancements and widespread 
adoption of mobile technologies have altered 
how healthcare providers practice medicine, 
how patients access healthcare and manage 
their health and well-being, and how research- 
ers design and evaluate health interventions. 
Under the label mHealth, both industry and research
sectors have been increasingly developing
devices and applications that leverage mobile
and wireless communication to deliver healthcare
services or to support people managing
their own health and well-being.

Mobile technologies are omnipresent.
Since the first handheld mobile phone, the
DynaTAC (see Fig. 19.1), was introduced in
1984 (Murphy 2013), mobile computing
and communication devices have advanced
significantly and been widely adopted. As
of 2018, 95% of U.S. adults own a mobile
phone of some kind, 77% own a smartphone,
and 53% own a tablet computer.¹ Mobile
phone ownership is widespread worldwide:
59% of people own a smartphone and a further
31% own other types of mobile devices
such as flip phones.² mHealth devices include
smartwatches and other wearables, technologies
well-suited for monitoring activity and
physiological data and for the delivery of
lifestyle change interventions. As of 2018, the
global market for wearable devices in the
healthcare sector reached over 2 billion U.S.
dollars, which is a 600% increase over the
market size in 2016.³

1 Mobile fact sheet. 2018. Pew Research Center.
Retrieval June 3, 2019:
https://www.pewinternet.org/fact-sheet/mobile/

2 Poushter J, Bishop C, Chwe H. June 2018. Social
media use continues to rise in developing countries
but plateaus across developed ones: digital divides
remain, both within and across countries. Pew
Research Center. Retrieval June 3, 2019:
https://www.pewresearch.org/global/2018/06/19/social-media-use-continues-to-rise-in-developing-countries-but-plateaus-across-developed-ones/

Fig. 19.1 A DynaTAC 8000X, the first commercially
available mobile phone, from 1984. By Redrum0486 -
http://en.wikipedia.org/wiki/File:DynaTAC8000X.jpg.
CC BY-SA 3.0,
https://commons.wikimedia.org/w/index.php?curid=6421950

Mobile devices have permeated every 
aspect of people’s lives. In a 2011 study, Dey 
and colleagues (Dey et al. 2011) found that 
people keep their mobile device in close prox- 
imity (within the same room or closer) almost 
90% of the time. With respect to usage time,
American adults spend about 5 hours/day on
mobile devices.⁴ The extensive use of mobile
devices generates massive data from mobile
sensors and usage behaviors, from which we
can infer people’s behavioral and psychological
patterns, such as sleep, activity, mood
(LikamWa et al. 2013), and even psychological
traits (Lee et al. 2014). Inferences
drawn from the data collected from mobile
devices are meaningful and reliable only as long as
people keep using the device. Therefore, many
mHealth technologies are designed to promote
high engagement and long weartime, so
that people can “stick” with the technology.
However, as smartphone overuse and addic- 
tion have become societal problems (Kwon 
et al. 2013), supporting healthy engagement 
with mHealth technology warrants careful 
consideration and future research. 
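
As a deliberately naive illustration of how a behavioral pattern such as sleep might be inferred from usage data, the sketch below treats the longest gap between consecutive phone-usage events as a candidate sleep period. This heuristic, the function `longest_idle_gap`, and the sample timestamps are assumptions made for illustration; they are not the methods used in the studies cited above. The sketch also shows why such inferences depend on continued use: if the phone is left unused for other reasons, the longest gap no longer reflects sleep.

```python
from datetime import datetime


def longest_idle_gap(event_times):
    """Return (start, end) of the longest gap between consecutive events.

    Returns None when fewer than two events are available, because no
    gap can be measured.
    """
    times = sorted(event_times)
    if len(times) < 2:
        return None
    consecutive_pairs = zip(times, times[1:])
    return max(consecutive_pairs, key=lambda pair: pair[1] - pair[0])


# Hypothetical screen-on timestamps for one evening and the next morning.
events = [
    datetime(2019, 6, 3, 21, 40),
    datetime(2019, 6, 3, 23, 15),
    datetime(2019, 6, 4, 6, 50),
    datetime(2019, 6, 4, 7, 5),
]

start, end = longest_idle_gap(events)
hours = (end - start).total_seconds() / 3600
print(f"Candidate sleep period: {start:%H:%M} to {end:%H:%M} ({hours:.1f} h)")
```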


19.1.2 Evolution of mHealth 
Technologies 


mHealth technologies have evolved in parallel 
with the advancements in mobile communica- 
tion technologies and portable computing 
devices. Although we cannot provide a detailed
account of the history of mHealth technologies
in this chapter, we highlight
key milestones and devices in the evolution of
mHealth technologies to establish a broad context
for later discussion. Specifically, we cover
personal digital assistants (PDAs), cellular
phones and smartphones, and wearables, each
of which sparked important innovations in
mHealth applications. See Table 19.1 for a
summary of representative mHealth devices.


19.1.2.1 PDAs and Cellular Phones


Developed in 1993, the Apple Newton was
marketed as the first PDA device (see
Fig. 19.2a). Early applications for Newton
were designed to assist everyday tasks such as
note-taking, calculation, and to-do list
management. The Newton was one of the first
devices to feature pen-based handwriting
recognition, which eventually became popular in
other handheld devices. Due to its technical
limitations, however, the Newton was not
widely adopted and was eventually discontinued
in 1998. The Pilot 1000, introduced in
1996 by Palm Computing, was the first widely
accepted PDA; it offered a 160 x 160 pixel
monochrome touch-sensitive display, a separate
area for pen text input (using a special
script called Graffiti), and sufficient memory
to store reference materials. Over the next
decade, the Pilot 1000 was followed by a large
number of more advanced Palm devices (e.g.,
the PalmPilot Personal, Fig. 19.2b), as well
as Windows Mobile devices (e.g., the Pocket
PC), which incorporated color displays,
higher resolution screens, and eventually,
cellular connectivity. Many medical apps and
mHealth research prototypes were designed
for these devices.

3 Statista. 2019. Projected size of the global market
for wearable devices in the healthcare sector from
2015 to 2021 (in million U.S. dollars) [Data file].
Retrieval June 3, 2019:
https://www.statista.com/statistics/607982/healthcare-wearable-device-revenue-worldwide-projection/

4 Khalaf S, Kesiraju L. March 2, 2017. U.S. consumers
time-spent on mobile crosses 5 hours a day.
Flurry Analytics Blog. Retrieval June 3, 2019:
https://flurrymobile.tumblr.com/post/157921590345/us-consumers-time-spent-on-mobile-crosses-5

Table 19.1 Representative mHealth devices and their key technical specs and supported applications

Device category: Personal Digital Assistant (PDA)
Example device (release year): Palm Pilot Personal (1997)
Key technical specs: Palm OS 2.0, 16 MHz CPU, 512 KB memory, 160 x 160 pixel monochrome display, touchscreen LCD, speaker, serial hotsync port, desktop cradle
Supported applications: Note taking (patient progress notes), calculation, to-do list, calendar, reference materials (ePocrates, 5-Minute Clinical Consult)

Device category: Cellular Phone
Example device (release year): Nokia 5500 Sport (2006)
Key technical specs: Symbian OS 9.1, 235 MHz CPU, 8 MB memory, 1.7” (208 x 208 pixel) display, camera, microphone, speaker, GSM network, Bluetooth connectivity, accelerometer
Supported applications: Sports (stopwatch, steps, calories burn) tracking, multimedia messaging, email, text to speech

Device category: Smartphone
Example device (release year): Samsung Galaxy S10 (2019)
Key technical specs: Android OS 9.0, octa-core CPU, 8 GB memory, 128 GB storage, 6.1” Quad HD+ (3040 x 1440 pixel) display, triple camera, GSM+CDMA/4G LTE network, Wi-Fi, Bluetooth connectivity, capacitive multi-touch, stereo speakers, microphone, accelerometer, barometer, gyro sensor, geomagnetic sensor, RGB light sensor, proximity sensor
Supported applications: Multiple health and fitness applications via Google Play Store

Device category: Tablet
Example device (release year): Apple iPad 7th Generation (2019)
Key technical specs: 10.2” Retina display, LED-backlit with Multi-Touch and IPS technology, 2160 x 1620 pixel display, compatible with Apple Pencil and Keyboard, cellular data network, Wi-Fi, Bluetooth connectivity, cameras, two speaker audio, fingerprint identity sensor, gyro sensor, accelerometer, ambient light sensor, barometer
Supported applications: Multiple health and fitness applications via App Store

Device category: Smartwatch
Example device (release year): (name not legible)
Key technical specs: 1.34” OLED display, Wi-Fi, Bluetooth connectivity, microphone, accelerometer, optical heart rate monitor, altimeter, vibration motor, SpO2 sensor, NFC, ambient light sensor
Supported applications: Floors climbed, activity (e.g., walk, run, swim) tracking, sleep tracking with light, deep, and REM, personalized reminders, guided breathing, 24/7 heart rate tracking, resting heart rate, calories burn

Fig. 19.2 a The Apple Newton MessagePad 2100,
running Newton OS, alongside the original iPhone
running iOS. By Blake Patterson from Alexandria, VA,
USA - Newton and iPhone: ARM and ARM, CC BY 2.0,
https://commons.wikimedia.org/w/index.php?curid=7039806.
b PalmPilot with stylus. By
Rama & Musée Bolo - Own work, CC BY-SA 2.0 fr,
https://commons.wikimedia.org/w/index.php?curid=36959631

The new modes of data entry using pen-based
gestures, paging navigation, and menu
palettes were shown to be effective in capturing
structured data in healthcare settings.
These interface characteristics were effective
in overcoming some of the limitations of
paper-based data capture, such as high capture
burden and low compliance (Lane et al.
2006). As a means to ease the data capture
burden, PDA applications were designed for
doctors to write patient progress notes (Poon
et al. 1996), for patients to capture symptoms
(Stratton et al. 1998), and for researchers to
collect sensitive health data in resource-poor
field sites (e.g., Jaspan et al. 2007).

PDAs also served as medical reference
tools for medical students and doctors (e.g.,
with applications such as ePocrates Rx,
5-Minute Clinical Consult, MedMath,
MedCalc). In a 2006 review paper on PDAs
and medical education, the authors reported
that 60-70% of medical students and residents
used PDAs for educational purposes or
patient care (e.g., patient tracking and
documentation) (Kho et al. 2006).

For researchers, PDAs were a good proto- 
typing platform to test mHealth applications. 
Intille, Kukla, Farzanfar, and Bakr (2003) 
used a standard PDA with a barcode scanner 
plug-in to create an application that helps 
people compare food items at the time of 
making food purchasing decisions. The appli- 
cation delivered tailored, motivational infor- 
mation with an aim to help people make 
“just-in-time” incremental changes to their 
diet (Intille et al. 2003). Siek et al. (2006) lev- 
eraged PDAs’ voice recording feature, as well 
as a barcode scanner, to create a food moni- 
toring tool for patients with chronic kidney 
disease. 

Traditionally, PDAs did not support
phone services. Cellular phones,
which became popular around the same time
as the early PDAs, did so by connecting to a
wireless communications network through
radio waves or satellite transmissions. Early
cellular phones were equipped with voice and
text communication capabilities, allowing
researchers to design health interventions that
deliver health messages (e.g., a text messaging
smoking cessation program) (Free et al. 2011).
As with PDAs, cellular phones became more 
powerful over time, adding more advanced 
features, such as multimedia messaging (e.g., 
sending and receiving images), text to speech, 
and motion sensors. These features were 
promptly incorporated into health applica- 
tions. For example, the Nokia 5500 Sport was 
the first mobile phone that had a built-in 3D 
accelerometer, and it came with an applica- 
tion that automatically detected running and 
walking. The phone also included a diary for 
planning and tracking workouts, as well as 
enabling users to add workouts that the phone 
did not detect automatically. 
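
To give a sense of how an accelerometer can support automatic activity detection, the following is a minimal sketch that counts steps as upward crossings of a threshold on the acceleration magnitude and then labels the activity by cadence. It is not the algorithm used on the Nokia 5500 Sport, which is not described here; the function names, the 1.2 g threshold, and the 2.5 steps-per-second cutoff are illustrative assumptions only.

```python
import math


def count_steps(samples, threshold=1.2):
    """Count upward crossings of the acceleration magnitude over a threshold.

    samples: iterable of (x, y, z) accelerations in units of g.
    threshold: illustrative value, not calibrated for any real device.
    """
    steps = 0
    above = False
    for x, y, z in samples:
        magnitude = math.sqrt(x * x + y * y + z * z)
        if magnitude > threshold and not above:
            steps += 1  # a new peak begins when the signal rises above the threshold
        above = magnitude > threshold
    return steps


def classify_activity(steps, seconds, running_cadence=2.5):
    """Label a time window by its step rate (steps per second)."""
    cadence = steps / seconds
    if cadence == 0:
        return "idle"
    return "running" if cadence >= running_cadence else "walking"


# Synthetic two-second window with three peaks along the z axis.
window = [(0, 0, 1.0), (0, 0, 1.5), (0, 0, 1.0),
          (0, 0, 1.4), (0, 0, 0.9), (0, 0, 1.6)]
steps = count_steps(window)
label = classify_activity(steps, seconds=2.0)
print(steps, label)
```

A practical detector would filter the signal and tune its thresholds per device and wearer; the sketch omits both for brevity.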


19.1.2.2 Smartphones and Tablets 

IBM and BellSouth’s Simon, which became
available to the public in 1994, is widely
considered the first “smartphone” because it
combined a phone with a touchscreen, email, and
many other capabilities.⁵ Simon did not last long,
but it presaged other smartphones, such as the 
BlackBerry and Windows Mobile. Smart- 
phones gained popularity in 2007 when Apple 
announced the first generation of the iPhone, 
which featured a large capacitive touchscreen, 
multi-touch interactions (e.g., pinching for 
zooming), and thin slate-like form factor. In 
the following year, Apple opened the App 
Store, a marketplace to distribute applica- 
tions, which became popularly known as 
“apps,” for the iPhone (and subsequently 
other iOS devices). Soon after the iPhone was
launched, in 2008, Google released a new
mobile OS platform, Android. The choice of
OS constrains which phones a user can choose and
which apps the user can run. Google’s Android
runs on a wide range of devices manufactured
by a variety of companies, whereas Apple’s
iOS runs only on Apple’s own devices.

5 Aamoth, D. August 18, 2014. First Smartphone
Turns 20: Fun Facts About Simon. TIME. Retrieval
July 7, 2019:
https://time.com/3137005/first-smartphone-ibm-simon/

6 Pothitos, A. October 31, 2016. The History of the
Smartphone. Mobile Industry Review. Retrieval July
7, 2019:
http://www.mobileindustryreview.com/2016/10/the-history-of-the-smartphone.html

Due to the broad user base and conve- 
nient marketplaces (platforms) to distribute 
apps, the iPhone and Android smartphones 
have led to a rapid growth in digital health in 
the past few years. As of 2017, more than 
325,000 health and fitness apps were available
in the major mobile app stores, with 3.6
billion downloads.⁷ As platforms evolve, so
do the mHealth apps. For example, Epocrates,
a medical reference app, has changed over
time from its earlier version for Palm OS
devices to its recent versions for iOS/Android
devices. 

Although 73% of mHealth apps in 2015
were designed to support general wellness
(for example, apps for tracking exercise
and diet and for managing stress), the
mHealth market is shifting toward
support for managing specific health conditions.
Such applications constituted 40% of
mHealth apps in 2017.⁸ Some of the key categories
of mHealth apps include symptom
checkers, healthcare professional finders,
clinical and financial records management,
health-condition education and management,
self-monitoring, remote patient monitoring,
rehabilitation programs, and
prescription filling and compliance (or adherence).
Based on a review of commercial
that mHealth apps are targeting are diabetes, 
depression, migraine, asthma, low vision, 
and hearing loss (Martinez-Pérez et al. 2013). 
In addition, apps for women’s health and 
pregnancy, and medication reminders are 
now common. 


7 Pohl M. 2017. 325,000 mobile health apps available
in 2017: Android now the leading mHealth platform.
Research2Guidance. Retrieval June 3, 2019:
https://research2guidance.com/325000-mobile-health-apps-available-in-2017/

8 The growing value of digital health: Evidence and
impact on human health and the healthcare system.
November 7, 2017. IQVIA Institute. Retrieval June 3,
2019:
https://www.iqvia.com/institute/reports/the-growing-value-of-digital-health

Fig. 19.3 A person wearing a smartwatch. By Crew -
https://pixabay.com/en/smartwatch-gadget-technology-smart-828786/,
CC0,
https://commons.wikimedia.org/w/index.php?curid=46644979


19.1.2.3 Wearable Devices 


Wearable devices include activity trackers 
(e.g., wrist-worn devices, rings, chest bands, 
belt-type, and earpieces), smartwatches (see 
Fig. 19.3), and smart clothing (e.g., shirts,
bras, socks, and pants). They track a variety 
of activities, such as walks, runs, workouts, 
smoking, and eating, as well as physiological 
processes, such as sleep, heart rate, breathing, 
and sun exposure. Activity trackers and 
smartwatches in particular have become 
increasingly popular (Choe et al. 2014; Fritz 
et al. 2014; Lupton 2014; Rooksby et al. 2014). 
These devices employ low-burden self- 
monitoring by leveraging wearable and mobile 
sensing to capture data; they commonly pro- 
vide behavioral feedback with coaching, goal- 
setting, and reminders. Such devices are 
available from a variety of commercial com- 
panies (Apple, Fitbit, Garmin, Fossil, Polar, 
etc.) and come in a number of form factors, 
from thin bands that resemble a bracelet (e.g., 
Fitbit Alta) to devices that aim to look like 
traditional watches (e.g., Garmin Fenix, and 
watches from Skagen). 

Although the adoption rate for activity
trackers and smartwatches in the US is 17.6%,
they have not been taken up equally by all age
groups, with older generations showing a
significantly lower adoption rate (4.6% for age 65
and up) than younger generations (30.8%
among 25- to 34-year-olds).⁹ In addition, the
abandonment rate for wearable devices is high;
about one third of people who own a wearable
device abandon the device after six months.¹⁰
Researchers have identified a variety of rea- 
sons why people abandon these devices. For 
instance, common reasons for abandoning 
wearable devices include difficulty in deciding 
what to do with the data and disappointment 
with the level of information the devices
provide (Lazar et al. 2015). Given that a key way
these devices try to help people maintain
and change behavior is by making the
monitored activities more salient, the benefits
of wearable devices might fade soon after
people abandon them (Klasnja et al. 2011). To
extend device wear time and sustain the
benefits of self-monitoring, these devices
should be better integrated into people’s
everyday lives in ways that foster user
engagement.

Unlike other computing equipment,
wearable devices can serve as fashion items.
For this reason, companies attend to the device’s
form factor and provide diverse customization
options. Customizability can enhance a
person’s sense of identity, which in turn is
associated with a more favorable attitude, higher
exercise intention, and a greater sense of
attachment towards the tracker (Kang et al. 2017).

Although many personalized options are 
available on the hardware side, designing 
effective and personalized behavioral feed- 
back on the software side warrants future 
research. There are growing bodies of litera- 


9 US wearable user penetration, by age, 2017 (% of
population in each group). December 19, 2016. eMarketer.
Retrieval June 4, 2019: » https://www.emarketer.com/ 
Chart/US-Wearable-User-Penetration-by-Age- 
2017-of-population-each-group/202360 

10 Endeavour Partners. April 21, 2017. Inside wear- 
ables: how the science of human behavior change 
offers the secret to long-term engagement. Medium. 
Retrieval June 13, 2019: » https://medium.com/@ 
endeavourprtnrs/inside-wearable-how-the-science- 
of-human-behavior-change-offers-the-secret-to- 
long-term-engagement-al5b3c7d4cf3 
Fig. 19.4 Glanceable visualization on a
smartwatch. Example images of the stimuli used
in the two perception studies in Blascheck,
Besancon, Bezerianos, Lee, & Isenberg (2019).
Figure text: “How quickly can we compare two
data values on a smartwatch?” Time thresholds
shown: 180-440 ms, 180-270 ms, and 560-3900 ms,
depending on the number of data items (7, 12,
and 24 were tested).


ture in the ubiquitous computing (Amini et al. 
2017; Gouveia et al. 2016) and visualization 
(Blascheck et al. 2019 (see Fig. 19.4);
Brehmer et al. 2018) communities that have 
begun to examine ways to create effective 
feedback on a small screen. More cross-disci- 
plinary collaboration on this front is needed 
to design and build effective intervention 
components for wearable devices (Lee et al. 
2018). 

Wearables have been popular subjects of 
research in healthcare domains due to their 
potential to serve as an intervention in a 
variety of contexts. See Sadoughi et al. (2020)
for a systematic mapping study of how the
“Internet of Things,” including wearables, has
been deployed and evaluated in the medical
field.


19.1.3 Current Platforms 


The smartphone user base is sharply divided 
between Google’s Android platform and 
Apple’s iOS. As of June 2018, 54.1% of
U.S. smartphone users were using a Google
Android device, and 44.5% were using an Apple
device.¹¹ The divide in market share and the
idiosyncratic characteristics of individual
phones pose problems for developers and
researchers. To create a native app (i.e., an app
that has been developed for use on a particular
platform), developers must duplicate their
efforts to create and maintain versions for each
platform. Researchers, who often lack the skills
and resources to build native mobile apps for
multiple mobile platforms, must often pick one of
the two platforms instead of supporting both,
which can introduce a selection bias in
recruitment.¹² To address such concerns, efforts have
been devoted to developing frameworks and 


11 Statista. 2019. Subscriber share held by smartphone 
operating systems in the United States from 2012 to 
2019 [Data file]. Retrieval June 13, 2019: » https:// 
www.statista.com/statistics/266572/market-share- 
held-by-smartphone-platforms-in-the-united-states/ 

12 The U.S. Mobile App Report. 2014. Retrieval Janu- 
ary 15, 2020: » https://www.comscore.com/ 
Insights/Infographics/iPhone-Users-Earn-Higher- 
Income-Engage-More-on-Apps-than-Android- 
Users?cs_edgescape_cc=US 
platforms to handle device and operating
system compatibility. For example, Apache
Cordova¹³ enables developers to build
cross-platform mobile apps with HTML5, CSS, and
JavaScript, with extensions that provide access
to some hardware features such as the camera,
GPS, and Bluetooth. Titanium¹⁴ is a mobile
development environment that allows the
creation of native apps across different mobile
devices and operating systems, while reusing a
large part of the code across apps. Such cross-
platform support enables developers of 
mHealth technologies to reach potential users 
of either Android or Apple devices. 


19.1.4 Data Access & Data 
Standards 


Accessing mHealth data could be valuable for
many stakeholders, including self-trackers
who want to gain insights about themselves
(Choe et al. 2014), researchers who want to
incorporate mHealth data in their research
(e.g., Althoff et al. 2016; Jakicic et al. 2016;
Fitabase¹⁵), and app developers who want to
integrate multiple data sources in a new
service (e.g., Exist.io,¹⁶ Gyroscope,¹⁷ and
Instant¹⁸). For some patients, accessing their
personal health data is a matter of life and 
death, as in the case of Type 1 diabetes 
patients, who have been struggling for a long 
time to have direct access to their continuous 
glucose monitor (CGM) data (Kaziunas et al. 
2018). In clinical contexts, doctors can use
data that patients collect outside the hospital to
diagnose them more accurately and monitor
them more closely (Chung et al. 2015; Kim et al.
2017; West et al. 2016; Zhu et al. 2016).
People can access the data collected
by mHealth apps either by downloading a
file (e.g., CSV, FIT, GPX, KML, or TCX)
from an app, website, or health platform
(e.g., Apple Health) or by using application
programming interfaces (APIs). However, it
is typically difficult for laypeople to
download and repurpose the data, even if they are
the ones who contributed to collecting them
(Kim et al. 2019). Although recent
regulations, such as the European General Data
Protection Regulation (GDPR),¹⁹ have the
potential to enhance personal data
accessibility, the industry lags behind the reforms.
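
As a rough illustration of what repurposing a downloaded file can involve, the following Python sketch totals step counts per day from a CSV export. The column names (`timestamp`, `steps`) and the sample rows are hypothetical; real exports differ from vendor to vendor, which is part of the difficulty described above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export; real vendors use different column names and units.
SAMPLE_EXPORT = """timestamp,steps
2019-06-01T08:00:00,1200
2019-06-01T18:30:00,3400
2019-06-02T09:15:00,2500
"""


def daily_step_totals(csv_text):
    """Sum the `steps` column per calendar day.

    The day is taken from the first 10 characters of an ISO 8601 timestamp.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        day = row["timestamp"][:10]
        totals[day] += int(row["steps"])
    return dict(totals)


print(daily_step_totals(SAMPLE_EXPORT))
```

Even this small example assumes that the user knows the file layout and the meaning of each column, which a layperson often does not.
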
Data standards are defined as “docu- 
mented agreements on representations, for- 
mats, and definitions of common data” 
(Fenton et al. 2013). They support interop- 
erability among heterogeneous systems, and 
are critical in leveraging patient-generated 
data in the current mHealth ecosystem. 
Among a few existing health platforms in the 
market (e.g., Apple Health, Google Fit, and 
Samsung Health), Apple Health and its
frameworks (HealthKit and ResearchKit) are
being widely adopted and have become the
standard interface to fitness and medical
devices.²¹ Apple’s HealthKit supports data
integration from iOS apps by allowing
third-party apps to transfer data to and from
HealthKit. However, data loss can occur
during the transfer, and HealthKit is not
compatible with Android apps, so it fails to
accommodate more than half of smartphone
users. As in the case of the Open mHealth
initiative²² and Human API,²³ some efforts have
been made to create standards for personal
health data schemas. However, the uptake of
the Open mHealth initiative and Human API
has been slow, and companies currently have no
incentive to follow these standards.
Health platform companies and health
app developers need to make a concerted
effort to establish standards for personal data
schemas to support interoperability and to
prevent data loss.


13 Apache Cordova. 2015. The Apache Software
Foundation. Retrieval June 13, 2019: » https://cordova.apache.org/

14 Titanium Mobile Development Environment. 2017.
Axway. Retrieval June 13, 2019: » https://www.appcelerator.com/Titanium/

15 Fitabase. 2019. Retrieval June 11, 2019: » https://www.fitabase.com/

16 Exist. 2019. Hello Code. Retrieval June 11, 2019:
» https://exist.io/

17 Gyroscope. 2019. Gyroscope Innovations, Inc.
Retrieval June 11, 2019: » https://gyrosco.pe/

18 Instant. 2015. Retrieval June 11, 2019: » https://instantapp.today

19 European Union General Data Protection Regulation.
Retrieval September 9, 2019: » https://ec.europa.eu/commission/priorities/justice-and-fundamental-rights/data-protection/2018-reform-eu-data-protection-rules_en

20 Data Standards. 2019. Public Health Data Standards
Consortium. Retrieval June 13, 2019:
» http://www.phdsc.org/standards/health-information/d_standards.asp

21 Apple is going after the healthcare industry, starting
with personal health data. January 8, 2019. CB
Insights. Retrieval June 13, 2019: » https://www.cbinsights.com/research/apple-healthcare-strategy-apps/

On the clinical informatics side, app
developers have been building third-party clinical
apps (some of which are mHealth apps)
that can access patients’ data directly from
the EHR by leveraging SMART on FHIR,²⁴ an
open, standards-based platform for medical
apps. Through SMART on FHIR, healthcare
organizations can plug third-party medical
apps that use standard data types into their
EHR, which creates immense opportunities
for clinicians and patients to use EHR
data for diverse purposes (e.g., patient
engagement, disease management, and research) on
mobile devices.
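
To give a sense of what the "standard data types" look like to an app developer, the sketch below reads heart rate values out of a FHIR Bundle of Observation resources. The bundle is a hand-written sample with made-up values, and the function name is hypothetical. In a real SMART on FHIR app, the bundle would come from the EHR's FHIR server after an authorization step that is omitted here.

```python
import json

# Hand-written sample shaped like a FHIR "searchset" Bundle; values are made up.
SAMPLE_BUNDLE = json.dumps({
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {
            "resourceType": "Observation",
            "status": "final",
            "code": {"coding": [{"system": "http://loinc.org",
                                 "code": "8867-4",
                                 "display": "Heart rate"}]},
            "effectiveDateTime": "2019-06-13T08:30:00Z",
            "valueQuantity": {"value": 62, "unit": "beats/minute"},
        }},
        {"resource": {"resourceType": "Patient", "id": "example"}},
    ],
})


def extract_observations(bundle_json, loinc_code):
    """Return (time, value, unit) tuples for Observations with the given LOINC code."""
    bundle = json.loads(bundle_json)
    results = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Observation":
            continue  # skip Patient and other resource types
        codes = {c.get("code") for c in resource.get("code", {}).get("coding", [])}
        if loinc_code in codes and "valueQuantity" in resource:
            quantity = resource["valueQuantity"]
            results.append((resource.get("effectiveDateTime"),
                            quantity["value"],
                            quantity.get("unit")))
    return results


print(extract_observations(SAMPLE_BUNDLE, "8867-4"))
```

Because the resource structure and the coding system are standardized, the same parsing logic can in principle be reused across EHRs that expose FHIR.
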


19.2 Key Features of mHealth 
Technologies 


A great deal of the utility of mobile technology
for health applications comes from the ways 
in which these tools can collect data about 
individuals’ health status, behavior, and con- 
text. Unique to mobile technology is its ability 
to collect data passively via various types of 
sensors embedded in mobile phones and 
wearable devices. With the number of mobile 
sensors and the quality of the inference 
increasing at a rapid rate, mobile health tools 
are able to detect a broad range of behaviors 
and states continuously and in the back- 
ground, with minimal user interaction, 


22 Open mHealth. 2015. The Tides Center. Retrieval
June 13, 2019: » http://www.openmhealth.org/

23 Human API. 2019. Retrieval June 13, 2019:
» https://www.humanapi.co/

24 SMART Health IT. Retrieval January 15, 2020:
» https://smarthealthit.org/


enabling collection of rich, longitudinal data 
that can be used for assessment and treat- 
ment. When self-report is needed, mobile 
tools offer ways to collect self-report data in 
context and with low burden, greatly increas- 
ing the ecological validity of user-provided 
information. 


19.2.1 Sensing to Collect Data 


A key feature of many mHealth applications 
is that they make use of information gener- 
ated via sensors contained in mobile phones 
or worn on the body. Modern mobile phones 
contain a large number of sensors, including 
GPS, accelerometers, gyroscopes, cameras, 
and microphones. In recent years, the use of 
on-body sensors that continuously monitor 
individuals’ activities and states has exploded. 
Examples include wrist-worn fitness trackers 
as well as a large variety of specialty sensors 
such as those that detect galvanic skin 
response, oxygen saturation, blood glucose, 
heart rate variability, core body temperature, 
and blood pressure. 

The data from such on-body sensors are
typically transmitted to a mobile phone,
where, along with data collected on the phone
itself, they are used to provide feedback to users or
are made available to third-party applications.
Sensor data are used for three basic purposes:
(1) for assessing physiological processes, such
as resting heart rate or blood glucose; (2) for
inferring individuals’ activities, such as
physical activity, eating, and sleep; and (3) for
inferring context, such as location and social
environment.


19.2.1.1 Assessing Physiological Processes

An important use of sensing in mHealth tech- 
nologies is to assess physiological processes 
and states that are important for supporting 
health management. Such sensing is usually 
done through devices that users wear on their 
bodies (e.g., bands worn on the wrist, or 
instrumented adhesive patches worn on the 
torso) or even through their phone. Regardless 
of the sensor form factor, physiological data
obtained through mobile sensing are typically
transmitted to a mobile phone via a low-power
radio (e.g., Bluetooth), where they can be
used to drive intervention delivery through an
mHealth app or uploaded to remote servers
for monitoring by the healthcare team.

The sensors most commonly used to
monitor physiological processes are found in
wrist-worn wearables. The exact set of sensors
(and, thus, what they are able to detect) varies
by device, but the main physiological sensor
found in such devices is the optical heart-rate
sensor, which uses green and orange
light-emitting diodes (LEDs) and a photodetector
to detect the pulse waveform (Alexander et al.
1989). Recently, heart rate sensors have begun
to add pulse oximetry functionality as well, by 
incorporating a red LED, and, as of 2018, 
these more advanced heart rate sensors are 
beginning to show up even in devices in the 
$150 range, such as Fitbit Charge 3 and 
Garmin Vivosmart 4, which represent a large 
segment of the wearables market. 
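
The step from a pulse waveform to a heart rate can be sketched in a few lines of Python. The example below is illustrative only: it generates a synthetic, noise-free waveform and counts local maxima, whereas commercial devices must also filter the signal and reject motion artifacts. The function names, sampling rate, and threshold are assumptions made for this sketch.

```python
import math


def simulate_ppg(bpm, seconds, fs):
    """Synthetic pulse waveform: a sine wave at the heart-rate frequency."""
    freq_hz = bpm / 60.0
    return [math.sin(2 * math.pi * freq_hz * (i / fs))
            for i in range(int(seconds * fs))]


def estimate_heart_rate(signal, fs, threshold=0.5):
    """Estimate beats per minute from the mean spacing of waveform peaks."""
    peaks = [i for i in range(1, len(signal) - 1)
             if signal[i] > threshold
             and signal[i - 1] < signal[i] >= signal[i + 1]]
    if len(peaks) < 2:
        return None  # not enough beats to measure an interval
    mean_interval_s = (peaks[-1] - peaks[0]) / (len(peaks) - 1) / fs
    return 60.0 / mean_interval_s


ppg = simulate_ppg(bpm=72, seconds=10, fs=50)
print(round(estimate_heart_rate(ppg, fs=50)))
```
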

The chief physiological metrics sensed by 
heart rate sensors are momentary heart rate 
and resting heart rate. Combining momen- 
tary heart rate data with physical activity data 
(sensed via an accelerometer), wrist-worn 
devices also try to estimate energy expendi- 
ture, although the quality of these calculations 
is variable (see Consolvo et al. 2014 for exam- 
ples of design implications of this variability 
in inference). Certain wristbands—notably, 
recent devices made by Garmin—attempt to 
characterize users’ stress levels, which can be 
estimated from the inter-beat interval data 
obtained from optical heart rate sensors 
(Hovsepian et al. 2015). Finally, wrist-based 
devices with pulse oximeters are able to detect 
blood oxygen saturation, although the current 
generation of devices do not provide continu- 
ous monitoring of this metric. 
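
As a simple illustration of how inter-beat intervals can be summarized, the sketch below computes the mean heart rate and RMSSD (the root mean square of successive differences), a widely used heart rate variability statistic. This is not the algorithm used by any particular vendor or by the cited work; stress models typically combine several such features. The interval values are made up.

```python
import math


def mean_heart_rate(ibi_ms):
    """Mean heart rate in beats per minute from inter-beat intervals in ms."""
    return 60000.0 / (sum(ibi_ms) / len(ibi_ms))


def rmssd(ibi_ms):
    """Root mean square of successive differences between intervals (ms)."""
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


intervals = [800, 810, 790, 805, 795]  # made-up inter-beat intervals in ms
print(mean_heart_rate(intervals), round(rmssd(intervals), 2))
```
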

Although wristbands are currently the most 
common form factor for commercial wearable 
sensors, the range of physiological data that 
can be sensed unobtrusively at the wrist is lim- 
ited. As such, several other form factors exist, 
including smart rings (e.g., Moodmetric and
Motiv), instrumented clothing (e.g., Hexoskin, 
OMsignal, and Skiin), and adhesive tags that 
can be attached to clothing (Spire). Such 
devices are able to detect other physiological 
metrics beyond heart rate, including
respiration, body temperature, and electrodermal
activity, which enables, among other things, more
robust assessment of stress (e.g., Moodmetric 
and Spire). 

In addition to consumer-oriented com- 
mercial devices, physiological sensing using 
wearable sensors is a thriving research area. 
The current crop of commercial devices
described above is based on decades of
sensing research, with the research on optical
pulse oximetry dating back to the 1980s (see 
Alexander et al. 1989 for a brief review), and 
stress detection drawing on ten years of work 
on multimodal sensing (Ertin et al. 2011; 
Hovsepian et al. 2015). More recently, two key 
topics in mobile physiological sensing research 
have been cuff-less, continuous measurement 
of blood pressure and noninvasive measure- 
ment of blood glucose. 

Over the last few years, there has been a 
quickly growing body of research aimed at 
developing unobtrusive methods for blood 
pressure (BP) measurement. Much of this 
work attempts to repurpose optical heart rate 
sensors already present in the wrist-based 
activity trackers to estimate blood pressure 
from photoplethysmography (PPG) measure- 
ments. Recent work (Zhang and Feng 2017; 
Patil et al. 2017) has shown that machine 
learning techniques can be used to estimate 
both systolic and diastolic BP from PPG sig- 
nals with around 90% accuracy. Similarly, 
Carek et al. (2017) have used the accelerome- 
ter and optical heart rate sensor in a smart 
watch to estimate BP from the pulse transit 
time, with similar accuracy. Although these 
accuracy rates are still too low for widespread 
clinical applications, this line of work is 
quickly developing, and accurate continuous 
BP monitoring is likely not too distant. 
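
The idea behind pulse transit time (PTT) methods can be sketched as a calibration problem: a shorter transit time generally corresponds to a higher blood pressure, so a simple model can be fitted to a few cuff readings and then applied to new PTT values. The inverse-linear model below (BP = a / PTT + b) is one commonly used simplification and is an assumption of this sketch, not the method of the studies cited above. The calibration numbers are invented.

```python
def fit_inverse_ptt_model(ptt_s, bp_mmhg):
    """Least-squares fit of BP = a * (1 / PTT) + b from cuff-calibrated pairs."""
    xs = [1.0 / p for p in ptt_s]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(bp_mmhg) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, bp_mmhg))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def estimate_bp(ptt_s, a, b):
    """Apply the fitted model to a new pulse transit time (in seconds)."""
    return a / ptt_s + b


# Invented calibration data: transit times (s) paired with cuff systolic BP (mmHg).
a, b = fit_inverse_ptt_model([0.20, 0.25, 0.30], [140.0, 120.0, 106.0])
print(round(estimate_bp(0.22, a, b), 1))
```

A per-person calibration of this kind drifts over time, which is one reason such estimates are not yet suitable for clinical use.
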

The other key target for physiological 
sensing in the context of mHealth has been 
blood glucose measurement. Traditionally, 
blood glucose could only be measured from 
blood samples, typically obtained from a fin- 
ger prick or, in the case of continuous glucose 
monitors, from a needle semi-permanently 
inserted into the subcutaneous tissue. In recent 
years, both finger prick-based glucose
monitors and CGMs have started to
incorporate Bluetooth connectivity, enabling glucose
readings to be automatically uploaded to and 
logged in a smartphone app, greatly facilitat- 
ing keeping of accurate logs and provision of 
data-driven self-management coaching. For 
instance, BlueStar (www.welldoc.com),
an FDA-approved app for diabetes
self-management, uses patients’ glucose readings
uploaded from a connected glucometer to
provide just-in-time advice on specific actions
patients can take to keep their glucose levels
in the prescribed range. An early randomized
clinical trial of BlueStar showed a 2% HbA1c
improvement in the BlueStar arm compared
to a 0.68% improvement in the control
condition (Quinn et al. 2008). As with BP
measurement, recent research has focused on making
blood glucose measurement more convenient
and less invasive. Researchers have explored
multiple approaches to detecting glucose
levels non-invasively, including optical
approaches, which shine light into the skin to
detect changes in glucose concentration in the
blood; chemical approaches, which detect
glucose levels in saliva; and electrochemical
approaches, which detect glucose via a smart
contact lens (see Eadie and Steele 2017 for a
review). A number of systems that use
these approaches are currently undergoing
clinical trials.

Finally, many research projects aim to use 
sensors found in mobile phones to enable 
detection of physiological processes that have 
traditionally required expensive, medical- 
grade equipment. Among other metrics,
projects in this category have used cell phone sensors to
detect lung function via the phone micro- 
phone (Larson et al. 2012), hemoglobin con- 
centration in the blood using the phone 
camera (Wang et al. 2016), intraocular pres- 
sure from the video captured on a smartphone 
(Mariakakis et al. 2016), and blood alcohol 
level from interaction patterns with app user 
interfaces (Mariakakis et al. 2018). The pur- 
pose of projects such as those listed above is 
to make expensive medical tests that require 
specialized equipment cheaper and more 
accessible to decrease health disparities and 
increase access to care. 


19.2.1.2 Inferring Activities 


A key role of sensing in the context of mobile 
health is detection of individuals’ activities 
and states. The most common application of 
sensing in this domain has been detection of 
physical activity. Accelerometers and
gyroscopes are found not only in wearable fitness
trackers but also in smartphones, as well as in a
growing segment of “hybrid” watches (standard
quartz or mechanical timepieces that
contain a small number of sensors and the ability
to receive phone notifications). Accelerometers
in all these devices continuously assess
users’ movements to determine how active
they are. The chief metric derived from these
data is the number of steps that an individual
has taken, but many devices and mobile apps
also combine accelerometer data with heart 
rate data to estimate the number of active 
minutes, usually calculated to correspond to 
minutes of moderate to vigorous physical 
activity (MVPA), a metric that is commonly 
used in physical activity guidelines (e.g., Piercy 
et al. 2018). Additionally, more advanced
wearables are increasingly attempting to
detect exactly which physical activity a user is
performing (e.g., biking, swimming, or using an
elliptical machine) to help users keep more accurate exercise
logs and improve calorie expenditure estima- 
tion. As with many other areas of mobile 
health, detection of specific physical activities 
was pioneered in the research setting 
(Choudhury et al. 2008), but its accuracy has 
only recently become sufficiently high to make 
it feasible for broad inclusion in commercial 
devices. 
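
How accelerometer and heart rate data might be combined into an "active minutes" count can be sketched as follows. The thresholds are illustrative assumptions: a cadence of about 100 steps per minute and a heart rate of about 64% of an age-predicted maximum (220 minus age) are commonly cited rules of thumb for moderate intensity, but vendors do not publish their exact rules, and the function name is hypothetical.

```python
def active_minutes(per_minute, age, cadence_threshold=100, hr_fraction=0.64):
    """Count minutes that look like moderate-to-vigorous activity.

    per_minute: list of (steps, heart_rate_bpm) tuples, one per minute.
    A minute counts as active if either the step cadence or the heart rate
    reaches its (assumed) threshold.
    """
    hr_cutoff = hr_fraction * (220 - age)
    return sum(1 for steps, hr in per_minute
               if steps >= cadence_threshold or hr >= hr_cutoff)


# Made-up data for a 40-year-old: (steps in that minute, heart rate).
minutes = [(0, 60), (110, 100), (20, 120), (90, 110), (105, 130)]
print(active_minutes(minutes, age=40))
```

Using heart rate in addition to steps lets the count include activities such as cycling, in which the wrist moves little.
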

Another common activity detected via 
mobile devices is sleep. Most wrist-based 
activity trackers (from Fitbit, Garmin, etc.) 
track sleep using accelerometry, and many of 
them try to categorize different stages of sleep 
as well, providing users with summaries of the 
amounts of time spent in deep, light, and 
REM sleep. A recent systematic review of the
reliability and validity of consumer activity
trackers found that these devices typically
overestimate total sleep time and sleep
efficiency when compared to polysomnography
but underestimate wake after sleep onset
(Everson et al. 2015). Given that these devices
estimate sleep duration using wrist-worn
accelerometers, part of the problem appears
to be that evening activities that involve little
movement (e.g., reading or watching a movie)
can be mistaken for sleep, leading to the
overestimation of total sleep time.
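
The overestimation problem can be reproduced with a toy classifier. The sketch below labels an epoch as sleep when it belongs to a sufficiently long run of low-movement epochs. It is a deliberately simplified stand-in for accelerometry-based sleep scoring, and the threshold, run length, and activity counts are invented for illustration.

```python
def classify_sleep(activity_counts, threshold=5, min_run=3):
    """Label each epoch 'sleep' or 'wake' from per-epoch movement counts.

    An epoch is 'sleep' if it lies in a run of at least `min_run`
    consecutive epochs whose movement count is at or below `threshold`.
    """
    counts = list(activity_counts)
    labels = ["wake"] * len(counts)
    run_start = None
    # A sentinel above the threshold closes a run that reaches the end.
    for i, count in enumerate(counts + [threshold + 1]):
        if count <= threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                for j in range(run_start, i):
                    labels[j] = "sleep"
            run_start = None
    return labels


# Epochs 1-4: reading in bed (still but awake); epochs 6-9: actually asleep.
counts = [40, 2, 1, 0, 3, 50, 1, 0, 0, 0]
labels = classify_sleep(counts)
print(labels.count("sleep"))
```

Both the quiet reading period and the true sleep period are labeled as sleep, so the estimated total sleep time is twice the actual value in this example.
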

An important focus of recent work has 
been automated detection of eating. Diet track- 
ing has been consistently shown to be one of 
the most effective self-management strategies 
for weight loss (Michie et al. 2009; Webb et al. 
2010), and it is of great importance to a range 
of epidemiological research. Yet, consistent 
manual diet tracking is notoriously difficult to 
achieve over extended periods of time. For this 
reason, a great deal of recent research has 
focused on trying to automate diet tracking in 
various ways. mHealth researchers have taken a 
number of approaches to address this problem. 
One approach has been to simplify tracking by 
enabling users to take pictures of their meals 
using the phone’s camera. To extract informa- 
tion about what was eaten, the photo-based 
approach relies either on computer vision (e.g., 
Kitamura et al. 2010), or on crowdsourcing, 
where the food images are analyzed through a 
sequence of tasks assigned to workers on 
Amazon Mechanical Turk (Noronha et al. 
2011). The photo-based approach still requires 
active user engagement, but the user burden is 
reduced compared to manual logging, albeit at 
a higher financial cost (for crowdsourcing) or at 
the cost of lower data accuracy (for computer 
vision-based approaches). 

As an alternative to photo-logging of food 
intake, recent research has attempted to facili- 
tate diet tracking by automating detection of 
eating episodes. To do so, researchers have 
used a number of sensing approaches, includ- 
ing using microphones to detect the sound of 
chewing and swallowing (Alshurafa et al. 
2014; Makeyev et al. 2012), using wrist-based 
accelerometers to detect hand movements 
indicative of eating or drinking (Amft et al. 
2005; Kyritsys et al. 2017), and using a small 
neck-worn camera to detect when food is 
brought to the mouth (Sun et al. 2014; Chen 
et al. 2013). Unlike photo-logging, which aims 
to produce the same kind of data as tradi- 
tional manual diet tracking (a log of the foods 
eaten with corresponding micro- and macronutrient
information), these approaches focus on
detecting the activity of eating, rather than 
detecting what is eaten. As such, detection of 
eating currently does not result in accurate 
nutrient information, although it can provide 
some information about the types of food 
eaten (fresh fruit and vegetables are crunchier 
and require more chewing than, say, a ham- 
burger), as well as the amount (by virtue of 
detecting the total number of eating episodes 
and their duration). Even with this limitation, 
automatic eating detection can be very useful 
for a range of mobile health applications, 
from supporting independent living of the 
elderly by verifying they are eating regularly, 
to interventions for weight management that 
attempt to decrease snacking and mindless 
eating. 

mHealth systems have used sensor data to 
detect a range of other activities, including 
medication adherence via instrumented pill 
bottles (Hayes et al. 2006; Abbey et al. 2012), 
falls (Dai et al. 2010; Fang et al. 2012),
smoking (Ali et al. 2012; Parate et al. 2014), and
activities of daily living, such as washing and 
cooking. The latter category is particularly 
relevant for supporting independent living, 
but detecting such activities with a high level 
of precision often involves combining mobile 
devices with instrumenting the environment, 
such as adding RFID tags to common objects 
like pots (Buettner et al. 2009), or using cam- 
eras (Liu et al. 2014). 


19.2.1.3 Inferring Context 

In addition to detecting physiological
processes and user activities, mobile devices are
commonly used to detect the user’s context to
tailor intervention delivery to the user’s current
situation. The key contextual variable used by
mHealth apps is location. Both iOS and
Android contain robust location capabilities
that leverage GPS, Wi-Fi, and cell tower
information to determine the user’s location in a
battery-efficient way and enable third-party
applications to obtain location information 
even while the application is in the back- 
ground (i.e., the phone screen is off or the user 
is interacting with another application). These 
capabilities enable mHealth apps to identify 
nearby resources (e.g., healthy restaurants or 
emergency rooms), and to provide
intervention content that is appropriate to the user’s
current context. For instance, HeartSteps, an
mHealth physical activity intervention, sends
users activity suggestions that are tailored to
their location, weather, time of day, and day
of the week (Klasnja et al. 2018).

Similarly, MyBehavior (Rabbi et al. 2015), 
another physical activity intervention, tracks 
users’ daily movements and then, based on the 
current location, provides recommendations 
for walking routes a user can take to increase 
their steps, balancing the user’s step goal with 
the suggested activity’s feasibility (e.g., how 
much time it would take). In the addiction 
space, location is often used to determine 
high-risk situations (e.g., proximity to a bar) 
to provide just-in-time support, such as cop- 
ing strategies (e.g., see Gustafson et al. 2014). 
Such just-in-time adaptive interventions 
(JITAIs), enabled by knowledge of the indi- 
vidual’s current context and activity, hold 
great promise for achieving goals of precision 
behavioral health by providing support only 
when it’s most likely to be effective and when 
individuals are receptive to it (Nahum-Shani 
et al. 2015; Spruijt-Metz and Nilsen 2014). 
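
A location-based trigger of this kind reduces, at its simplest, to a distance check against a list of places. The sketch below uses the haversine formula to decide whether a user is within a given radius of any high-risk location. The coordinates, radius, and function names are hypothetical, and a real JITAI would combine location with other signals before deciding whether to deliver support.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def in_high_risk_zone(user, zones, radius_m=100.0):
    """True if the user's (lat, lon) is within radius_m of any zone center."""
    return any(haversine_m(user[0], user[1], z[0], z[1]) <= radius_m
               for z in zones)


# Hypothetical coordinates: one flagged location roughly 55 m north of the user.
user_location = (47.6000, -122.3300)
flagged_places = [(47.6005, -122.3300)]
print(in_high_risk_zone(user_location, flagged_places))
```
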

Other contextual variables that can be pas- 
sively sensed by current smartphones include 
weather (by lookup via the user’s location), 
the user’s calendar (e.g., free/busy blocks), 
ambient noise levels, and proximity to other 
people. A number of research efforts (Kanhere
2011; Devarakonda et al. 2013) have
examined incorporating chemical sensors into
the phone to allow sensing of pollution
levels. Although such pollution levels can
currently be approximated by looking up pol- 
lution data based on the user’s location, sens- 
ing on the phone could enable more timely and 
potentially more effective just-in-time support 
for individuals with asthma and other respira- 
tory problems. 


19.2.2 Collecting Self-Reported 
Data 


Across physiological processes, user activities, 
and context, mobile devices are increasingly 
able to detect information about a person that 


can be used to characterize that person’s 
health-related needs and provide timely sup- 
port to address those needs. However, not 
everything can be sensed, and a range of 
important metrics needed to monitor patients’ 
health (e.g., pain levels and medication side 
effects) and support self-management require 
self-report data. For these cases, mobile 
devices enable self-report that is precise, 
timely, and minimally burdensome. Collecting 
and leveraging in-situ self-report data using 
mobile devices has recently gained tremen- 
dous interest among researchers in multiple 
fields, such as ubiquitous computing, behav- 
ioral sciences, psychology, public health, and 
machine learning. 


19.2.2.1 In-Situ Data Collection
Methods


Research methods such as the diary study and
ecological momentary assessment (EMA)
(Shiffman et al. 2008; Stone et al. 2007) help
researchers understand details about a
person’s context, intentions, and actions that
traditional research methods (e.g., interviews,
surveys, and system log analysis) cannot reveal.
Diary studies and EMA share the same
purpose of collecting data in people’s natural
environment to maximize ecological validity.
In a diary study, people capture self-report
data, usually once a day, using pen
and paper, mobile diary apps, or online
surveys. Some diaries are structured (e.g., a sleep
diary for Cognitive Behavioral Therapy for
Insomnia), whereas others are less structured
(e.g., free-form notes). Although a diary study
helps researchers collect in-situ data relatively
easily, it is subject to low adherence (if
participants forget to fill it out) and recall bias (if
there is a large time interval between when an
event happened and when the event was
captured). More recently, researchers have
increasingly used EMA, which refers to frequent, brief
collection of self-report data about the person’s
current situation and experiences. As a data
collection method, EMA is intended to
decrease the recall biases and memory limitations
inherent in retrospective self-report. By
collecting self-report frequently (often 5 to
8 times a day), it also enables researchers to
understand how behavior, psychological processes,
and individuals’ experiences change over time
and are influenced by time-varying factors
such as the person’s physical environment.
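
One common way to schedule EMA prompts is to divide the waking day into equal blocks and pick a random time within each block, so that prompts are unpredictable but spread across the day. The sketch below illustrates this idea; the waking window, the number of prompts, and the function name are assumptions, and actual studies choose these parameters to fit their protocol.

```python
import random


def ema_schedule(start_hour=9, end_hour=21, prompts=6, seed=None):
    """Return prompt times as minutes since midnight.

    The waking window is split into `prompts` equal blocks, and one
    random minute is drawn from each block.
    """
    rng = random.Random(seed)
    window_min = (end_hour - start_hour) * 60
    block_min = window_min // prompts
    times = []
    for k in range(prompts):
        block_start = start_hour * 60 + k * block_min
        times.append(block_start + rng.randrange(block_min))
    return times


for minute in ema_schedule(seed=1):
    print(f"{minute // 60:02d}:{minute % 60:02d}")
```

With the default settings, each of the six prompts falls in its own two-hour block between 9:00 and 21:00.
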


19.2.2.2 Mobile Research Platforms 
for In-Situ Data Collection 


Mobile devices are particularly well suited for 
supporting diary and EMA studies in multi- 
ple ways. Researchers and commercial com- 
panies have developed a number of software 
platforms for collecting in-situ data that 
enable researchers to create data-collection 
schedules with no or minimal programming. 
These tools typically support the collection
of self-report data, passive logging of
smartphone sensors (e.g., GPS, Bluetooth, device
status, and activity), or some combination of
both (semi-automated tracking) (Choe et al.
2017).

One of the early projects of this kind, 
MyExperience (Froehlich et al. 2007), ran on
Windows Mobile devices, and it enabled
researchers to construct sophisticated EMA 
surveys and schedules by writing a single con- 
figuration file in XML. MyExperience surveys 
could collect traditional questionnaire-style 
data (e.g., multiple-choice questions and text
responses), as well as use the phone’s camera
and microphone to collect rich multimedia
data. They also allowed complex survey sched- 
ules where different questionnaires were deliv- 
ered to participants at different times and 
based on different triggering conditions. 

Many similar systems have since been 
developed for different mobile platforms, and 
researchers can now choose among both 
commercial solutions where survey 
configuration is done by a commercial 
company that has developed a proprietary 
EMA platform (e.g., Life Data Corp and 
Ilumivu) and open-source platforms that 
researchers can configure and deploy 
themselves (e.g., Momento (Carter et al. 
2007), MyExperience (Froehlich et al. 2007), 
AWARE (Ferreira et al. 2015), Jeeves (Rough 
and Quigley 2015), PACO (Evans 2016), 
Sensus (Xiong et al. 2016), Extrasensory App 
(Vaizman et al. 2018), OmniTrack (Kim et al. 
2017), and TEMPEST (Batalas et al. 2018)). 
Among the latter group, the most mature 
current offering is PACO,²⁵ a cross-platform
(Android and iOS) system built by the Google 
engineer Bob Evans and his colleagues. Like 
MyExperience, PACO supports construction 
of multiple surveys and schedules, a broad 
range of question types, and sophisticated 
questionnaire triggering. All configuration of 
PACO questionnaires can be done through a 
web interface without any programming, and 
its website provides study management func- 
tionality, including enrolling participants, 
monitoring their adherence, and downloading 
response data in standard formats for 
statistical analyses. To further extend people’s 
data capture capability, OmniTrack enables 
people to create an Android native tracking 
app without programming (Kim et al. 2017). 
By combining various data fields and 
integrating external data services (e.g., Fitbit) 
into a single tracking app, people can create a 
customized tracking app and configure the 
app to be used as a general diary app or an 
EMA tool. Researchers can design and deploy 
an OmniTrack app using OmniTrack for 
Research (see Fig. 19.5), which handles app
deployment, updates, and participant
monitoring.²⁶ The ability to construct data
capture instruments and conduct studies from 
the web with no programming greatly increases 
opportunities for researchers outside of 
technical disciplines to integrate rich in-situ 
data into their research projects. 

Many of the EMA platforms support not 
just traditional time-based prompting where 
questionnaires are triggered based on time, 
but also event-based prompting, where ques- 
tionnaires are triggered when the phone 
detects that a particular kind of event has 
occurred. To do so, EMA systems use sensors 
in the phone and connected devices to monitor
users’ state and behavior, and then trigger a
questionnaire when a particular set of
conditions is met. The sensors and types of


25 PACO: The Personal Analytics Companion.
Retrieved June 13, 2019: https://pacoapp.com/

26 Kim YH, Lee B, Choe EK, Seo J. 2019. OmniTrack
for Research. GitHub. Retrieved June 13, 2019:
https://omnitrack.github.io/research/
lection apps equipped with both manual and automated data collection functionalities


events that can be used for triggering vary 
from platform to platform, but it is common 
to be able to define events based on location 
(e.g., a questionnaire can trigger when the per- 
son gets home or stops moving), phone activ- 
ity (a questionnaire can trigger after the 
person finishes a phone call), or data from 
accelerometers (a questionnaire can trigger 
when the person finishes a physical activity). 
Event-based prompting allows researchers to 
collect data at precise times when an event of 
interest (e.g., physical activity) occurs, maxi- 
mizing their ability to get timely information 
about the person’s experience and surround- 
ing events and minimizing recall biases and 
the risk of forgetting. 
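Event-based prompting of the location kind described above can be sketched in a few lines. The Python below is a hedged illustration, not code from any EMA platform: it assumes the phone reports latitude and longitude fixes, and it fires once when the person crosses into a circular region around a place of interest (for example, home). The class name `ArrivalTrigger`, the 100 m radius, and the coordinates are all hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


class ArrivalTrigger:
    """Fires once each time the person enters the region (outside -> inside)."""

    def __init__(self, lat, lon, radius_m=100.0):
        self.lat, self.lon, self.radius_m = lat, lon, radius_m
        self.inside = False  # assume the person starts outside the region

    def update(self, lat, lon):
        """Feed one location fix; return True if a questionnaire should trigger."""
        now_inside = haversine_m(self.lat, self.lon, lat, lon) <= self.radius_m
        fired = now_inside and not self.inside
        self.inside = now_inside
        return fired


home = ArrivalTrigger(47.0, -122.0)
for fix in [(47.01, -122.0), (47.0, -122.0), (47.0, -122.0)]:
    print(fix, home.update(*fix))
```

Tracking the previous inside/outside state is what keeps the questionnaire from re-triggering on every fix while the person remains at the location.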

Finally, some EMA platforms enable not 
just collection of EMA data but also the use 
of the provided answers to trigger interven- 
tions. Such ecological momentary interven- 
tions (EMIs; Heron and Smyth 2010) are 
particularly useful in the substance use arena 
(e.g., Gustafson et al. 2014; Dennis et al. 
2015), where a person’s answers to EMA ques- 
tions can be used to calculate risk of relapse, 
which, if it reaches a predefined threshold, 
can trigger coping interventions or facilitate
contact with a recovery coach. The ability of
EMA systems to trigger a request for self- 
report at times of high risk (e.g., when a per- 
son recovering from alcoholism is near a bar) 
and immediately respond to provided answers 
enables the kind of tailored, just-in-time sup- 
port that was impossible prior to the develop- 
ment of modern mobile technology. 
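The threshold logic behind an ecological momentary intervention can be illustrated with a small sketch. The Python below is an assumption-laden example rather than the scoring used in any cited study: it takes EMA answers already scaled to the range 0 to 1, combines them with made-up weights into a single risk score, and maps that score to an action using two hypothetical thresholds. The item names, weights, and cutoffs are invented for illustration.

```python
def relapse_risk(answers, weights):
    """Weighted average of EMA answers, each assumed to be scaled to [0, 1]."""
    total_weight = sum(weights.values())
    return sum(weights[item] * answers[item] for item in weights) / total_weight


def choose_intervention(risk, coping_threshold=0.5, coach_threshold=0.8):
    """Map a risk score to an action; the thresholds are illustrative, not clinical."""
    if risk >= coach_threshold:
        return "contact_coach"
    if risk >= coping_threshold:
        return "coping_exercise"
    return None  # risk is below both thresholds: do not intervene


# Hypothetical items and weights for a substance-use EMA questionnaire.
weights = {"craving": 5, "stress": 3, "poor_sleep": 2}
answers = {"craving": 1.0, "stress": 0.0, "poor_sleep": 0.0}
risk = relapse_risk(answers, weights)
print(risk, choose_intervention(risk))
```

In practice the risk model would be derived from data rather than fixed by hand, but the control flow (score, compare to a predefined threshold, act) is the part the text describes.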


19.2.2.3 Incorporating Self-Reporting Into
User Interactions

In addition to diary and EMA studies, mobile 
technology can facilitate collection of self- 
report by incorporating self-reporting into 
interactions that users already perform on 
their mobile devices. Particularly interesting 
attempts to do this involved appropriating the 
unlock gesture on Android phones to enable a 
user to provide a single piece of self-report 
data as part of the process of unlocking the 
phone. The first project that took this route, Slide
to X (Truong et al. 2014), found that 
individuals unlocked their phones between 20 
and 100 times per day on average. To take 
advantage of this frequent interaction, Truong 
et al. built three applications that turned the 
phone’s standard unlocking interface into a
data-collection tool. One of them, Slide To 
QuantifySelf, enabled users to answer a single 
question (e.g., “how happy are you right 
now?”) on a Likert scale in place of performing
a standard unlock gesture. Users
could specify multiple questions with which to 
be prompted, as well as when during the day 
each question should be asked (e.g., asking 
about whether they ate breakfast mid- 
morning), providing a low-burden way to col- 
lect a rich self-report dataset that can help an 
individual better manage her health. A recent 
project called LogIt (Zhang et al. 2016)
expanded on this idea by developing gesture-based
unlock interactions to track pleasure
and accomplishment, sleepiness, and mood.
Unlike Slide to QuantifySelf, which used a
single Likert scale item to record self-report,
LogIt used more sophisticated gesture-based
interactions, such as measuring mood on 
Russell’s affect grid (Russell et al. 1989). 
Self-report can be incorporated into appli- 
cations as well. Here too the underlying idea is 
to decrease the burden of providing self- 
report by tying it to an action the user is 
already performing. In HeartSteps (Klasnja 
et al. 2018), an mHealth physical activity 
intervention, users can receive prompts to go 
for a brief walk or move to disrupt prolonged 
sitting. Like many mHealth interventions, 
these prompts are provided as push notifications
to the user’s phone. Rather than having users
dismiss them through a simple OK button,
however, HeartSteps provides users with three
different buttons to dismiss the notification,
each intended to indicate how the user per- 
ceived that particular prompt. Users can press 
a thumbs-up icon to indicate that they liked 
the activity suggestion and that it came at a 
good time, a thumbs-down icon to indicate 
that the suggestion was not helpful or came at 
a bad time, and a button to turn off future 
prompts for a certain period of time, indicat- 
ing that they will be busy or unavailable for 
the intervention. The data from a study of 
HeartSteps showed that participants were sig- 
nificantly more active after the prompts they 
marked thumbs-up than those to which they 
responded in either of the two other ways, 
suggesting that the self-report obtained in this 
manner is a useful indicator of individuals’
receptivity to intervention.²⁷ In the new version
of HeartSteps, these data are being used
by a learning algorithm that personalizes 
intervention provision for each HeartSteps 
user. Adding a single question to an applica- 
tion dashboard, or triggering a question by 
observing the use of other applications (e.g., 
when a user quits a social media app) are 
other ways in which interactions already taking
place on the phone can be leveraged to collect 
self-report data without the need for addi- 
tional disruptive prompting of the user. 


19.2.2.4 Easing Data Entry 


Finally, mobile devices offer a number of 
ways to ease data entry during self-report. As 
we mentioned, most EMA platforms enable 
users to take pictures and use the phone’s 
microphone to speak their answers to the 
questions. Such multimedia responses enable 
collection of rich qualitative data that would 
be too burdensome or even impossible to col- 
lect by requiring users to type their responses 
into a text form. Multimedia capture enables 
additional types of data analyses that cannot 
be done on traditional questionnaire data. 
For instance, PlateMate (Noronha et al. 2011) 
lets users track what they are eating by taking 
pictures of meals. The system then uploads 
these images to Mechanical Turk, where they 
are processed through a number of steps that 
extract detailed nutritional information from 
the images, enabling logging of calories and 
micro-nutrients with much lower self-report 
burden than is involved in traditional 
database-based nutritional logging. Similarly, 
machine learning algorithms can be used to 
process audio recordings of data individuals 
enter by speaking into the phone not only to 
extract the content of those recordings, but also
to detect the user’s mood or features such as
latency or pitch that can be indicative of 
changes in the individual’s mental health, 
such as an onset of a manic episode (e.g., 
Gideon et al. 2016). 


27 Predrag Klasnja, personal communication,
8/31/2019.
Self-report can be made more convenient
by placing self-reporting interfaces into easily 
accessible locations and minimizing how 
much user interaction is required. Choe et al. 
(2015) enabled low-burden self-monitoring of 
sleep by placing a self-tracking widget on the 
phone’s lock screen, making it visible every 
time a person reached for her phone, and by 
recording sleep quality via a single tap. 
Similarly, Fitabase Engage, a new platform 
from Fitabase, enables simple questions to 
show up on individuals’ Fitbit smartwatches, 
and the questions to be answered with a single 
tap. The speed of this interaction, and not 
needing to reach for the phone, greatly 
decreases the perceived burden of answering
questions delivered via Fitabase Engage. 


19.2.3 Providing Interventions 
in Individuals’ Daily Lives 


By facilitating passive data collection via sen- 
sors embedded in the phone and wearable 
devices, and by providing ways to collect brief 
self-report data at the right time and in the 
right context, mobile technology has greatly 
enhanced collection of data that are needed to 
understand health behaviors, to monitor 
patients’ health, and to provide interventions. 

We now turn to how this information can be
used by mHealth systems to provide interventions
at times when those interventions are
most needed and when individuals are most
receptive to them.

Given the immense number of mHealth 
apps, their content can be categorized in a 
number of different ways (e.g., see Klasnja & 
Pratt, 2012 for one classification). For simplic- 
ity, many mHealth apps can be seen as falling 
into one or more of the following five catego- 
ries: 

1. Reminders. One of the simplest functions 
of mHealth interventions is to provide 
reminders. Reminders exist for many dif- 
ferent health behaviors, including attend- 
ing scheduled clinic visits, taking 
medications, and applying sunscreen on 
sunny days. Sometimes reminders are a 
stand-alone intervention (e.g., a text messaging
intervention to increase adherence
to clinic visits (Koshy et al. 2008; Leong
et al. 2006)), while other times they are 
embedded into more complex interven- 
tions, such as those for chronic disease 
management. 


2. Support for behavior change. A large number
of mHealth interventions are intended
to help individuals make health-promoting 
changes in their behavior. mHealth inter- 
ventions exist for improving medication 
adherence, increasing wellness behaviors 
like physical activity and healthy diet, 
helping with cessation of addictive behav- 
iors like smoking and substance use, pre- 
venting relapse, and adhering to health 
management practices like monitoring of 
glucose or blood pressure. Some of the 
applications for the management of men- 
tal health (e.g., depression and bipolar dis- 
order) fall into this category as well, as 
they often focus on helping individuals 
enact therapeutic practices like behavioral 
activation, or increase the regularity of 
sleep, social contact, and other behaviors 
that support mental well-being. 


3. Discovery of patterns. A related category
is mHealth interventions that support
individuals in discovering patterns in their 
behavioral or physiological responses. For 
instance, TummyTrials (Karkar et al. 
2017) is a recent mHealth intervention 
intended to help individuals detect food 
triggers that aggravate irritable bowel syn- 
drome. Other interventions, like Health 
Mashups (Bentley et al. 2013), try to help
individuals understand what factors influ- 
ence their physical activity, sleep, and 
other wellness behaviors. Although inter- 
ventions in this category do not have 
explicit behavior-change features, such as 
goal-setting or planning, the self- 
experimentation they support is usually in 
service of making health-promoting 
changes in one’s behavior, making this cat- 
egory a close relative of the behavior- 
change interventions. 


4. Detection or prediction of critical health
events. An important use of sensors and
self-report data in the mHealth context is 
to enable interventions that detect and/or 
predict critical health events that may 
require prompt medical attention.
mHealth solutions exist to detect a broad 
range of acute health events, including 
falls, alarming levels of chemotherapy tox- 
icity, imminent risk of a heart attack, onset 
of a manic episode, and decompensation 
in chronic heart failure. Depending on the 
severity of the detected event, such inter- 
ventions either provide guidance to the 
individual on how to manage the event or 
automatically contact the emergency ser- 
vices to get the individual medical help as 
quickly as possible. 

5. Communication with the healthcare sys- 
tem. A growing number of mHealth inter- 
ventions are intended to facilitate 
communication with the health system 
through the support of remote patient 
monitoring, secure messaging, prescrip- 
tion refills, accessing labs and imaging, 
and scheduling appointments. Unlike the 
categories reviewed above, mHealth apps 
in this category are almost always provided 
by and tied to a particular health system, 
pharmacy, or a clinic. To make such 
mHealth interventions work, clinical 
workflows often have to be restructured to 
accommodate the use of mHealth tools by 
patients and the data that are generated as 
part of that use. 


Although the above categories are not 
intended to be exhaustive, they do cover many 
of the common types of mHealth applica- 
tions. Thus, they demonstrate the range of the 
intervention work that has been done in the 
field of mHealth. 


19.2.4 Providing Just-in-Time 
Adaptive Interventions 


From the earliest days of mHealth, a major 
promise of mobile technology, given its con- 
stant proximity to the person, has been its 
ability to provide support for health manage- 
ment when that support is most needed (Intille 
2004; Patrick et al. 2008; Nilsen et al. 2012): at 
times when a person is at risk (e.g., of a sub- 
stance use relapse) and needs help with cop- 
ing, when there is an opportunity to engage in 
health-promoting behaviors, or when the sensors
detect that there is a critical change in the
person’s health (e.g., high likelihood of a heart 
attack) and she needs to receive immediate 
medical attention. The development of such 
just-in-time interventions is made possible by 
continuous, low-burden data collection, 
always-on Internet connectivity, powerful 
phone-based information processing, and the 
ability to prompt the person via a push notifi- 
cation or a text message to deliver an interven- 
tion. A great deal of recent work on mHealth 
interventions has aimed to realize this prom- 
ise of smart, timely intervention delivery. 
Although the promise of timely, in-context 
information has been acknowledged for many 
years, it is only recently that the technical, 
algorithmic, and methodological develop- 
ments have enabled the development of inter- 
ventions that realize this goal. In recent 
literature, such mHealth interventions have 
been referred to as just-in-time adaptive inter- 
ventions (JITAIs; Nahum-Shani et al. 2015; 
Nahum-Shani et al. 2016; Spruijt-Metz and 
Nilsen 2014). JITAIs are mHealth systems
that use decision rules (if-then rules or
algorithms that specify when, where, and how
interventions are delivered to individuals) to
attempt to provide the right type of support at
the right times and in the right contexts.
JITAIs use sensors and low-burden self-report 
to continuously monitor individuals’ state, 
behavior, and the environment, and when they 
detect that an individual is in a state of high 
risk or has an opportunity to engage in a 
health-promoting behavior, they make a deci- 
sion about whether to intervene. How these 
decisions are made varies. Simple JITAIs use 
deterministic if-then rules that determine 
what the system should do when a situation of 
risk or opportunity is detected. For instance, a 
JITAI might send a person recovering from 
alcohol use disorder a push notification with a 
coping strategy every time that person is 
within a certain distance of a bar (Gustafson 
et al. 2014). Or the decision rules may be sto- 
chastic, where intervention is not provided 
every time a situation of risk or opportunity is 
encountered but only with a certain
probability, usually to reduce user burden
(e.g., Klasnja et al. 2018). Finally, the system 
may incorporate algorithms that evolve the
decision rules over time to maximize their 
effectiveness for each individual (we will 
return to this point shortly). 
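The difference between deterministic and stochastic decision rules can be shown in a minimal sketch. The Python below is illustrative only and is not taken from any of the cited systems: the deterministic rule intervenes every time the person is within an assumed distance of a bar, whereas the stochastic rule intervenes in a risk situation only with a given probability. The function names, the 200 m radius, and the probability of 0.3 are hypothetical.

```python
import random


def deterministic_rule(distance_to_bar_m, radius_m=200.0):
    """If-then rule: intervene whenever the person is within radius_m of a bar."""
    return distance_to_bar_m <= radius_m


def stochastic_rule(at_risk, p_intervene, rng):
    """Intervene in a risk situation only with probability p_intervene."""
    return at_risk and rng.random() < p_intervene


rng = random.Random(7)
risk_moments = 1000  # assumed number of detected risk situations
sent = sum(stochastic_rule(True, 0.3, rng) for _ in range(risk_moments))
print(f"deterministic rule would send {risk_moments} prompts; stochastic rule sent {sent}")
```

With the same detection logic, the stochastic rule sends roughly 30% as many prompts, which is the burden reduction the text refers to.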

What form the decision rules take depends 
on the nature of a JITAI’s intervention components,
as well as the situations it is trying to
target. For situations that occur frequently
(such as becoming stressed or being at high risk of
lighting up a cigarette), interventions have to
be spaced out to manage user burden and 
reduce the risk of system abandonment. 
Stochastic rules are well suited for meeting 
these criteria, as they can reduce the overall 
number of interventions while creating a sense 
of unpredictability, which may increase the 
effectiveness of the interventions by reducing 
habituation. On the other hand, for risk states 
that occur rarely—or for which the conse- 
quences of not intervening are severe (the risk 
of suicide being an extreme example)—simple 
deterministic decision rules may be both appr- 
opriate and adequate. 

As we mentioned above, decision rules can 
also evolve over time. Although standard 
deterministic and stochastic decision rules do 
not change over the course of system use and 
are typically the same for all users of a system, 
mHealth systems with evolving rules aim to 
personalize delivery of interventions for each
system user. The idea is that the system learns 
patterns in user behavior and intervention 
response over time, and it can then adjust how 
it provides interventions to maximize their 
effectiveness for each individual user and to 
minimize user burden. 

Two approaches to intervention personal- 
ization are currently being investigated: rein- 
forcement learning and control systems 
engineering. JITAIs based on reinforcement 
learning (RL) use algorithms to continually 
adjust the probability of intervening based on 
observations of the outcome of previously 
delivered interventions (Sutton and Barto 
1998). Observations of successful outcomes 
lead the algorithm to increase the probability 
of intervening in a similar context in the
future; unsuccessful outcomes decrease that 
probability. RL algorithms can focus either on 
short-term outcomes—the most immediate 
outcomes of individual intervention provisions
(e.g., the number of steps a person walks
on a day when the system sent a motivational 
message in the morning)—or long-term out- 
comes that prioritize how the system performs 
over time (e.g., the user’s average daily step 
count over the course of a month). Much of 
the foundational research in RL has been 
done in areas like robotics where both the def- 
inition of success and the relevant state vari- 
ables are relatively unambiguous. As human 
behavior is inherently messier and the system’s
knowledge of the person’s state and
environment is far noisier, it is an open
research question whether mHealth systems 
would work better by employing algorithms 
that focus on clearer but shorter-term out- 
comes, or if the more sophisticated algorithms 
that focus on the long-term can be made to 
work in this setting. 
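The core idea of adjusting the probability of intervening from observed outcomes can be sketched very simply. The Python below is a deliberately simplified illustration, not a full reinforcement learning algorithm and not the method of any cited system: it keeps one intervention probability per context and nudges it toward 1 after a successful outcome and toward 0 after an unsuccessful one, focusing only on the short-term outcome. The class name, step size, and probability bounds are assumptions made for this example.

```python
import random


class InterventionPolicy:
    """Keeps a per-context probability of intervening and adapts it from outcomes."""

    def __init__(self, initial_p=0.5, step=0.1, floor=0.05, ceiling=0.95):
        self.initial_p = initial_p
        self.step = step        # how strongly one observation moves the probability
        self.floor = floor      # lower bound so the system keeps exploring
        self.ceiling = ceiling  # upper bound so the system never always intervenes
        self.p = {}             # context -> current probability

    def probability(self, context):
        return self.p.get(context, self.initial_p)

    def decide(self, context, rng):
        """Randomly decide whether to intervene in this context."""
        return rng.random() < self.probability(context)

    def update(self, context, success):
        """Move the probability toward 1 after success, toward 0 after failure."""
        target = 1.0 if success else 0.0
        p = self.probability(context)
        p += self.step * (target - p)
        self.p[context] = min(self.ceiling, max(self.floor, p))


policy = InterventionPolicy()
policy.update("weekday_morning", True)   # e.g., the person walked after the prompt
policy.update("weekday_evening", False)  # e.g., the prompt was ignored
print(policy.probability("weekday_morning"), policy.probability("weekday_evening"))
print(policy.decide("weekday_morning", random.Random(0)))
```

Keeping the probability away from 0 and 1 means the system continues to try interventions occasionally in every context, so it can notice if a person's responsiveness changes.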

The other approach to personalized JITAIs 
draws on control systems engineering (Hekler 
et al. 2018; Phatak et al. 2018). Control sys- 
tems engineering focuses on the development 
of systems that are capable of controlling com- 
plex processes, such as the flight of an airplane 
or blood glucose metabolism. At the heart of 
this approach is the development of mathe- 
matical models—called dynamical systems 
models—that encode what is known about the 
influences on the process or behavior that 
needs to be controlled. Those models are then 
used by control systems such as the plane auto- 
pilot or the artificial pancreas to make deci- 
sions about how the system should intervene to 
maximize the likelihood that the process will 
behave in a desired way (the flight will get to 
the destination city, the blood glucose will be in 
the healthy range, etc.). In the context of 
mHealth, the approach is being used to form 
dynamical models of health behaviors (e.g., an 
individual’s daily steps), and the models are 
then used by an mHealth system’s “controller” 
to decide when and how to intervene to move 
the behavior in the desired direction (e.g., to 
increase the daily step count). What is notable 
about this approach is that the controller is 
constantly updating its model based on 
continual observation of the person’s behavior 
and response to interventions, so that 
subsequent intervention decisions are always 
based on the updated model, one that
progressively better describes the idiosyncrasies
of each user’s behavior. Although control
systems engineering has been used successfully 
to manage complex processes for decades, its 
application to mobile health is still in its 
infancy (see Hekler et al. 2018 for the state of 
the field). 
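The model-then-control loop described above can be illustrated with a toy example. The Python below is a sketch under strong simplifying assumptions and does not reproduce the dynamical models of the cited work: it assumes tomorrow's step count is a fixed fraction `a` of today's count plus a gain `b` per unit of intervention "dose", lets the controller pick the dose whose predicted outcome is closest to a target, and corrects `b` from the prediction error after each observation (a normalized least-mean-squares style update). All names and parameter values are hypothetical.

```python
class StepController:
    """Toy controller: linear model of daily steps plus online model correction."""

    def __init__(self, a=0.8, b=500.0, lr=0.1, max_dose=3):
        self.a = a                # assumed carry-over of today's steps to tomorrow
        self.b = b                # estimated extra steps per unit of intervention
        self.lr = lr              # learning rate for correcting b
        self.max_dose = max_dose  # maximum number of prompts per day

    def predict(self, steps_today, dose):
        """Predicted steps tomorrow under the current model."""
        return self.a * steps_today + self.b * dose

    def choose_dose(self, steps_today, target):
        """Pick the dose (0..max_dose) whose prediction is closest to the target."""
        doses = range(self.max_dose + 1)
        return min(doses, key=lambda d: abs(self.predict(steps_today, d) - target))

    def update(self, steps_today, dose, steps_next):
        """Correct the gain b using the error between prediction and observation."""
        if dose > 0:  # b is only observable when an intervention was delivered
            error = steps_next - self.predict(steps_today, dose)
            self.b += self.lr * error / dose


controller = StepController()
dose = controller.choose_dose(steps_today=5000, target=5000)
controller.update(steps_today=5000, dose=dose, steps_next=5400)
print(dose, controller.b)
```

Each day's decision uses the most recent estimate of `b`, which is the sense in which later decisions rest on a model that better fits the individual.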


19.2.5 Supporting 
Self-Experimentation 


We see a growing interest in supporting 
patients and laypeople to design and conduct 
self-experimentation in personalized health. 
Although traditional clinical studies (e.g., 
epidemiological surveys, longitudinal studies,
and randomized controlled trials) provide
relevant knowledge at the population level, 
they do not provide the necessary knowledge 
for any given individual. On the other hand, 
in self-experimentation (or n-of-1 trials), an 
individual serves as their own control, allow- 
ing them to systematically explore a specific 
hypothesis (e.g., Does caffeine impact my 
sleep?) of their interest. To support laypeople 
in designing and conducting scientifically 
rigorous self-experimentations, researchers 
designed self-experimentation platforms leve- 
raging mobile devices’ continuous monitoring 
and capture capabilities. PACO and IFTTT²⁸
can be used as general-purpose self-experimentation
platforms due to their random
notification and tracking features (Evans 
2016). Domain-specific self-experimentation 
platforms can provide customized support. 
For example, TummyTrials (Karkar et al. 
2017) helps people identify food triggers, and 
SleepCoacher (Daskalova et al. 2016) helps 
identify connections between potential sleep 
disruptors and sleep quality. Despite some 
underlying limitations of self-experimentation
(such as carryover effects and the difficulty of
blinding), self-experimentation augmented with
mHealth technologies has great potential to 
leverage personal knowledge and advance 
personalized medicine (Lillie et al. 2011). 
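The structure of a simple n-of-1 trial can be sketched as follows. The Python below is a minimal illustration, not the design used by any of the platforms named above: it assumes two conditions (caffeine and no caffeine), randomizes their order within consecutive two-day blocks, and compares the mean outcome under each condition. The condition labels and the sleep-hours values are invented, and the sketch ignores carryover effects.

```python
import random
from statistics import mean


def randomized_schedule(n_blocks, rng):
    """Return a day-by-day condition list, randomizing order within each 2-day block."""
    schedule = []
    for _ in range(n_blocks):
        block = ["caffeine", "no_caffeine"]
        rng.shuffle(block)  # each block contains both conditions in random order
        schedule.extend(block)
    return schedule


def mean_difference(schedule, outcomes):
    """Mean outcome on caffeine days minus mean outcome on no-caffeine days."""
    on = [o for c, o in zip(schedule, outcomes) if c == "caffeine"]
    off = [o for c, o in zip(schedule, outcomes) if c == "no_caffeine"]
    return mean(on) - mean(off)


schedule = randomized_schedule(3, random.Random(1))
hours_slept = [6.5, 7.5, 7.0, 6.0, 8.0, 6.5]  # made-up data for illustration
print(schedule)
print(mean_difference(schedule, hours_slept))
```

Randomizing within blocks guarantees that each condition is tried equally often and protects the comparison from slow trends over the study period.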


28 IFTTT. Retrieved June 13, 2019: https://ifttt.com/
19.3 Broad Considerations
and Challenges 


All health technologies have associated prag- 
matic challenges and ethical issues regarding 
privacy and security, impact on people’s work, 
the potential to increase health disparities, 
and regulatory issues. In this section, we focus 
on aspects of those issues that are unique to 
or particularly problematic for mHealth 
applications. 


19.3.1 Privacy and Security 


For mHealth technologies, the key privacy 
and security concerns center around the per- 
sonal data that can be obtained from a device 
that is always or nearly always with you. 
Information that could be collected from 
one’s mobile device includes a great deal that 
people often want to keep private, such as 
their location, time spent with the device on, 
other apps used (including the duration and 
frequency of use), websites visited, search 
terms used, etc. Often those data are stored 
“in the cloud,” making them vulnerable to hackers,
and allowing the applications to sell or use
the data for other purposes. An alternative to 
such cloud-based systems would be to store 
all the information only on one’s mobile 
device, so that no other system has access to 
it. However, then the user is vulnerable to the 
loss of such important data if the mobile 
device is lost or damaged. Although many 
applications claim to use or sell only aggre- 
gate, anonymized data, the amount and kind 
of data collected from one’s mobile device 
make those data particularly vulnerable to reidentification.
In addition, people tend to agree to
terms of service without reading them, and 
thus, likely have little knowledge about what 
data are available to others. 

An additional challenge for mHealth 
applications comes from their ubiquitous
nature: others could easily observe notifications
or reminders on someone’s mobile
screen because the screen is often visible to 
many others throughout the day. Much harm 
can come from the disclosure of such private 
information, whether it is from observing the 
information on the screen or from gathering
stored data. For example, location logs could 
be used to disclose businesses or locations a 
person has visited, when, and for how long,
or to predict where a person will be and when,
which could endanger someone who needs to 
conceal their whereabouts for safety reasons. 
Although most mobile operating systems now 
attempt to make such location tracking 
optional, many mHealth apps need such data 
to function fully. For example, fitness trackers 
need location data to accurately record users’ 
activity. Thus, users often must choose 
between using an mHealth application and 
keeping their data private. 


19.3.2 Changes in Clinician Work 


The proliferation of these mHealth apps 
brings the potential for substantive changes in 
clinician work as well as for clinician-patient 
interactions. Because many people are now 
using mHealth apps to generate huge amounts 
of detailed health data, people could expect 
clinicians to use those data to gain new 
insights into a person’s health and positively 
influence their care. However, the volume and 
variety of data as well as applications that 
generated the data make it challenging for cli- 
nicians to incorporate the new data into their 
workflow. A literature review of empirical 
studies of self-tracking tools identified many 
clinician work-related concerns about infor- 
mation quality and the lack of standards for 
representing or viewing that data (West et al. 
2016, 2017). Yet, recent research points to the 
value of human-centered design approaches 
to addressing these concerns. For example, 
one study of DataMD showed that such an 
approach could help clinicians develop a new 
workflow that would allow them to incorporate
this kind of data, improve their counseling
skills, and support more in-depth conversations
(Kim et al. 2017).

Yet, irrespective of good design, the real- 
time sensing nature of the data creates other 
workflow concerns, particularly ethical and 
legal expectations that clinicians respond to 
the sensed data promptly. For example, mental
health providers could be expected to
intervene immediately when a mobile sensor
indicates a high likelihood that their client 
could harm themselves or others. Although 
such responsiveness could improve outcomes, 
these expectations could lead to further prob- 
lems with clinician burnout and challenges 
with patient autonomy and confidentiality. 
Physicians are already experiencing a greater 
increase in burnout and reduction in satisfac- 
tion with work-life balance than peer adults in 
the U.S. (Shanafelt et al. 2015). However, a 
scoping review of physician well-being in the 
mHealth context showed that these technolo- 
gies are playing important roles in the 
improvement of physicians’ well-being too 
(Chen et al. 2018). Such advances are addi- 
tionally changing both the nature and amount 
of patient-clinician interactions. We have 
much to learn about how clinicians’ use of 
these new mHealth technologies and their 
response to patients’ use of the technologies 
will affect clinicians’ work. 


19.3.3 Changes in Patient Work 


The increasing prevalence of mHealth apps
is changing the work and personal lives of
the people who use these tools to assist in 
their everyday health and well-being. With 
constant sensing of our health comes the 
potential for unhealthy disruptions to daily 
life. Even simple apps that count steps for 
physical fitness typically interrupt people 
when they have reached their step goal or 
remind people to get up and move during the 
day. Although these rewards and reminders 
can help people meet their health goals, such 
disruptions can come at inopportune times or 
places and negatively impact people’s overall 
well-being. Many apps allow people to disable 
those disruptive features, but then they risk 
missing important information or reminders 
that could be key to the app’s successful use. Frequent
tracking and constant reminders of one’s 
health can also lead to detrimental obsessions. 
For example, one study examined the effect of 
tracking technology on college students and 
found that those students who used fitness 
trackers had higher levels of eating concerns 
and symptoms of eating disorders (Simpson 
and Mazzeo 2017). Other studies have shown
that such tracking can make healthy activities 
that people used to enjoy feel more like work 
and lead to a decrease in those activities 
and subjective well-being (Etkin 2016). 
Furthermore, people could become overly 
reliant on the objectively sensed but not neces- 
sarily accurate data and discount their own or 
others’ subjective experiences.²⁹

Nonetheless, such widespread availability 
of mobile health apps also brings unprece- 
dented power to everyday people in their abil- 
ity to collect new forms of data about their 
own health and to interpret that data indepen- 
dently from clinicians. For example, HemaApp 
allows people with anemia or pulmonary ill- 
nesses to detect and monitor total hemoglo- 
bin in their blood using a smartphone’s 
camera and flash, rather than requiring a visit 
to a clinician’s office for a blood draw and fol- 
low up visit about their results (Wang et al. 
2016). Many other apps, such as BiliScreen for 
jaundice detection and pancreatic cancer 
screening (Mariakakis et al. 2017) and 
SpiroSmart for assessing lung function 
(Larson et al. 2012), allow people to detect 
health problems or monitor existing problems 
that previously required a clinician visit. 


19.3.4 Health Disparities 


Despite the increasing uptake of mobile 
phones by people—regardless of race, age, or 
socioeconomic status—and the high preva- 
lence of smartphone usage (81% of all U.S. 
adults),³⁰ concerns remain about whether
increased reliance upon such mobile technol- 
ogies will exacerbate existing health dispari- 
ties. A recent systematic review examined the 
research literature to investigate mHealth 
interventions for vulnerable populations 


29 Siek, K. Why fitness trackers may not give you all the ‘credit’ you hoped for. January 15, 2020. The Conversation. https://theconversation.com/why-fitness-trackers-may-not-give-you-all-the-credit-you-hoped-for-128585 Retrieved January 15, 2020.

30 Pew Research Center. June 12, 2019. Mobile Fact Sheet. 1-6. http://www.pewinternet.org/factsheet/mobile/ Retrieved January 15, 2020.

(Stowell et al. 2018). The authors identified familiarity
with the technology, use of engaging multimedia
content, frequent delivery of content,
and personalization as facilitators in success- 
ful interventions. However, costs and con- 
cerns about confidentiality and privacy 
(particularly for those who hadn’t completed 
their immigration paperwork) were substan- 
tial barriers. Although a slight majority of 
the reviewed studies showed significant 
improvements in the evaluated health mea- 
sures, their meta-analysis failed to show that 
the mHealth interventions successfully 
impacted health outcomes in vulnerable pop- 
ulations. Researchers are beginning to work 
together to identify opportunities for socio- 
technical interventions to reduce health dis- 
parities (Siek et al. 2019), but much work 
remains. 


19.3.5 Regulatory Issues 


The staggering number of mHealth apps that 
are available in the marketplace makes it chal- 
lenging for clinicians, patients, and the general 
public to discern which apps are effective and 
safe to adopt. Many worry about the prolif- 
eration of these mHealth apps, particularly 
the wide variation in quality, potential mis- 
leading or unsubstantiated claims, and the 
vulnerability of disclosure of personal health 
information. The US Food and Drug 
Administration (FDA) is the regulatory body 
that could provide such safety and effective- 
ness oversight of mHealth apps, but its role 
and influence have been changing. In 2013,
2015, and 2019, the FDA revised its guidance 
about what it will regulate in the mHealth 
space and is now focusing only on apps that 
pose a great risk if they do not work as intended.³¹
The regulations pertain to apps that
diagnose, treat, or prevent a health condition. 
To help developers determine what laws and 


31 FDA. Device Software Functions Including Mobile Medical Applications. https://www.fda.gov/medical-devices/digital-health/device-software-functions-including-mobile-medical-applications Retrieved January 15, 2020.
regulations apply to their mHealth apps, the
Federal Trade Commission (FTC), working with the
FDA and other federal agencies, created an
interactive tool that poses
questions and summarizes the applicability of
the laws based on the answers.³² In the private
sector, Xcertia serves as an mHealth app col- 
laborative effort of the American Medical 
Association (AMA), the American Heart 
Association (AHA), DHX Group and the 
Healthcare Information and Management 
Systems Society (HIMSS) “to foster safe, 
effective, and reputable health technologies.”³³
They have developed a set of guidelines for 
assessing operability, privacy, security, con- 
tent, and usability. Clearly, the safety and reg- 
ulation of mHealth apps will remain a key 
issue for the future. 


19.4 Future Directions 


The rapidly changing nature of technology 
makes writing a book chapter on any aspect 
of it challenging, but anticipating future 
directions is even more daunting. All aspects 
of this book face that challenge, but the field 
of mHealth and the applications that charac- 
terize it are especially dynamic. Exact statis- 
tics are hard to find and rapidly out of date, 
but nonetheless paint a picture of mHealth’s 
growing influence. A 2017 report from IQVIA
found that over 318,000 mHealth apps were
available worldwide and that more than 200
health apps were being added each day.³⁴ According
to App Annie’s State of Mobile 2019 Report, 
the global download of mHealth apps 
exceeded 400 million in 2018 with the growth 


32 Developing a mobile health app? Find out which federal laws you need to follow. https://www.ftc.gov/tips-advice/business-center/guidance/mobile-health-apps-interactive-tool Retrieved January 15, 2020.

33 Xcertia: mHealth App Guidelines. https://xcertia.org/ Retrieved January 15, 2020.

34 The Growing Value of Digital Health: Evidence and Impact on Human Health and the Healthcare System. Nov 07, 2017. https://www.iqvia.com/insights/the-iqvia-institute/reports/the-growing-value-of-digital-health Retrieved January 15, 2020.


coming from many different countries.³⁵
Grand View Research predicts that the global
market for mHealth apps will reach US$236 billion
by 2026, with fitness as the
largest type of mHealth app.³⁶

Some see the future of mHealth through 
the eyes of science fiction. In particular, many 
have envisioned Star Trek’s® tricorder-like 
technology of a small, hand-held device that 
could quickly diagnose a variety of medical 
conditions. Qualcomm incentivized this vision 
with its $10 million XPRIZE competition to 
develop an mHealth device that could accurately
diagnose 13 medical conditions, capture
5 real-time health vital signs, and provide a 
compelling consumer experience, without 
input from a healthcare professional or facility.³⁷
Although no team was able to meet all
the criteria, Final Frontier Medical Devices
(now Basil Leaf Technologies³⁸) received the
top prize of $2.5 million with their DxtER
device that employed non-invasive sensors to
collect vital signs, body chemistry, and biological
functions.³⁹ Many others have alternative,
grand views of the future of mHealth
apps. One point is clear: mHealth applications 
will continue to influence all aspects of health- 
care—from wellness and prevention through 


35 Sydow, L. Medical Apps Transform How Patients Receive Medical Care. April 16, 2019. https://www.appannie.com/en/insights/market-data/medical-apps-transform-patient-care/ Retrieved January 15, 2020.

36 Grand View Research. mHealth Apps Market Size, Share & Trends Analysis Report By Type (Fitness, Lifestyle Management, Nutrition & Diet, Women’s Health, Medication Adherence, Healthcare Providers/Payers), And Segment Forecasts, 2019-2026. June 2019. https://www.grandviewresearch.com/industry-analysis/mhealth-app-market Retrieved January 15, 2020.

37 Qualcomm Tricorder XPRIZE: Empowering Personal Healthcare. https://www.xprize.org/prizes/tricorder Retrieved January 15, 2020.

38 Basil Leaf Technologies. http://www.basilleaftech.com/ Retrieved January 15, 2020.

39 Family-led team takes top prize in Qualcomm Tricorder XPRIZE competition for consumer medical device inspired by Star Trek®. April 13, 2017. https://www.xprize.org/prizes/tricorder/articles/family-led-team-takes-top-prize-in-qualcomm-tricor
patient-provider interaction and even surgery—and
that influence will likely follow an
unpredictable but pivotal path. 


Questions for Discussion

• What challenges does the rapid evolution of
mobile technologies and platforms (OS)
pose to the people who develop mHealth
interventions?

• What are some of the ways in which mHealth
technologies are designed for health
professionals? In designing such tools, what
are some key design considerations?

• How can lay individuals access their own
data collected from mHealth technologies?
In what cases might such data access be
useful for individuals?

• In addition to creating new ways to deliver
health interventions, mHealth has been seen
as having the potential to greatly advance
health research, including fields such as
epidemiology. How can mHealth tools
advance our understanding of factors that
shape individuals’ health?

• What are just-in-time adaptive interventions
and how do mHealth tools enable this type
of health intervention?

• Most patient-centered mHealth tools are
discretionary use technologies, in the sense
that individuals can choose whether, how
much, and for how long to use these
devices and applications. Yet, for these
tools to have a hope of being effective,
they have to be used. Given what you have learned
in this chapter, what aspects of mHealth
tools can facilitate and hinder engagement?
How can mHealth designers make these
tools more engaging so individuals can
benefit from them?
Learning Objectives

After reading this chapter you should know
the answers to these questions:

• What are the key informatics requirements
for successful implementation of
telehealth systems?

• What are some key benefits from and
barriers to implementation of telehealth
systems?

• What are the most promising emerging
application domains for telehealth?


20.1 Introduction 


Complexity and collaboration characterize 
health care in the early twenty-first century. 
Complexity arises from increasing sophis- 
tication in the understanding of health and 
disease, wherein etiological models must 
acknowledge both molecular processes and 
physical environments. Collaboration reflects 
not only inter-professional collaboration, but 
also a realization that successful attainment of 
optimal well-being and effective management 
of disease processes necessitate active engage- 
ment of clinicians, lay persons, family mem- 
bers, communities, and society as a whole. 
This chapter introduces the concepts of tele- 
medicine and telehealth, and illustrates how 
advanced networks make possible the collab- 
orations necessary to achieve the full benefits 
of our growing understanding of health pro- 
motion, disease management and rehabilita- 
tion. Consider the following situation: 
Samuel is a 76-year-old man with coronary 
artery disease, poorly-controlled Type II dia- 
betes, and high blood pressure. He lives alone 
in a rural area and does not drive. His daugh- 
ter lives further away but visits occasion- 
ally. One of his neighbors visits regularly to 
check on him and assist with various errands. 
In the past, Samuel has been unable to keep 
medical appointments consistently because 
of difficulty arranging transportation. He 
had a recent acute hyperglycemic episode 
that required hospitalization. After 4 days he 
is medically stable and ready for discharge. 
He is able to measure his blood glucose and 
can safely administer the appropriate dose of 


insulin. The nurse notes that Samuel some- 
times has trouble calibrating his insulin dose 
to the blood glucose reading. 


20.1.1 Telemedicine and Telehealth
to Reduce the Distance
Between the Consumer
and the Health Care System


Historically, health care has usually involved 
travel. Either the health care provider trav- 
eled to visit the patient, or more recently, the 
patient traveled to visit the provider. Patients 
with diabetes, like Samuel, whom we will
return to throughout the chapter,
typically meet with their physician every
2-6 months to review data and plan therapy 
changes. Travel has costs, both directly, in 
terms of gasoline or transportation tickets, 
and indirectly, in terms of travel time, delayed 
treatment, and lost productivity. In fact, travel 
has accounted for a significant proportion 
of the total cost of health care (Starr 1982). 
Because of this, both patients and providers 
have been quick to recognize that rapid elec- 
tronic communications have the potential to 
improve care by reducing the costs and delays 
associated with travel. This has involved both
access to information resources and
communication among various participants,
including patients, family members, primary
care providers, and specialists, whether that
communication is synchronous (where all stakeholders
interact at the same time) or asynchronous
(where information is exchanged
with a time lag).

As is the case with informatics, the formal 
definitions of telemedicine and telehealth tend 
to be very broad. Telemedicine involves the use 
of modern information technology, especially 
two-way interactive audio/video communi- 
cations, computers and telemetry to deliver 
health services to remote patients and to facil- 
itate information exchange between primary 
care physicians and specialists at some dis- 
tance from each other (Bashshur et al. 2009). 
Telehealth is a somewhat newer and broader 
term referring to remote health care that 
includes clinical and social services provided 
using telemedicine, as well as interactions with
automated systems or information resources. 
Because of its broader scope, we are using the 
term telehealth in this chapter. 

As is the case with biomedical informat- 
ics, there are many different sub-domains 
within telehealth. For nearly every clinical 
domain, there is a “tele-X” or “X telehealth”, 
where X is the clinical domain. Examples 
include teleradiology (see Sect. 20.3.4),
teleophthalmology (see Sect. 20.3.4),
telepsychiatry (see Sect. 20.3.5), and
home telehealth (see Sect. 20.3.5). Some
sub-domains do not fit neatly into this naming
paradigm. Correctional telehealth (see
Sect. 20.3.5) is defined by the location of the
patient in a prison. It is discussed separately
because of its unique business model and the
fact that it represents an early and sustained
success. Remote intensive care (see Sect.
20.3.5) is the term used to describe the use
of telehealth technologies in an ICU setting. 
Teleconsultation is a general term describing 
the use of telehealth technologies to support 
discussions between clinicians, or between a 
clinician and a patient. The archetypal tele- 
consultation occurs when the patient and the 
generalist clinician are in a rural or remote 
location and a specialist is at a distant tertiary 
referral facility. Telepresence (see Sect.
20.3.6) refers to high-speed, multi-modality
telehealth interactions, such as telesurgery,
that give the feeling of “being there”. In this
chapter we will review how some of these
sub-domains may help
Samuel manage his health care needs more
effectively.

It is clear from the definitions above that
there is considerable overlap between tele- 
health and biomedical informatics. In fact, 
one will frequently find papers on telehealth 
systems presented at biomedical informatics 
conferences and presentations on informatics 
at telehealth and telemedicine meetings. Some 
groups, especially in Europe, have adopted the 
rubric health information and communication 
technology (HICT). The major distinction is 
one of emphasis. Telehealth and telemedicine 
emphasize the notion of distance, especially 
the provision of care to remote or isolated
patients and communities. In contrast, bio- 
medical informatics emphasizes methods for 
handling the information moving between 
the participants, irrespective of the distance 
between patient and provider. 

Consumer health informatics (CHI), also 
called personal health informatics (PHI), is 
a related domain that bridges the distance 
between patients and health care resources, 
and that typically emphasizes interactions with 
computer-based information such as websites 
or information resources. Collectively, CHI 
and telehealth deliver health care knowledge 
and expertise to where they are needed, and 
are ways to involve the patient as an active 
partner in care. Despite their similarities, CHI 
and telehealth come from very different his- 
torical foundations. Telehealth derived from 
traditional patient care, while CHI derived 
from the self-help movements of the 1970s.
Largely owing to this historical separation, 
practitioners and researchers in the two fields 
tend to come from different backgrounds. 
For these reasons, we are presenting CHI and 
telehealth as two distinct, but closely related 
domains (see Chap. 11 for more information
on personal health informatics).


20.2 Historical Perspectives 


The use of communication technology to con- 
vey health-related information at a distance is 
nothing new. The earliest known example may 
be the use of so-called “leper bells” carried by 
individuals during Roman times. Sailing ships 
would fly a yellow flag to indicate a ship was 
under quarantine and awaiting clearance by 
a doctor, or a yellow and black “plague flag” 
to indicate that infected individuals were on 
board. By some accounts, when Alexander 
Graham Bell said “Mr. Watson. Come here. 
I need you” in 1876, it was because he had 
spilled acid on his hand and needed medical 
assistance. In 1879, only 3 years later, the first 
description of telephone use for clinical diag- 
nosis appeared in a medical journal (Practice 
by Telephone 1879). 
20.2.1 Early Experiences


One of the earliest and most long-lived tele- 
health projects is the Australian Royal Flying 
Doctor Service (RFDS), founded in 1928 
and continuing to this day. In addition to 
providing air ambulance services, the RFDS 
provides telehealth consultations. These con- 
sultations first used Morse Code, and later 
voice, leveraging radio communications to 
the remote sheep stations in the Australian 
outback. Lay people played a significant role 
here, clearly communicating their concerns 
and clinical findings to the RFDS and care- 
fully carrying out instructions while awaiting, 
if necessary, the arrival of the physician. The 
RFDS is most famous for its standardized 
medical supply chest, introduced in 1942. The 
chest contains diagnostic charts and medica- 
tions, identified only by number. This allowed 
the consulting clinician to localize symptoms 
by number and then prescribe care, such as 
“take one number five and two number fours.” 
Modern telehealth can be traced to 1948 
when the first transmission of a radiograph 
over a phone line was reported. Video-based 
telehealth can be traced to 1955 when the 
Nebraska Psychiatric Institute began experi- 
menting with a closed-circuit video network 
on its campus. In 1964 this was extended to 
a remote state mental health facility to sup- 
port education and teleconsultation. In 1967, 
Massachusetts General Hospital (MGH) 
was linked to Logan International Airport 
via a microwave audio-video link (Bird 1972; 
Murphy et al. 1973). In 1971 the National 
Library of Medicine began the Alaska 
Satellite Biomedical Demonstration project 
linking 26 remote Alaskan villages utilizing 
NASA satellites (Hudson and Parker 1973). 
The period from the mid-1970s to the late
1980s was a time of much experimentation,
but few fundamental changes in telehealth. A 
variety of pilot projects demonstrated the fea- 
sibility and utility of video-based telehealth. 
The military funded a number of research 
projects aimed at developing tools for pro- 
viding telehealth care on the battlefield. The 
early 1990s saw several important advances.


Military applications developed during the 
previous decades began to be deployed. 
Military teleradiology was first deployed 
in 1991 during Operation Desert Storm. 
Telehealth in military field hospitals was first 
deployed in 1993 in Bosnia. Several states, 
including Georgia, Kansas, North Carolina 
and Iowa implemented statewide telehealth 
networks. Some of these were pure video net- 
works, based on broadcast television technol- 
ogy. Others were built using evolving Internet 
technology. During this same period, correc- 
tional telehealth (see » Sect. 20.3.5) became 
much more common. For example, in 1992 
East Carolina University contracted with the 
largest maximum-security prison in North 
Carolina to provide telehealth consultation. 

Telehealth projects in the early 1990s con- 
tinued to be plagued by two problems that 
had hampered telehealth since its inception: 
high cost and poor image quality. Both hard- 
ware and high-bandwidth connections were 
prohibitively expensive. A single telehealth 
station typically cost over $50,000 and con- 
nectivity could cost thousands of dollars per 
month. Most programs were dependent on 
external grant funding for survival. Even with 
this, image resolution was frequently poor 
and motion artifacts were severe. 

The Internet revolution that began in the 
late 1990s drove fundamental change in tele- 
health. Advances in computing power both 
improved image quality and reduced hard- 
ware costs to the point that, by 2000, compa- 
rable systems cost less than a tenth of what 
they had a decade earlier. Improvements in 
image compression made it possible to trans- 
mit low-resolution, full-motion video over 
standard telephone lines, enabling the growth 
of telehome care. With the increasing popular- 
ity of the World Wide Web, high-bandwidth 
connections became both more available 
and less expensive. Many telehealth applica- 
tions that had relied on expensive, dedicated, 
point-to-point connections were converted to 
utilize commodity Internet connections. The 
availability of affordable hardware and con- 
nectivity also made access to health-related 
electronic resources from the home, school 
or workplace possible and fueled the growth
of consumer health information. In 2020, the 
COVID-19 pandemic highlighted the poten- 
tial of telehealth in facilitating essential health 
care services and led to an expedited adoption 
of telehealth services across many health sys- 
tems worldwide (Jain et al. 2020). 


20.2.2 Recent Advances 
in Medical-Grade Broadband 
Technology 


As telemedicine applications are being 
increasingly used in critical medical situa- 
tions such as emergency care and remote sur- 
gery applications, quality of service (QOS) 
becomes extremely important. It is impor- 
tant to note that optimally provisioning a 
network for medical-grade QOS does not 
simply imply that the network will provide 
“quality” in the sense of reliability, consis- 
tency and bandwidth performance, although 
these characteristics are certainly important 
requirements. Any network, no matter the 
bandwidth available, can become congested — 
overwhelmed with the volume of traffic to the 
extent that sessions are interrupted and data 
lost. Bandwidth availability limitations are 
particularly prevalent in rural locations where 
high-capacity circuits may be unavailable or 
prohibitively expensive. 

Newer network routing technologies such 
as multiprotocol label switching (MPLS) can, 
in addition to providing superior network 
throughput performance, permit explicit 
prioritization of clinical traffic while still
carrying lower-priority
administrative and other non-clinical traffic.
The individual data packets of high priority 
traffic (e.g., telehealth or patient monitoring 
sessions) are “tagged” with a numerical prior- 
ity flag. As the QOS-tagged packets traverse 
the network, each routing/switching device 
recognizes the priority tag and preferentially 
processes and forwards the packets. This 
explicit QOS combined with advanced secu- 
rity and privacy features within a broadband 
network has been characterized as “Medical 
Grade” broadband. 
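
The tagging and preferential forwarding described above can be illustrated with a small sketch. It is a toy model of a single forwarding device, not an implementation of MPLS itself: the traffic class names, the numeric priority values, and the `PriorityForwarder` class are all assumptions made for illustration, and real networks typically combine priority with other mechanisms so that low-priority traffic is not starved.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

# Hypothetical traffic classes; a lower number means "forward first".
PRIORITY = {"telehealth_video": 0, "patient_monitoring": 0, "admin": 2, "web": 3}


@dataclass(order=True)
class Packet:
    priority: int                      # the numerical priority "tag"
    seq: int                           # arrival order, keeps FIFO within a class
    payload: str = field(compare=False)


class PriorityForwarder:
    """Toy model of one routing/switching device with strict-priority forwarding."""

    def __init__(self):
        self._queue = []
        self._seq = count()

    def enqueue(self, traffic_class, payload):
        # Unknown classes are treated as lowest priority.
        prio = PRIORITY.get(traffic_class, max(PRIORITY.values()) + 1)
        heapq.heappush(self._queue, Packet(prio, next(self._seq), payload))

    def forward_all(self):
        """Return payloads in the order this device would forward them."""
        out = []
        while self._queue:
            out.append(heapq.heappop(self._queue).payload)
        return out
```

In this sketch, clinical packets that arrive after administrative packets are still forwarded first, which is the behavior the priority tag is meant to produce on a congested link.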
20.3 Bridging Distance
with Informatics: 
Real-World Systems 


There are many ways to categorize telehealth 
resources, including classifications based on 
participants, bandwidth, information trans- 
mitted, medical specialty, immediacy, health 
care condition, and financial reimbursement. 
The categorization in Table 20.1 is based
loosely on bandwidth and overall complex- 
ity. This categorization was chosen because 
each category presents different challenges for 
informatics researchers and practitioners. 

A second categorization of telehealth systems
that overlaps the previous one is the separation
into synchronous (or real-time) and
asynchronous (or store-and-forward) systems.
Video conferencing is the archetypal synchro- 
nous telehealth application. Synchronous 
telehealth encounters are analogous to con- 
ventional office visits. Telephony, chat-groups, 
and telepresence (see Sect. 20.3.6) are also
examples of synchronous telehealth. A major 
challenge in all synchronous telehealth is 
scheduling. All participants must be at the 
necessary equipment at the same time. 

Store-and-forward, as the name implies, 
involves the preparation of a dataset at one 
site that is sent asynchronously to a remote 
recipient. Remote interpretation, especially 
teleradiology, is the archetypal example of 
store-and-forward telehealth. Images are 
obtained at one site and then sent, sometimes 
over very low bandwidth connections, to 
another site where the domain expert inter- 
prets them. Other examples of store-and- 
forward include access to Web sites, e-mail 
and text messaging. Some store-and-forward 
systems support the creation of multimedia 
“cases” that contain multiple clinical data 
types, including text, scanned images, wave 
forms and videos. 
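
As a rough illustration of the store-and-forward pattern, the sketch below models a multimedia "case" that is assembled at one site, queued, and interpreted later at another. The class names, field names, and the in-memory queue are assumptions made for this example; a real system would persist and transmit cases securely rather than hold them in memory.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CaseItem:
    kind: str   # e.g. "text", "image", "waveform", or "video"
    name: str   # hypothetical label for the item


@dataclass
class Case:
    case_id: str
    items: list = field(default_factory=list)
    interpretation: Optional[str] = None   # filled in later by the remote expert


class StoreAndForwardQueue:
    """Holds prepared cases until the remote recipient is ready to read them."""

    def __init__(self):
        self._pending = deque()

    def submit(self, case):
        self._pending.append(case)

    def next_case(self):
        return self._pending.popleft() if self._pending else None


# Originating site: prepare the dataset and send it.
queue = StoreAndForwardQueue()
queue.submit(Case("case-001", [CaseItem("text", "referral note"),
                               CaseItem("image", "retinal photograph")]))

# Remote site, at some later time: retrieve and interpret.
case = queue.next_case()
case.interpretation = "read by remote specialist"
```

Because the sender and the interpreter never need to be at their equipment at the same moment, this pattern avoids the scheduling problem noted for synchronous encounters.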


20.3.1 The Forgotten Telephone 


Until recently, the telephone was a forgotten 
component in telehealth. The field of tele- 
medicine and telehealth focused on video 
Table 20.1 Categories of telehealth and consumer health informatics

Telehealth category      Bandwidth          Applications
Information resources    Low to moderate    Web-based information resources, patient access to electronic medical records
Messaging                Low                E-mail, chat groups, consumer health networks, personal clinical electronic communications (PCEC)
Telephone                Low                Scheduling, triage
Remote monitoring        Low to moderate    Remote monitoring of pacemakers, diabetes, asthma, hypertension, congestive heart failure (CHF)
Remote interpretation    Moderate           PACS, remote interpretation of radiographic studies and other images, such as dermatologic and retinal photographs
Videoconferencing        Low to high        Wide range of applications, from telehome care to telementoring and telepsychiatry
Telepresence             High               Remote surgery, telerobotics


and largely ignored audio-only telehealth.
This is paradoxical given that up to 25% of 
all primary care encounters occur via the 
telephone. These include triage, case manage- 
ment, results review, consultation, medication 
adjustment and logistical issues, like scheduling.
In part, this neglect can be traced to the fact that
telephone consultations are not reimbursed 
by most insurance carriers. 

More recently, increased interest in cost 
control through case management has driven 
renewed interest in use of audio-only com- 
munication between patients and providers. 
Multiple articles have appeared on the value 
of telephone follow-up for chronic conditions 
(Downes et al. 2017; Jayakody et al. 2016). 
Several managed care companies have set up 
large telephone triage centers. The National 
Health Service in the UK invested £123 million
per year in NHS Direct, a nationwide
telephone information and triage system that
handled 27,000 calls per day.


20.3.2 Electronic Messaging 


Electronic text-based messaging has emerged 
as a popular mode of communication between
patients and providers. It began with patients 
sending conventional e-mails to physicians. 
The popularity of this grew so rapidly that 


national guidelines were developed (Kane and 
Sands 1998). However, e-mail has a number of 
disadvantages for health-related messaging: 
delivery is not guaranteed; privacy and secu- 
rity are problematic; e-mail is transient (there 
is no automatic logging or audit trail); and
the messages are completely unstructured. 

To address these limitations, a variety of 
Web-based messaging solutions, called per- 
sonal clinical electronic communications, have 
been developed (Sarkar and Starren 2002). 
Because the messages never leave the Web site, 
many of the problems associated with conventional
e-mail are avoided. Web-based messaging
is a standard feature of patient portals (see
Chap. 11) associated with many EHRs. The
inclusion of messaging as a Meaningful Use 
requirement for Certified EHRs significantly 
increased the use of web-based messaging to 
provide telehealth (Fig. 20.1).


20.3.3 Remote Monitoring 


Remote monitoring is a subset of telehealth 
focusing on the capture of clinically relevant 
data in the patients’ homes or other locations 
outside of conventional hospitals, clinics 
or health care provider offices, and the sub- 
sequent transmission of the data to central 
locations for review. The conceptual model 
[Figure caption fragment: ... ways that electronic communications can be used to link patients with various health resources. Only connections directly involving the patient are shown (e.g., use of the EHR by the clinician is not shown). Patient-generated data are created via passive monitoring (such as home- ...]

underlying nearly all remote monitoring is 
that clinically significant changes in patient 
condition occur between regularly scheduled 
visits and that these changes can be detected 
by measuring physiologic parameters. 

The care model presumes that, if these 
changes are detected and treated sooner, 
the overall condition of the patient will be 
improved. An important distinction between 
remote monitoring and many conventional 
forms of telemedicine is that remote moni- 
toring focuses on management, rather than 
on diagnosis. Typically, remote monitoring 
involves patients who have already been diag- 
nosed with a chronic disease or condition. 
Remote monitoring is used to track parame- 


or other active monitoring devices) in the patient’s home 
or other community settings. Other resources, such as 
remote surgery or imaging, would require the patient to 
go to a telehealth-equipped clinical facility 


ters that guide management. Any measurable 
parameter is a candidate for remote monitor- 
ing. The collected data may include continu- 
ous data streams or, more commonly, discrete 
measurements. 

Another important feature of most remote 
monitoring is that the measurement of the 
parameter and the transmission of the data 
are typically separate events. The measure- 
ment devices have a memory that can store 
multiple measurements. The patient will send 
the data to the caregiver in one of several 
ways. For many studies, the patient will log 
onto a server at the central site (either over the 
Web or by direct dial-up) and then type in the 
data. Alternatively, the patient may connect the
measurement device to a personal computer
or specialized modem and transfer the read- 
ings electronically. 
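
The separation between measuring and transmitting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Reading`, `MeasurementBuffer`, and the `send` callback are hypothetical and do not correspond to any particular device or vendor interface.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, List


@dataclass
class Reading:
    """One stored measurement (hypothetical structure)."""
    taken_at: datetime
    parameter: str  # e.g. "glucose"
    value: float
    unit: str


@dataclass
class MeasurementBuffer:
    """Device memory: readings accumulate here until they are uploaded."""
    readings: List[Reading] = field(default_factory=list)

    def record(self, reading: Reading) -> None:
        # Measurement event: store locally; no network connection is needed.
        self.readings.append(reading)

    def upload(self, send: Callable[[List[Reading]], bool]) -> int:
        # Transmission event: hand every stored reading to `send` as one batch.
        # The buffer is cleared only when `send` reports success, so a failed
        # transfer does not lose data. Returns the number of readings sent.
        if not self.readings:
            return 0
        batch = list(self.readings)
        if send(batch):
            self.readings.clear()
            return len(batch)
        return 0
```

Because the readings are copied from device memory rather than retyped, a transfer of this kind also avoids the manual-entry problems that arise when patients type values into a Web form.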

More recently, a variety of monitoring 
devices have been developed that either con- 
nect directly to mobile telephones or transmit 
the data to the mobile phone using Bluetooth 
wireless. The mobile phone then transmits the 
data to a provider for review. A major advan- 
tage of direct electronic transfer is that it elim- 
inates problems stemming from manual entry, 
including falsification, number preference 
and transcription errors. The role of mobile 
telephones in providing health services has 
grown so rapidly that the term mobile health 
(or “mHealth”) has been coined. The term
first appeared in PubMed in 2004, appearing only once that year.
See Chap. 19 for more on mHealth systems.
Additionally, the emergence of interconnected
sensors and devices, referred to as the
Internet of Things (IoT) and described later in the
chapter, has the potential to contribute to
remote monitoring systems.

Any condition that is evaluated by mea- 
suring a physiologic parameter is a candidate 
for remote monitoring. The parameter most
commonly measured in the remote setting is blood glucose
for monitoring diabetes. A wide variety
of research projects and commercial systems 
have been developed to monitor patients with 
diabetes. Patients with asthma can be moni- 
tored with peak-flow or full-loop spirometers. 
Patients with hypertension can be monitored 
with automated blood pressure cuffs. Patients 
with congestive heart failure (CHF) are moni- 
tored by measuring daily weights to detect 
fluid gain. Remote monitoring of pacemaker 
function has been available for a number of 
years and has recently been approved for 
reimbursement. Home coagulation meters 
have been developed that allow the monitoring
of patients on chronic anticoagulation
therapy. See the discussion of remote intensive
care later in the chapter and also Chap. 21
for more on patient monitoring systems.

Several factors limit the widespread use 
of remote monitoring. First is the question 
of efficacy. While these systems have proven 
acceptable to patients and beneficial in small 
studies, few large-scale controlled trials have 
been done. Second is the basic question of
who will review the data. Research studies
have utilized specially trained nurses at cen- 
tralized offices, but it is not clear that this will 
scale up. Third is money—for most condi- 
tions, remote monitoring is still not a reim- 
bursed activity. 


20.3.4 Remote Interpretation 


Although Samuel was diagnosed with Type 
II diabetes over 20 years ago and realizes that 
visual loss can be a serious complication, he 
has only rarely received dilated eye exams for 
retinopathy screening. There is no eye doc- 
tor conveniently located near his home, and 
he feels that the appointments are always 
too long and that he has no problems such 
as blurred vision. However, his primary care 
doctor has recently implemented a new reti- 
nal screening machine in the office. During a 
routine medical examination, Samuel receives 
a retinal photograph from an office technician 
that is then interpreted by a remote ophthal- 
mologist. Samuel is told that he has high-risk 
diabetic retinopathy that requires treatment to 
prevent visual loss. He is emergently referred 
to an ophthalmologist, who performs a 
successful laser procedure to treat the diabetic 
retinopathy. 

Remote interpretation is a category of 
store-and-forward telehealth that involves 
the capture of images, or other data, at one 
site and their transmission to another site for 
interpretation. This may include radiographs 
(teleradiology), photographs (teledermatology,
teleophthalmology, telepathology), waveforms
such as ECGs (telecardiology), and
text-based medical data. 

The store-and-forward telehealth modali- 
ties have benefited most from the develop- 
ment of the commodity Internet and the 
increasing availability of affordable high 
bandwidth connections that it provides. The 
shared commodity Internet provides relatively 
high bandwidth, but the available bandwidth 
is continuously varying. This makes it much 
better suited for the transfer of text-based 
data and image files, rather than for streaming 
data or video connections. Although image 
files are often tens or hundreds of megabytes
in size, the files are typically transferred to the
interpretation site and cached there for later 
interpretation. From a logistical perspec- 
tive, multiple remote interpretations may be 
batched and performed together, thereby pro- 
viding important workflow and convenience 
advantages over traditional medical examina- 
tions or real-time video telehealth paradigms. 
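
A rough calculation shows why files of this size suit store-and-forward transfer better than real-time use. The sketch below is illustrative only: the function name is hypothetical, a megabyte is taken as 10^6 bytes, and protocol overhead and bandwidth variation are ignored.

```python
def transfer_seconds(size_megabytes: float, link_kbps: float) -> float:
    """Idealized time to move a file over a link of constant bandwidth.

    Assumes 1 megabyte = 1,000,000 bytes and 1 kbps = 1,000 bits per second,
    and ignores protocol overhead, retransmission, and varying bandwidth.
    """
    if link_kbps <= 0:
        raise ValueError("link_kbps must be positive")
    bits = size_megabytes * 8_000_000
    return bits / (link_kbps * 1_000)


if __name__ == "__main__":
    # Hypothetical link speed chosen only for illustration.
    for size in (10, 100):
        minutes = transfer_seconds(size, link_kbps=1_000) / 60
        print(f"{size} MB at 1,000 kbps: about {minutes:.1f} minutes")
```

Under these assumptions a 100-megabyte study takes roughly 13 minutes at 1,000 kbps. That delay is acceptable when the file is cached for later interpretation, but it would be unworkable if the interpreter had to wait for it.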


= Teleradiology 

By far, teleradiology is the largest category 
of remote interpretation, and probably the 
largest category of telehealth. Teleradiology 
(along with telepathology) represents the 
most mature clinical domain in telehealth. 
With the deployment of picture archiving and 
communications systems (PACS) that capture, 
store, transmit, and display digital radiology
images, the line between teleradiology and 
conventional radiology is blurring. In fact, 
routine medical care in radiology and pathol- 
ogy is increasingly being delivered primarily 
through “telehealth” strategies (radiology
image management is discussed in more detail
in Chap. 22).

Many factors have contributed to the more 
rapid adoption of telehealth in domains such 
as radiology and pathology. One important 
factor is the relationship between these spe- 
cialists and their patients. In both domains, 
the professional role is often limited to the 
interpretation of images, and the special- 
ist rarely interacts directly with the patient. 
To patients, there is therefore little perceived 
difference between a radiologist in the next 
building and one in the next state. 

An important factor driving the growth 
of teleradiology is that it is reimbursable by 
insurance payers. Because image interpre- 
tation does not involve direct patient con- 
tact, few payers make any distinction about 
where the interpretation occurred. Rapid dis- 
semination of teleradiology systems has also 
been supported by widespread adoption of 
vendor-neutral image storage and transmission
standards such as Digital Imaging and
Communications in Medicine (DICOM; discussed
in more detail in Chaps. 7 and 22).
Finally, numerous evaluation studies have 
demonstrated that digital image interpretation
through teleradiology has comparable,
or potentially even better, accuracy and
efficiency compared to traditional film-based 
radiological examination (Franken et al. 1992; 
Mackinnon et al. 2008; Reiner et al. 2002). 


= Teleophthalmology 

Another area of remote interpretation that 
is growing rapidly is teleophthalmology, 
particularly for retinal disease screening. As 
one example, diabetic retinopathy (retinal 
disease) is a leading cause of blindness that 
can be treated if detected early. However, it 
has been found that nearly 50% of diabetics 
are non-compliant with guidelines recom- 
mending annual screening eye examinations 
(Brechner et al. 1993). Systems have been 
developed that allow nurses or technicians 
in primary care offices to obtain high quality 
digital retinal photographs. These images are 
sent to regional centers for interpretation. If 
diabetic retinopathy is identified or suspected, 
the patient is referred for full ophthalmologic 
examination. 

Large-scale operational systems have 
been implemented by the Veterans Health 
Administration and by other institutions, par- 
ticularly in areas with limited accessibility to 
eye care specialists (Cavalleranno et al. 2005; 
Cuadros and Bresnick 2009). In fact, remote 
interpretation of retinal images by certified 
reading centers, when taken after dilation of 
the eyes using standard photographic proto- 
cols originally developed for clinical research 
trials, has been demonstrated to classify dia- 
betic retinopathy more accurately than tradi- 
tional dilated eye examination. This is likely 
because retinal abnormalities found on pho- 
tographs may be reviewed in more detail than 
what is generally feasible during traditional 
eye examinations. 

Another application of teleophthalmology 
is in retinopathy of prematurity (ROP), a lead- 
ing cause of blindness in premature infants, 
whereby hospitalized infants are examined 
regularly to identify treatment-requiring dis- 
ease. However, these examinations are logis- 
tically difficult and time consuming, and the 
number of ophthalmologists willing to per- 
form them has decreased. As a result, systems 
have been developed in which trained nurses 
capture retinal photographs and transmit
them to experts for remote interpretation
(Richter et al. 2009). The proliferation of
smartphones has introduced additional ways
to promote teleophthalmology, using a “fundus
on phone” (FOP) camera to facilitate a
smartphone-based cost-effective retinal imaging
system (Fig. 20.2).

Fig. 20.2 Retinal images of diabetic retinopathy
obtained via a fundus on phone (FOP) smartphone system
compared to a professional Zeiss camera traditionally
used by optometrists. (Source: Rajalakshmi et al.


20.3.5 Video-Based Telehealth


To many people, telehealth is videoconferencing.
Whenever the words “telehealth” or “telemedicine”
are mentioned, most people have a
mental image of a patient talking to a doctor
over some type of synchronous video connection.
Indeed, most early telehealth research
did focus on synchronous video connections.
For many of the early studies, the goal was to
provide access to specialists in remote or rural
areas. Nearly all of the early systems utilized
a hub-and-spoke topology where one hub,
usually an academic medical center, was connected
to many spokes, usually rural clinics.

Many of the early telehealth consults 
involved the patient and the primary care pro- 
vider at one site conferring with a specialist at 
another site. Most of the state-wide telehealth 
networks operated on this model. This was so 
ingrained in the telehealth culture that the
first legislation allowing Medicare reimburse- 
ment of telehealth consults required a “pre- 
senter” at the remote site. 

This requirement for a “presenter” exac- 
erbated the scheduling problem. Because 
synchronous video telehealth often uses spe- 
cialized videoconferencing rooms, the tele- 
visits need to be scheduled at a specific time. 
Getting the patient and both clinicians (expert 
and presenter) at the right places at the right 
time has forced many telehealth programs 
to hire a full-time scheduler. The schedul- 
ing problem, combined with the advent of 
more user-friendly equipment, ultimately led
Medicare to drop the presenter requirement.
Even so, scheduling is often the single biggest 
obstacle to greater use of synchronous video 
consultations. 

A second obstacle has been the availabil- 
ity of relevant clinical information. Because 
of the inability to interface between various 
EHRs, it was not unusual for staff to print out 
results from the EHR at one site and then to 
fax those to the other site prior to a synchro- 
nous video consultation. 

Unlike store-and-forward telehealth, syn- 
chronous video requires a stable data stream. 
Although video connection can use conventional
phone lines (commonly referred to as
plain old telephone service, or POTS) that
provide 64 kilobits-per-second (64 kbps) transmission
speed, diagnostic quality video typically
requires at least 128 kbps and more commonly
384 kbps. In order to guarantee stable data
rates, synchronous video in clinically criti- 
cal situations still relies heavily on dedicated 
circuits, either Integrated Services Digital
Network (ISDN) connections or leased lines. 
Within single organizations, or in consulta- 
tive or educational settings, Internet Protocol 
(IP) based video conferencing has become 
the dominant modality. While POTS-based 
telehealth systems were common in the 1990s
and even the early 2000s, the diffusion of high-speed
Internet has led to a much wider
adoption of IP-based videoconferencing.
The anticipated growth of 5G (fifth-generation
wireless systems), which facilitates far higher
speeds and connections with massive capacity
and low latency for consumer devices, is
expected to accelerate the use of telehealth
for a broad spectrum of applications and target
populations.
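
The bandwidth figures quoted above (at least 128 kbps, and more commonly 384 kbps, for diagnostic quality video) can be turned into a simple link check. The sketch below is a hypothetical illustration, not a published procedure. Because synchronous video needs a stable data stream, it judges a link by the lowest throughput observed rather than the average; the function name and category labels are assumptions.

```python
from typing import Sequence

# Thresholds taken from the text above, in kilobits per second.
MIN_DIAGNOSTIC_KBPS = 128
TYPICAL_DIAGNOSTIC_KBPS = 384


def classify_link(throughput_samples_kbps: Sequence[float]) -> str:
    """Classify a link by its worst observed throughput.

    A shared connection may average well above the threshold yet dip
    below it, so the minimum sample (not the mean) is compared.
    """
    if not throughput_samples_kbps:
        raise ValueError("at least one throughput sample is required")
    floor = min(throughput_samples_kbps)
    if floor >= TYPICAL_DIAGNOSTIC_KBPS:
        return "typical diagnostic quality"
    if floor >= MIN_DIAGNOSTIC_KBPS:
        return "minimum diagnostic quality"
    return "below diagnostic minimum"
```

Under this rule, a connection that averages several megabits per second but occasionally falls to 90 kbps is classified as below the diagnostic minimum. This is one way to express why dedicated circuits remain attractive in clinically critical situations.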

Synchronous video telehealth has been 
used in almost every conceivable situation. In 
addition to traditional consultations, the sys- 
tems have been used to transmit grand rounds 
and other educational presentations. Video 
cameras have been placed in operating rooms 
at hub sites to transmit images of surger- 
ies for educational purposes. Video cameras 
have been placed in emergency departments 
and operating rooms at spoke sites to allow 
experts to “telementor” less experienced physicians
in the remote location. Video cameras
have also been placed in ambulances to provide
remote triage.

More recently, the growing popular- 
ity of mobile devices is creating potential 
for new strategies involving real-time video 
communication between patients and health 
care providers. This is especially promising 
because mobile networks are low-cost and 
widely available for consumers, and because
they are increasingly accessible even in devel- 
oping countries. However, health informa- 
tion exchange using mobile networks raises 
concerns about privacy, security, and compliance
with the Health Insurance Portability and
Accountability Act (HIPAA). With appropriate
encryption settings, wireless video communication
using mobile device applications
may be HIPAA-compliant (e.g. FaceTime;
Apple Computer, Cupertino, CA). There are
already various commercially available solu- 
tions that allow patients to download smart- 
phone apps to access clinicians. Some of these 
apps use chatbot technology to screen symp- 
toms before matching patients with clinicians 
who can communicate with text, images and 
videos and can e-prescribe to local pharma- 
cies. In the future, these mobile technologies 
may provide additional opportunities for 
increased communication between patients 
and providers. 

Prior to the adoption of IP-based videoconferencing,
programs that began with grant
funding ended soon after the grant funding 
ended. Even after the advent of IP-based con- 
ferencing, many programs continued to strug- 
gle. This was in spite of the fact that Medicare 
had begun reimbursing for synchronous video 
under limited circumstances. The COVID-19 
pandemic introduced short-term policy changes 
and led to an accelerated growth of telehealth 
as we discuss later (see Sect. 20.3.8).

Some rural health care providers, such 
as the Marshfield Clinic in Wisconsin, have 
integrated synchronous telehealth into their 
standard care model to provide routine specialist
services to outlying locations. Some
categories of synchronous video telehealth
have developed sustainable models: telepsychiatry,
correctional telehealth, home telehealth,
emergency telehealth, and remote
intensive care.


= Telepsychiatry

In many ways, psychiatry is the ideal clinical 
domain for synchronous video consultation. 
Diagnosis is based primarily on observing and 
talking to the patient. The interactive nature 
of the dialog means that store-and-forward 
video is rarely adequate. Physical examina- 
tion is relatively unimportant, so that the lack 
of physical contact is not limiting. There are 
very few diagnostic studies or procedures, so 
that interfacing to other clinical systems is 
less important. In addition, state offices of 
mental health deliver a significant fraction of 
psychiatric services, minimizing reimburse- 
ment issues. This is illustrated by two projects. 
In 1995, the South Carolina Department of 
Mental Health established a telepsychiatry 
network to allow a single clinician to provide 
psychiatric services to deaf patients through- 
out the state (Afrin and Critchfield 1997). The 
system allowed clinicians, who had previously 
driven all over the state, to spend more time in 
patient care and less time traveling. 

The system was so successful that it was 
expanded to multiple providers and roughly 
20 sites. The second example comes from 
the New York State Psychiatric Institute 
(NYSPI), which is responsible for providing 
expert consultation to mental health facili- 
ties and prisons throughout the state. As in 
South Carolina, travel time was a significant 
factor in providing this service. To address the 
problem, the NYSPI created a videoconfer- 
ence network among the various state mental 
health centers. The system allows specialists at 
NYSPI in New York City to provide consulta- 
tions in a timelier manner, improving care and 
increasing satisfaction at the remote sites. 


= Correctional Telehealth 

Prisons tend to be located far from major 
metropolitan centers. Consequently, they are 
also located far from the specialists in major 
medical centers. Transporting prisoners to 
medical centers is an expensive proposition, 
typically requiring two officers and a vehicle. 
Depending on the prisoner and the distance, 
costs for a single transfer range from hun- 
dreds to thousands of dollars. Because of the 
high cost of transportation, correctional telehealth
was economically viable even before
the advent of newer low-cost systems.

Correctional telehealth also improves 
patient satisfaction. A fact surprising to 
many is that inmates typically do not want to 
leave a correctional institution to seek medi- 
cal care. Many perceive it as stigmatizing to 
navigate a medical facility in prison garb. In 
addition, the social structure of prisons is 
such that any prisoner who leaves for more 
than a day risks losing privileges and social 
standing. Correctional telehealth follows the 
conventional model of providing specialist 
consultation to supplement on-site primary
care physicians. This has become increasingly 
important with the rising prevalence of AIDS 
in the prison population. 


= Home Telehealth

After Samuel misses two scheduled visits, the 
Diabetes Educator calls to see what the matter is.
Samuel explains that it is a 1-h drive from his 
home to the diabetes center, that his daugh- 
ter had trouble taking time off from work to 
drive him, and that he would have difficulty 
leaving his wife home alone because she has 
been ill recently. The Diabetes Educator 
notes that Samuel lives in a rural area and 
is eligible to receive educational services via 
telehealth. She signs Samuel up to receive a 
Home Telehealth Unit and schedules deliv- 
ery. The unit is initially difficult for him to 
use because he is not familiar with computer 
systems. However, after this initial learning 
process, Samuel rarely misses a video educa- 
tion session. At one visit, Samuel complains 
that his daughter, who lives farther away, is
always “on his case” about his injections. The 
nurse schedules the next video visit during an 
evening when Samuel’s daughter can join the 
video-call. She also schedules Samuel to have 
a video visit with the dietician. 

Somewhat paradoxically, one of the most 
active areas of telehealth growth is at the 
lowest end of the bandwidth spectrum—tele- 
health activities into patients’ homes. In the 
late 1990s, many believed that home broad- 
band access would soon become ubiquitous 
and a number of vendors abandoned POTS-based
systems in favor of IP-based video
solutions. The broadband revolution was
slower than expected, especially in rural and 
economically depressed areas most in need 
of home telehealth services. A few research 
projects paid to have broadband or ISDN 
installed in patients’ homes. In response to 
this, the American Telemedicine Association 
released new guidelines for Home Telehealth 
in 2002 in which synchronous video was provided
over POTS connections. However, more
recently, high-speed Internet and wireless networks
have significantly expanded coverage in
the US and abroad, leading to a growth of
high-speed Internet-based video delivery products.
In addition to video, home telehealth systems
typically have data ports for connection of
various peripheral devices, such as a digital
stethoscope, glucose meter, blood pressure
meter, or spirometer, or allow for Bluetooth
connection.

Home telehealth can be divided into two 
major categories. The first category, often 
called telehome care, is the telehealth equivalent
of home nursing care. It involves frequent
video visits between nurses and (often
homebound) patients. With the advent of
prospective payment for home nursing care, 
telehome care is viewed as a way for home 
care agencies to provide care at reduced costs 
and potentially lead to a reduction of rehos- 
pitalization for home care patients with com- 
plex care needs. As with home nursing care, 
telehome care tends to have a finite duration, 
often focused on recovery from a specific dis- 
ease or incident. Several studies have shown 
that telehome care can be especially valu- 
able in the management of patients recently 
discharged from the hospital and can signifi- 
cantly reduce readmission rates. 

The second category of home telehealth 
centers on the management of chronic dis- 
eases. Compared with telehome care, this type 
of home telehealth frequently involves a lon- 
ger duration of care and less frequent inter- 
actions. Video interactions tend to focus on 
patient education, more than on evaluation 
of acute conditions. An important distinction 
between telehome care and disease manage- 
ment telehealth is that interactions in the for- 
mer are initiated and managed by the nurse. 
Measurements, such as blood pressure, are
typically collected during the video visit and
uploaded as part of the video connection. For 
disease management, the system also needs 
to support remote monitoring, patient-initi- 
ated data uploads and, possibly, Web-based 
access to educational or disease management 
resources. 

One of the earliest, and the first large-scale 
project to examine the value of telemedi- 
cine systematically in the home setting, was 
the Informatics for Diabetes Education and 
Telemedicine (IDEATel) project (Starren et al. 
2002). Started in 2000, the IDEATel project 
was an 8-year, $60 million demonstration 
project funded by the Centers for Medicare
& Medicaid Services (CMS) involving 1665
diabetic Medicare patients in urban and rural 
New York State. In this randomized clini- 
cal trial, half of the patients received Home 
Telemedicine Units (HTU), and half con- 
tinued to receive standard care. In addition 
to video, the HTU allowed patients to inter- 
act in multiple ways with their online charts. 
When patients measured blood pressure or 
fingerstick glucose, the encrypted results 
were transmitted to the Columbia University 
Web-based Clinical Information System 
(WebCIS; Hripcsak et al. 1999) at New York 
Presbyterian Hospital (NYPH). Nurse case 
managers monitored patients by reviewing 
the generated data and potential alerts, and 
providing consultation to patients. 

In 2012 Steventon et al. (2012) published
findings from one of the largest home telehealth
randomized clinical trials to date. The
trial was conducted in the UK and involved
179 general practices and 3230 people with
diabetes, chronic obstructive pulmonary
disease or heart failure who were randomly
assigned to either usual care or a telehealth
group. The telehealth group also received a set-top box
connected to their television for symptom
questions and educational messages, and various
peripheral devices, such as pulse oximeters,
glucometers and digital weight scales, for
capturing and transmitting vital signs. The
study demonstrated that home telehealth was
associated with lower mortality and emergency
department (ED) admission rates. That
same year findings from another clinical trial 
(Takahashi et al. 2012) in the US revealed
different trends. In this study, 205 participants
were randomly assigned to a telemonitoring 
group (including video, peripheral devices for 
vital signs and symptom reporting) or to usual 
care. No significant differences in hospitaliza- 
tions and ED visits were found between the 
two groups; mortality, however, was higher in
the telemonitoring group. Notably, this study focused
on frail older adults (with an average age over 
80 years) and followed a different design and 
analytic approach. 

The advancement of sensor technologies 
has led to the concept of “smart homes”, 
namely residential settings with embedded 
passive monitoring technologies to facilitate 
monitoring of residents with the goal to maxi- 
mize their well-being and safety (Demiris and 
Hensel 2008). Passive monitoring tools use
sensors to support functional, safety, or physiological
monitoring; to provide cognitive support or
sensory aids; and to monitor security or address
social isolation. Examples include bed sensors
that detect restlessness at night or sleep
interruptions; motion sensors that capture
overall activity levels in the home, sedentary
behaviors, or bathroom visits; door sensors
that measure time spent inside or number of
visitors; and gait sensors that assess gait characteristics
and changes over time as well as fall risk
(Liu et al. 2016; Reeder et al. 2013). Insight
into behavioral health and activity levels along 
with more traditional home telehealth data 
sets such as vital signs and symptom reporting 
provides a more comprehensive assessment of
one’s well-being, including not only the physiological
but also the physical, social, mental,
and cognitive aspects of wellness (Dawadi
et al. 2016), introducing a new era for home
telehealth. The rise of the Internet of Things 
(IoT), namely the diffusion of networks of 
devices, appliances and sensors that are inter- 
connected and enable different passive moni- 
toring components to exchange data and be 
remotely controlled (Lhotska et al. 2018), 
allows not only for monitoring of behavioral 
data but also for providing tailored responses 
to these observations (for example, adjusting 
lighting if unstable gait is detected or allowing 
a clinician to remotely adjust environmental 
parameters). These concepts are still emerging,
and their technical, clinical, and ethical implications
have not been fully examined, but it
is certain that the future of home telehealth 
will encompass new data sets and tools, and 
expanded roles and responsibilities for clini- 
cians, patients and families. 
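
As a minimal sketch of how passive sensor data of this kind might be summarized, the Python below counts sensor firings per day and compares one day against the others. All of it is hypothetical: the event format, the sensor labels, and the choice of a simple mean as the baseline are assumptions made for illustration, not a description of any deployed smart home system.

```python
from collections import Counter, defaultdict
from datetime import date, datetime
from typing import Dict, Iterable, Tuple

# (time the sensor fired, hypothetical sensor label such as "bathroom_motion")
Event = Tuple[datetime, str]


def daily_counts(events: Iterable[Event]) -> Dict[date, Counter]:
    """Count sensor firings per calendar day, per sensor label."""
    summary: Dict[date, Counter] = defaultdict(Counter)
    for fired_at, sensor in events:
        summary[fired_at.date()][sensor] += 1
    return dict(summary)


def deviation_from_mean(counts: Dict[date, Counter], sensor: str, day: date) -> float:
    """Difference between one day's count and the mean count on all other days."""
    others = [c[sensor] for d, c in counts.items() if d != day]
    if not others:
        raise ValueError("need at least one other day to form a baseline")
    today = counts.get(day, Counter())[sensor]
    return today - sum(others) / len(others)
```

A positive deviation for a bathroom motion sensor, for example, would indicate more visits than on the comparison days. Whether such a change is clinically meaningful is a separate question that a sketch like this does not address.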


= Emergency Telemedicine 

Samuel develops slurred speech and weak- 
ness on the right side of his body. His daugh- 
ter, who happens to be with him at the time, 
calls 911. The ambulance crew notifies the 
emergency room that they are en route with
a possible stroke victim. On arrival, the rural 
emergency department (ED) physician does a 
quick evaluation and connects via telemedi- 
cine with a stroke neurologist at an academic 
health center. The neurologist talks with
Samuel and his daughter, and participates 
in the examination with the ED physician. 
Following laboratory work and a CT nega- 
tive for hemorrhage, the ED physician again 
consults with the neurologist who confirms 
the diagnosis of ischemic stroke and institutes 
thrombolytic therapy via pre-arranged pro- 
tocol. Samuel is transferred to the intensive 
care unit for close monitoring of his diabetes, 
hypertension, and evolving stroke. 

“Just in time” consultation in the emer- 
gency setting potentially represents one of the 
most beneficial uses of telehealth. Emergency 
telemedicine has been used in a variety of 
ways and has demonstrated significant benefits,
including in such areas as tele-trauma
care, burn care, and critical care pediatric specialists
consulting on critically ill or injured
children (Heath et al. 2009; Ricci et al. 2003; 
Saffle et al. 2009). Telehealth in the emergency 
setting is likely to have the greatest benefit 
when time-limited critical decision making 
by a specialist physician regarding a specific 
intervention is necessary. 

An important and increasingly common
application demonstrating this is the
evaluation and treatment of the stroke patient.
Best practice management of ischemic stroke 
in appropriate patients now includes the use 
of thrombolytic therapy such as tissue plas- 
minogen activator (tPA), which has been 
shown to have statistically significant clinical 
and financial benefits. Recommendations and 
drug labeling limit the use of intravenous tPA
to within 3 h of when the patient was last seen
well or had a witnessed onset of symptoms.

This therapy, however, has significant 
complications, particularly in patients with 
hemorrhagic rather than ischemic events — 
requiring urgent specialty consultation, 
along with rapid expert interpretation of 
imaging and laboratory work. Many settings 
lack the specialty expertise to have on-site 
“stroke teams” to accomplish best practice. 
Telemedicine can bring specialty expertise to 
a remote location for emergency evaluation of 
the patient directly, while transmitting images and
laboratory work for immediate interpretation. 

This model of care, first called “telestroke” 
care by Levine and Gorman, has been increas- 
ingly used throughout the country (Levine 
1999). The efficacy of this model, compared 
to traditional telephone consultation, was 
evaluated by Meyer et al. (2008). These inves- 
tigators found that telestroke care resulted in 
more accurate decision making than did tele- 
phone consultation. Based on a comprehen- 
sive review of evidence, the American Heart 
Association and American Stroke Association 
concluded that “evidence supporting the 
equivalence of telestroke to in-person care 
is accumulating” (Wechsler et al. 2017). In
their report, they review models of telestroke 
and provide suggestions for standardizing 
and adopting quality measures (highlighting 
among others the responsibility for collecting 
quality data as a core component of the agree- 
ment between telestroke sites and a coordinat- 
ing stroke center or distributed partner), and 
recommendations for licensing, credentialing, 
training and documentation. 


Remote Intensive Care 

Samuel was admitted to the intensive care 
unit (ICU) in his local hospital with the diag- 
nosis of stroke, diabetes and hypertension. He 
is being treated with thrombolytic therapy. 
During the night, Samuel’s blood pressure 
begins to rise significantly above the recom- 
mended level for patients under treatment 
with thrombolytic therapy. This is quickly 
recognized by a remote tele-ICU team that 
provides coverage for all of the ICU beds in 
Samuel’s rural hospital. This remote intensive 
care team has complete access to Samuel’s 
electronic health record and bedside monitors 
and they also have video and audio connec- 
tivity into the room. The remote critical care 
team is able to quickly connect to Samuel’s 
room and do a neurological exam with the 
assistance of the on-site nursing team. They 
determine that the exam is unchanged from 
the emergency room. They are able to order 
appropriate medications, recommend more 
frequent neurological checks, and directly fol- 
low his blood pressure response. 

Consultation models in the in-patient setting 
using telemedicine have been reported in a variety 
of specialties, including intensive care, where 
timely consults are often essential 
(Assimacopoulos et al. 2008; Marcin et al. 
2004). Although these consultation models in 
critical care have shown benefit, a comprehensive 
multi-modality model has become more 
common. This is often referred to as tele-ICU, 
and is defined as care provided to critically ill 
patients with at least some of the managing 
physicians and nurses in a remote location. 

Some of the initial work in this area, done 
by Rosenfeld and Breslow in the Sentara 
Health System, demonstrated improved mor- 
tality, reduced lengths of stay and decreased 
costs (Rosenfeld et al. 2000). Remote intensive 
care has grown significantly over time with an 
estimated 10% of all ICU beds in the U.S. 
covered under this model of care, in large part 
due to a shortage of critical care physicians. 
Typically, a single “Command Center” can 
cover multiple intensive care units over a large 
geographic region creating significant efficien- 
cies and economies of scale. 

This model of care integrates several of 
the technologies discussed in this book and 
is primarily enabled using electronic health 
records, evidence-based decision support 
tools, connections to bedside monitoring sys- 
tems and audio/video based telemedicine into 
patient rooms. Most commonly, critical care 
health professionals co-manage care from a 
Command Center led by board-certified critical 
care physicians. Protocols and treatment 
reviews for patient management are incorporated 
into the care process using data from 
the monitoring and alert systems that indicate 
when changes in care should take place. The 
goal is to assure adherence to best practice, 
achieve shorter response times to alarms and 
abnormal laboratory values, and initiate 
life-saving interventions more rapidly (Lilly 
et al. 2011). 
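
The alert-driven workflow described above can be sketched as a simple threshold rule. The following Python sketch is illustrative only: the `Reading` structure, the `bp_alerts` function, and the default limits of 180/105 mmHg are assumptions made for this example, not values taken from this chapter or from any particular tele-ICU product. In practice, limits would come from local clinical protocols.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reading:
    """One bedside blood pressure sample (hypothetical structure)."""
    minute: int      # minutes since monitoring began
    systolic: int    # mmHg
    diastolic: int   # mmHg


def bp_alerts(readings: List[Reading], max_sys: int = 180, max_dia: int = 105,
              sustained: int = 2) -> List[int]:
    """Return the minute marks at which an alert would fire.

    An alert fires once when `sustained` consecutive readings exceed either
    limit, so a single noisy sample does not page the remote team.
    The default limits are placeholders, not clinical guidance.
    """
    alerts: List[int] = []
    run = 0  # length of the current streak of out-of-range readings
    for r in readings:
        if r.systolic > max_sys or r.diastolic > max_dia:
            run += 1
            if run == sustained:
                alerts.append(r.minute)
        else:
            run = 0
    return alerts


samples = [Reading(0, 165, 92), Reading(15, 184, 98),
           Reading(30, 190, 108), Reading(45, 172, 95)]
print(bp_alerts(samples))  # [30]
```

Requiring a sustained excursion is one possible design choice for reducing false alarms; a real system would likely also weigh trends, artifacts, and patient-specific targets.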

Published studies have shown mixed results 
in terms of the benefits of tele-ICU. Lilly 
et al. reported that in a single academic medi- 
cal center, implementation of tele-ICU was 
associated with reduced mortality and length of stay (LOS), 
as well as lower rates of preventable complica- 
tions (Lilly et al. 2011). A recent study com- 
pared inter-hospital transfer rates in hospitals 
with a tele-ICU with transfer rates of facilities 
with no telemedicine program in the Veterans 
Health Administration system (examining 52 
ICUs in 23 acute care facilities) and found 
that ICU telemedicine was associated with 
a decrease in inter-hospital ICU transfers 
(Fortis et al. 2018). Another successful dem- 
onstration of the concept of the tele-ICU 
is the eICU at Emory University. In 2012, 
Emory launched an innovative plan to develop 
a collaborative network supporting intensive 
care units remotely throughout Georgia and 
more recently even partnered with clinicians 
in Australia to ensure 24/7 monitoring by 
experts. In the eICU, experts can monitor the 
patient and speak directly to a care provider 
at the patient’s bedside in Atlanta, while also 
talking with the patient and family caregivers. 
Specialized cameras, video monitors, 
microphones and speakers installed in ICU rooms 
at four of Emory’s hospitals and one non-Emory 
hospital connect providers throughout the 
state of Georgia and, more recently, care 
teams in Australia. The eICU was found to 
reduce length of patient stay, readmissions, 
and costs while addressing the shortage of 
intensivists (Buchman et al. 2017). In a systematic 
review and meta-analysis of studies examining 
outcomes of tele-ICUs, the authors concluded 
that the tele-ICU may reduce ICU 
and hospital mortality and shorten ICU 
length of stay but has no significant effect on 
hospital length of stay (Chen et al. 2018). This 
analysis also highlighted that further examination 
of the cost-effectiveness of a tele-ICU 
is needed. 


20.3.6 Telepresence 


Telepresence involves systems that allow clini- 
cians to not only view remote situations, but 
also to act on them. The archetypal telepres- 
ence application is telesurgery. The most basic 
surgical telepresence systems simply permit 
two-way audio-video communications, by 
which remote surgeons can observe, teach, 
and collaborate with local surgeons while they 
operate on patients. 

More advanced surgical telepresence 
systems allow procedures to actually be 
performed remotely. Although largely still 
experimental, a trans-Atlantic gall bladder 
operation was demonstrated in 2001 (Kent 
2001). The military has funded considerable 
research in this area in the hope that surgical 
capabilities could be extended to the battlefield. 
Telepresence requires high-bandwidth, 
low-latency connections. Optimal telesurgery 
requires not only teleoperation of robotic 
surgical instruments, but also accurate force 
feedback (or haptic feedback) that requires 
extremely low network latencies. Accurate 
millisecond force feedback has been histori- 
cally limited to distances under 100 miles. The 
endoscopic gall bladder surgery mentioned 
above is an exception to this general principle 
because that specific procedure relied almost 
exclusively on visual information. It used a 
dedicated and custom configured 10 Mb/s 
fiberoptic network with a 155 ms latency. 
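
The relationship between distance and latency can be illustrated with a short calculation. This is a rough sketch under stated assumptions: signals in optical fiber are taken to travel at about 200,000 km/s (roughly two-thirds of the speed of light in a vacuum), the 6,000 km path length is a hypothetical stand-in for a trans-Atlantic route, and the function name `round_trip_ms` is invented for the example. Only propagation delay is modeled.

```python
FIBER_KM_PER_S = 200_000.0  # assumed signal speed in fiber (~2/3 of c)
KM_PER_MILE = 1.609344


def round_trip_ms(distance_km: float,
                  speed_km_per_s: float = FIBER_KM_PER_S) -> float:
    """Round-trip propagation delay in milliseconds.

    Ignores every other source of delay (switching, video encoding,
    robot control loops), so it is a lower bound, not a prediction.
    """
    return 2.0 * distance_km / speed_km_per_s * 1000.0


# 100 miles, the historical limit mentioned for force feedback
print(round(round_trip_ms(100 * KM_PER_MILE), 2))  # 1.61
# hypothetical 6,000 km trans-Atlantic fiber path
print(round(round_trip_ms(6000), 1))               # 60.0
```

Under these assumptions, propagation alone accounts for only part of a latency such as the 155 ms cited above; the remainder would have to come from factors this sketch does not model.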

Providing tactile feedback over large dis- 
tances actually requires providing the surgeon 
with simulated feedback while awaiting trans- 
mission of the actual feedback data. Such 
simulation requires massive computing power 
and is an area of active research. Telesurgery 
also requires extremely high-reliability connections. 
Loss of a connection is an annoyance 
during a consultation; it can be fatal during a 
surgical procedure. 

Robotic surgery systems have been com- 
mercially available since the early 2000s. 
In these systems, surgical instruments and 
a camera are introduced into the patient 
through small incisions. The surgeon controls 
these instruments remotely, while he or she is 
viewing a magnified three-dimensional camera 
image of the patient’s anatomical structures. 
These systems are currently being used 
in some medical centers for small-incision sur- 
gery, typically performed by surgeons seated 
adjacent to their patients. The increasing 
availability and use of these robotic surgery 
systems creates possibilities for an increasing 
number of telesurgery applications. 

To date, robotically-assisted surgery has 
been most common in fields such as cardio- 
thoracic surgery, gynecology, and urology. 
Potential advantages of remote robotically- 
assisted surgery may include smaller incisions, 
improved anatomic visualization, and finer 
control of surgical instrumentation. Several 
clinical studies comparing robotically-assisted 
surgery with traditional surgery have sug- 
gested that the outcomes are similar (Ficarra 
et al. 2009). However, additional research 
is required to determine the optimal role of 
robot-assisted surgery and its applications to 
telesurgery. 

A novel form of telepresence gives cli- 
nicians the ability not only to see, but also 
to walk around. Since the early 2000s, a 
commercially-available system has combined 
conventional video telehealth with a remotely 
controlled robot (Fig. 20.3). 

Fig. 20.3 Telehealth robot. This is controlled by a 
remote clinician, and includes videoconferencing and 
remote monitoring capabilities. In this example, a specialist 
is connecting with a nurse during a patient transfer. 
Image courtesy of InTouch Health, reproduced with 
permission 

It allows clinicians literally to make remote video rounds. 
A frequent problem with telehealth systems is 
having the equipment where it is needed. With 
this system, the telehealth equipment is able 
to take itself to wherever it is needed. Remote 
monitoring may also be performed by inter- 
facing digital devices such as stethoscopes 
or imaging systems to the remote-controlled 
robot. These remote-controlled systems are 
most often used by physicians and nurses to 
examine patients in nursing homes or other 
long-term facilities, to improve health care 
access in rural areas, and to perform postoperative 
examinations. Croghan et al. (2018) 
designed and tested a remotely controlled 
mobile audiovisual drone to access inpa- 
tients in surgical wards based on a lightweight 
device that is freely mobile and “emulates 
human interaction by swiveling and adjusting 
height to patients’ eye-level.” As technologi- 
cal advancements in robotics introduce new 
innovative models of telepresence, identifica- 
tion of relevant outcome measures and rig- 
orous evaluation studies are needed to assess 
both the effectiveness and unintended conse- 
quences of such solutions. 


20.3.7 Delivering Specialty 
Knowledge to a Network 
of Clinical Peers 


Telemedicine is used not only to provide direct 
services to patients but also to facilitate con- 
tinuing education and peer support for cli- 
nicians with the goal to ultimately improve 
health care outcomes. Project ECHO 
(Extension for Community Healthcare 
Outcomes) was originally established by the 
University of New Mexico as a partnership 
of academic medicine, public health offices 
and community clinics to use videoconferenc- 
ing in order to promote knowledge networks 
and connect clinicians in rural areas with spe- 
cialists in order to study cases of patients with 
unique needs (Arora et al. 2007). The first program 
focused on hepatitis C, but the model has since been 
adopted in numerous settings in the United 
States and worldwide for various chronic 
conditions and populations including, among 
others, care of children with autism (Mazurek 
et al. 2017), geriatric mental health (Fisher 
et al. 2017) and diabetes (Swigert et al. 2014). 
The wide adoption of ECHO aims to utilize 
telehealth to strengthen workforce capacity in 
underserved areas and address health dispari- 
ties. As of 2018 Project ECHO operated more 
than 220 hubs for more than 100 conditions 
or diseases in 31 countries. In a recent study, 
a program informed by the original ECHO 
model called SCAN-ECHO (Specialty Care 
Access Network-Extension for Community 
Healthcare Outcomes) was introduced by the 
Veterans Health Administration (VHA) to 
improve care for patients with liver disease 
in rural and underserved areas where access to 
specialized care can be challenging. The study 
collected 5 years of clinical data from 62,237 
veterans with liver disease in the region. Only 
513 of these veterans had a primary care 
physician who participated in SCAN-ECHO, 
where physicians could discuss their patient cases 
with specialists, but these veterans had a 54% higher 
survival rate compared to the rest of the cohort, 
even when adjusted for other variables (Su 
et al. 2018). 


20.3.8 The Emergence of Telehealth 
during a Global Pandemic 


As telehealth bridges geographic distance, 
it enables continuity in delivery of services 
even at times when populations may not have 
access to travel or may be restricted by physical 
distancing and quarantine. The COVID-19 
pandemic in 2020 highlighted this potential 
and led to rapidly accelerated growth of telehealth. 
Health systems worldwide quickly 
adopted telehealth solutions. In the US, insurers 
expanded coverage to include all telemedicine 
and telehealth visit types including home 
visits, and licensure requirements were relaxed 
(Centers for Medicare and Medicaid Services 
2020). The US Department of Health and 
Human Services (2020) waived enforcement 
of HIPAA regulations to allow the use of 
video-conferencing for telemedicine visits 
including the use of widely available video-conferencing 
solutions. In one large health 
system (NYU Langone Health), which was 
at the COVID-19 outbreak epicenter at that 
time, telemedicine visits increased by 135% 
in urgent care and by 4345% in non-urgent 
care between March 2 and April 14, 
2020 (Mann et al. 2020). In addition to using 
telehealth for the delivery of traditional health 
care services, this platform also played a role 
in decision making and self-triage in the context 
of the pandemic. One such example is a 
telehealth patient portal for self-triage and 
scheduling that was created at the University 
of California San Francisco to enable asymp- 
tomatic patients to report exposure history 
and for symptomatic patients to be triaged 
and paired with appropriate levels of care 
(Judson et al. 2020). The rapid expansion of 
telehealth during this pandemic highlights 
the significance of investing in infrastructure 
and training to better prepare health systems 
for public health emergencies. 


20.4 Challenges and Future 
Directions 


As telehealth evolves from research novelty 
to being a standard way that health care is 
delivered, many challenges must be overcome. 
Some of these challenges arise because the one 
patient, one doctor model no longer applies. 
Basic questions of identity and trust become 
paramount. At the same time, the shifting 
focus from treating illness to managing health 
and wellness requires that clinicians know not 
only the history of the individuals they treat 
but also information about the social and environmental 
context within which those individuals 
reside. In the diabetes example, knowledge 
of the family history of risk factors and diseases, 
and of the appropriate diagnostic and interventional 
protocols, aids the clinical staff in providing 
timely and appropriate treatment. 


20.4.1 Challenges to Using 
the Internet for Telehealth 
Applications 


Because of the public, shared nature of the 
Internet, its resources are widely accessible 
by citizens and health care organizations. 
This public nature also presents challenges 
to the security of data transmitted along the 
Internet. The openness of the Internet leaves 
the transmitted data vulnerable to intercep- 
tion and inappropriate access. In spite of sig- 
nificant improvements in the security of Web 
browsing, several areas, including protection 
against viruses, authentication of individuals 
and the security of email, remain problematic. 

Ensuring every citizen access to the Internet 
represents a second important challenge to 
the ability to use it for public health purposes. 
Access to the Internet presently requires com- 
puter equipment that may be out of reach 
for persons with marginal income levels. 
Majority-language literacy and the physical 
capability to type and read present additional 
requirements for effective use of the Internet. 
Preventing inequalities in access to health 
care resources delivered via the Internet will 
require that health care agencies work with 
other social service and educational groups 
to make available the technology necessary to 
capitalize on this electronic environment for 
health care. A 2019 report by the U.S. Federal 
Communications Commission indicated 
that more than 20 million Americans lack 
advanced broadband Internet access (defined 
as download speeds of at least 25 megabits 
per second (Mbps) with upload speeds of at least 
3 Mbps), highlighting that many rural settings depend 
on satellite Internet for access at a higher cost. 
The COVID-19 pandemic intensified the dis- 
parities that emerge from this digital divide 
and strengthened ongoing efforts to dissemi- 
nate higher bandwidth to rural settings. 
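
The broadband definition quoted above reduces to a two-threshold check. The sketch below encodes it in Python; the function name `meets_advanced_broadband` and the sample household speeds are hypothetical and serve only to illustrate the definition.

```python
def meets_advanced_broadband(down_mbps: float, up_mbps: float,
                             min_down: float = 25.0,
                             min_up: float = 3.0) -> bool:
    """True if a connection meets the 25 Mbps down / 3 Mbps up definition."""
    return down_mbps >= min_down and up_mbps >= min_up


# Hypothetical measured speeds (download, upload) in Mbps
households = {"fiber": (300.0, 300.0), "dsl": (10.0, 1.0),
              "satellite": (25.0, 3.0)}
for name, (down, up) in households.items():
    print(name, meets_advanced_broadband(down, up))
```

Note that a connection can meet both thresholds and still be unsuitable for some telehealth uses; the definition says nothing about latency, reliability, or cost.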

As health care becomes increasingly reli- 
ant on Internet-based telecommunications 
technology, the industry faces challenges in 
ensuring the quality and integrity of many 
devices and network pathways. These chal- 
lenges differ from previous medical device 
concerns, because the diversity and reliability 
of household equipment are under the control 
of the household, not the health care provid- 
ers. There is an increased interdependency 
between the providers of health services, those 
who manage telecommunication infrastruc- 
ture and the manufacturers of commercial 
electronics. Ensuring effective use of telehealth 
for home and community-based care requires 
that clinical services be supported by appropriate 
technical resources. The digital divide, 
that is, patients’ varying degrees of access to the 
infrastructure and tools necessary for telehealth, 
must be addressed when designing and implementing 
such systems. This consideration is necessary 
to ensure that telehealth systems, meant 
to bridge geographic distance and increase 
access, do not end up further exacerbating 
inequities and raising additional barriers to 
high quality care. Additionally, consumer 
education is necessary so that patients and 
families fully understand the risks and benefits of 
using telehealth software and hardware integrated 
into the care they receive. Educational 
initiatives need to address not only a wide spectrum of 
consumers’ literacy and health literacy but 
also data literacy, namely consumers’ ability 
to process, extract meaning from, and communicate 
knowledge generated by data. 


20.4.2 Licensure and Economics 
in Telehealth 


Licensure is frequently cited as the single 
biggest problem facing telemedicine involv- 
ing direct patient-provider interactions. This 
is because medical licensure in the United 
States is state-based, while telemedicine fre- 
quently crosses state or national boundaries. 
The debate revolves around the questions 
of whether the patient “travels” through the 
wire to the clinician, or the clinician “travels” 
through the wire to the patient. Several states 
have passed legislation regulating the manner 
in which clinicians may deliver care remotely 
or across state lines. Some states have enacted 
“full licensure models” that require practitio- 
ners to hold a full, unrestricted license in each 
state where a patient resides. Many of these 
laws have been enacted specifically to restrict 
the out-of-state practice of telemedicine. To 
limit Web-based prescribing and other types 
of asynchronous interactions, several states 
have enacted or are considering regulations 
that would require a face-to-face encounter 
before any electronically delivered care 
would be allowed. In contrast, some states 
are adopting regulations to facilitate telehealth 
by exempting out-of-state physicians 
from in-state licensure requirements provided 
that electronic care is provided on an irregu- 
lar or episodic basis. Still other models would 
include states agreeing to either a mutual 
exchange of privileges, or some type of “reg- 
istration” system whereby clinicians from out 
of state would register their intent to practice 
via an electronic medium. 

At the same time, national organizations 
representing a variety of health care profes- 
sions (including nurses, physicians and physi- 
cal therapists) have proposed a variety of 
approaches to these issues. While the exist- 
ing system is built around individual state 
licensure, groups that favor telemedicine have 
proposed various interstate or national licensure 
schemes. The Federation of State 
Medical Boards has proposed that physicians 
holding a full, unrestricted license in any 
state should be able to obtain a limited tele- 
medicine consultation license using a stream- 
lined application process. The American 
Medical Association is fighting to maintain 
the current state-based licensure model while 
encouraging some reciprocity. The American 
Telemedicine Association supports the posi- 
tion that—since patients are “transported” 
via telemedicine to the clinician—the practi- 
tioner need only be licensed in his or her home 
state. The National Council of State Boards 
of Nursing has promoted an Interstate Nurse 
Licensure Compact (NLC) whereby licensed 
nurses in a given state are granted multi-state 
licensure privileges and are authorized to 
practice in any other state that has adopted 
the compact. By 2015, 25 states had adopted 
the NLC; in 2017 the eNLC went into effect as 
an enhanced compact that addressed the chal- 
lenge of uniform criminal background checks. 
As of early 2020, 34 states had enacted the 
eNLC. 

The second factor limiting the growth of 
telehealth is reimbursement. Prior to the mid- 
1990s there was virtually no reimbursement 
for telehealth outside of teleradiology. For 
many years Medicare routinely reimbursed 
for synchronous video only for rural patients. 
In January 2015, the Centers for Medicare 
and Medicaid Services (CMS) created a new 
chronic care management (CCM) code that 
provides for non-face-to-face consultation, 
which introduces options for reimbursement 
for asynchronous remote monitoring. 
Furthermore, starting in 2018 CMS allowed 
providers to get reimbursed separately for 
time spent on the collection and interpreta- 
tion of health-related data that were gener- 
ated remotely. As technologies advance and 
play a pervasive role in our health care sys- 
tem, we anticipate the incremental changes in 
telehealth legislation to accelerate. In 2017, 
210 telehealth-related bills were active across 
30 states, and as new technological capabilities 
are introduced, we anticipate further 
legislative efforts. Over 25 states have laws 
mandating private insurers to reimburse for 
telehealth services (and at least 10 
additional states have pending or proposed 
laws to do so). Numerous insurers provide 
reimbursement for electronic messaging and 
online consultations. The Medicare Telehealth 
Parity Act of 2015 has led to advancements 
in reimbursement for teleradiology and tele- 
dermatology including payments for store- 
and-forward telehealth; however, restrictions 
still apply to types of technologies used, ser- 
vices provided and populations covered. Few 
groups have even considered reimbursement 
for telehealth services that do not involve 
patient-provider interaction. An expert system 
could provide triage services, tailored 
on-line educational material, or customized 
dosage calculations. Such systems are expensive 
to build and maintain, but only services 
provided directly by humans are currently 
reimbursed by insurance. 

Historically, patients have been perceived 
as reluctant to pay directly for telehealth ser- 
vices, especially when face-to-face visits were 
covered by insurance. This trend is changing 
as consumers become more familiar with, and 
embrace, various innovative technologies that 
bridge geographic distance. In a 2017 study 
(Chang et al. 2017) estimating US households’ 
willingness to pay for telehealth, the 
representative household was willing to pay 
$4.39 per month for telehealth. This valuation 
increased for households with higher opportunity 
costs and even more for households living 
more than 20 miles away from their nearest 
medical facility (to $6.22 per month). 

Finally, home telehealth monitoring may 
reduce the health care costs associated with 
unreimbursed hospital readmissions. For 
example, some insurance payers do not reim- 
burse for hospital readmissions that occur 
within 30 days of discharge, and there are 
anecdotal reports of health systems paying for 
32 days of home monitoring post-discharge. 
Determining whether, and how much, to pay 
for telehealth services will likely be a topic 
of debate for years to come. Starting in 2020, 
Accountable Care Organizations (ACOs) with 
Medicare fee-for-service beneficiaries will 
have the option to expand telehealth services 
to include the home as an eligible originat- 
ing site (without being subject to the cur- 
rent Medicare geographic requirements for 
the telehealth originating site). Public health 
developments also affect the legal landscape, as 
the recent COVID-19 pandemic highlighted 
when several of the regulatory and licensing 
barriers were temporarily lifted. The regulatory 
landscape will evolve over time as both 
technology and medical knowledge advance 
and societal needs change. 


20.4.3 Logistical Requirements 
for Implementation 
of Telehealth Systems 


Telehealth systems must be carefully evalu- 
ated before implementation for routine use 
in individual disease situations, to ensure 
that they have sufficient diagnostic accuracy 
and reproducibility for clinical application. 
Appropriate training and credentialing stan- 
dards must be developed for personnel who 
capture clinical data and images from patients 
locally, as well as for physicians and nurses 
who perform remote interpretation and con- 
sultation. Clear rules and responsibilities 
must be developed for remote patient man- 
agement, including the appropriate response 
for situations in which data are felt to be of 
insufficient quality for telehealth. Guidelines 
for medicolegal liability must be established. 
Software that displays clinical information 
required for remote management, and that 
integrates into existing workflow patterns and 
maximizes efficiency through good usability 
principles will be required. More specifically 
in the context of usability, when patients are 
asked to utilize telehealth equipment, several 
factors should be taken into consideration 
such as patients’ previous experience with 
and comfort in using technology, potential 
functional or cognitive limitations, and the 
availability of family members or informal 
caregivers who may be able to assist. Consider 
our example of Samuel who lives alone and 
has some visual limitations. The decision to 
use software and/or hardware for home-based 
monitoring should address Samuel’s living 
arrangements and residential infrastructure, 
as well as his ability and willingness to operate 
the telehealth system (and the potential 
role of his neighbor, who is involved in 
Samuel’s care). Methods for providing added 
value from technology toward telehealth diag- 
nostic systems through strategies such as links 
to consumer health resources or computer- 
based diagnosis may be explored (Koreen 
et al. 2007). Finally, studies have suggested 
that patient satisfaction with telehealth sys- 
tems is high (Lee et al. 2010). However, the 
practitioner-patient relationship is fundamental 
to health care delivery, and mechanisms 
must be developed to ensure that this bond is not 
lost in telehealth. 


20.4.4 Telehealth in Low Resource 
Environments 


In many parts of the developing world, the 
density of both health care providers and of 
technology is quite low. Thus, the demand for 
telehealth is high, but the ability to deliver it 
is challenged. Many of these regions have 
largely skipped traditional land-line telephony 
and moved directly to cellular infrastructure 
(Foster 2010). This, combined with advances 
in low-cost laptop computers that do not 
depend on stable power-grids, has allowed the 
development of a wide variety of telehealth 
and tele-education applications. The majority 
of these are based on an asynchronous model. 
Transport media range from standard broad- 
band in the urban areas, to satellite connec- 
tions, to cellular data, to SMS messaging. The 
largest group of applications focuses on the 
provision of remote consultations for difficult 
cases using computer-based systems, while gen- 
eral health education and remote data collec- 
tion have been the primary applications using 
cellular telephony applications. However, the 
development of smart mobile telephones with 
high-resolution cameras is rapidly blurring this 
distinction. Successful implementation of telehealth 
in various low resource settings is demonstrated 
in a series of projects captured in an 
e-book by Wootton and Bonnardot (2015). 


20.4.5 Future Directions 


Telehealth validation studies across a range 
of clinical domains have demonstrated good 
diagnostic accuracy, reliability, and patient 
satisfaction. Based on these results, numer- 
ous real-world telehealth programs have been 
implemented throughout the world. In the 
long term, successful large-scale expansion 
of these programs will require addressing the 
above challenges. 

Beyond these practical factors, traditional 
medical care uses a workflow model based on 
synchronous interactions between clinicians 
and individual patients. The workflow model 
is also a sequential one in that the clinician 
may deal with multiple clinical problems or 
data trends but only within the context of 
treating a single patient at a time. Medical 
records, both paper and electronic, as well as 
billing and administrative systems all rely on 
this sequential paradigm, in which the funda- 
mental unit is the “visit.” Advances in tele- 
health are disrupting this paradigm. Devices 
have been developed that allow remote elec- 
tronic monitoring of diabetes, hypertension, 
asthma, congestive heart failure (CHF), and 
chronic anticoagulation. As a result, clini- 
cians may become inundated by large vol- 
umes of electronic results. This may mean 
that clinicians will no longer function in an 
assembly-line fashion, but will become more 
like dispatchers or air-traffic controllers, electronically 
monitoring many processes simultaneously. 
Clinicians will no longer ask simply, 
“How is Mrs. X today?” They will also ask the 
computer “Among my 2,000 patients, which 
ones need my attention today?” Neither clinicians 
nor EHRs are prepared for this change. 
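
The question “which ones need my attention today?” can be read as a ranking query over remotely collected results. The following Python sketch shows one possible shape of such a query. Everything in it is assumed for illustration: the `RemoteResult` structure, the `needs_attention` function, the measure names, and especially the acceptable ranges, which are placeholders rather than clinical thresholds.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class RemoteResult:
    """One remotely collected measurement (hypothetical structure)."""
    patient_id: str
    measure: str
    value: float


# Placeholder (low, high) ranges per measure; illustrative only.
RANGES: Dict[str, Tuple[float, float]] = {
    "systolic_bp": (90.0, 160.0),
    "glucose_mg_dl": (70.0, 250.0),
    "spo2": (92.0, 100.0),
}


def needs_attention(results: Iterable[RemoteResult],
                    ranges: Dict[str, Tuple[float, float]] = RANGES,
                    top_n: int = 3) -> List[Tuple[str, int]]:
    """Rank patients by their number of out-of-range results."""
    counts: Dict[str, int] = {}
    for r in results:
        limits = ranges.get(r.measure)
        if limits is None:
            continue  # unknown measure: skipped here, not judged
        low, high = limits
        if not (low <= r.value <= high):
            counts[r.patient_id] = counts.get(r.patient_id, 0) + 1
    # Most out-of-range results first; ties broken by patient id.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


todays_results = [
    RemoteResult("p001", "systolic_bp", 172.0),
    RemoteResult("p001", "glucose_mg_dl", 310.0),
    RemoteResult("p002", "spo2", 97.0),
    RemoteResult("p003", "spo2", 89.0),
    RemoteResult("p002", "systolic_bp", 128.0),
]
print(needs_attention(todays_results))  # [('p001', 2), ('p003', 1)]
```

Counting out-of-range values is a deliberately crude prioritization; a production system would more plausibly weigh severity, trends over time, and patient-specific targets.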
Many service industries such as the travel 
and transportation sectors have more recently 
experienced dramatic changes due to the concept 
of a “sharing economy” that promotes a 
shift from strictly regulated frameworks for 
transactions to decentralized approaches 
where community networks promote identify- 
ing and optimizing resources based on needs 
identified by the community members. This 
paradigm shift has also started to permeate
the health care field, introducing an expanded
perspective of telehealth whereby consumers
use an app to arrange for on-demand home
visits or “virtual visits” enabled by videoconferencing.
One example is the Pager app, which
helps consumers find a doctor who will contact
and visit them within a guaranteed two-hour
window. Similarly, apps such as Mend
or HealApp arrange for on-demand video
consultations or visits. Other examples of
sharing economy apps use crowdsourcing
to support diagnostic processes. These trends
introduce opportunities and challenges as 
we consider the extent to which regulatory 
and quality safeguards may be necessary to 
maximize benefits and reduce unintended 
consequences. Furthermore, the use of wear- 
ables and smart home technologies expands 
the traditional models of telehealth so that in 
the near future comprehensive telehealth sys- 
tems may integrate physiological, behavioral, 
social, cognitive, environmental and genomic 
data sources to deliver “precision medicine” 
in a continuum of care. Artificial intelligence 
and predictive analytics will play a key role in 
this expansion of the telehealth paradigm. 
Perhaps the greatest long-term effect of 
the information and communication revo- 
lution will be the breaking down of role, 
geographic, and social barriers. Medicine is 
already benefiting from this effect. Traditional 
“doctors and nurses” are collaborating with 
public health professionals, and anyone with 
computer access can potentially communicate 
with patients or experts around the world. The
challenge will be to facilitate productive collaborations
among patients, their caregivers,
biomedical scientists, and information tech- 
nology experts to promote patient engage- 
ment and shared decision making. 


Suggested Readings

Bashshur, R. L., Shannon, G. W., Krupinski, 
E. A., Grigsby, J., Kvedar, J. C., Weinstein, 
R. S., et al. (2009). National telemedicine ini- 
tiatives: Essential to healthcare reform. 
Telemedicine and e-Health, 15, 600–610. This
paper discusses cost-benefit tradeoffs associ- 
ated with telemedicine within the context of 
large-scale efforts promoting health care 
reform in the United States. 

Chi, N. C., & Demiris, G. (2015). A systematic 
review of telehealth tools and interventions to 
support family caregivers. Journal of 
Telemedicine and Telecare, 21, 37–44. This
review focuses on telehealth applications tar- 
geting either solely the family caregiver of a 
patient or the dyad (caregiver and patient) 
examining the impact of telehealth on care- 
giver outcomes. Six categories of telehealth 
interventions for caregivers were identified: 
education, consultation (including decision 
support), psychosocial/cognitive behavioral 
therapy, social support, data collection and 
monitoring, and clinical care delivery. Studies 
demonstrate caregiver satisfaction as well as 
reduction of caregiver anxiety and burden. 

Reed, M. E., Parikh, R., Huang, J., Ballard, 
D. W., Barr, I., & Wargon, C. (2018). Real- 
time patient-provider video telemedicine inte- 
grated with clinical care. New England Journal 
of Medicine, 379, 1478-1479. Kaiser 
Permanente Northern California began offer- 
ing telemedicine visits enabling patients to use 
videoconferencing on a mobile phone, com- 
puter or tablet to communicate with their phy- 
sicians. In this study 210,383 video visits 
conducted over three years (involving 2796 
primary care providers and 152,809 patients) 
were examined. The telemedicine visits 
extended established patient–physician relationships
and led to high levels of patient satisfaction
(93% of surveyed patients responded
that the video visit met their needs).

Wechsler, L. R., Demaerschalk, B. M., Schwamm, 
L. H., Adeoye, O. M., Audebert, H. J., Fanale,
C. V., Hess, D. C., Majersik, J. J., Nystrom,
K. V., Reeves, M. J., Rosamond, W. D., 
Switzer, J. A., & American Heart Association 
Stroke Council; Council on Epidemiology and 
Prevention; Council on Quality of Care and 
Outcomes Research. (2017). Telemedicine 
quality and outcomes in stroke: A scientific 
statement for healthcare professionals from 
the American Heart Association/American 
Stroke Association. Stroke, 48, e3-e25. This is 
an updated systematic evidence-based review 
of scientific data examining the use of tele- 
medicine for stroke care delivery. Published 
studies are categorized according to their level 
of certainty and class of evidence. 


Questions for Discussion


1. Telehealth has evolved from systems 
designed primarily to support 
consultations between clinicians to 
systems that provide direct patient 
care. This has required changes in 
hardware, user interfaces, software, 
and processes. Discuss some of the 
changes that must be made when a 
system designed for use by health care 
professionals is modified to be used 
directly by patients. 

2. There are still some challenges regarding 
reimbursement for telemedicine services. 
Imagine that you are negotiating with an 
insurance carrier to obtain reimburse- 
ment for a store-and-forward telemedi- 
cine service that you have developed. 
The medical director of the second 
insurance payer states: “Telemedicine 
seems like ‘screening’ rather than a 
mechanism for delivering health care. 
This is because you are simply using 
technology to identify patients who need 
to be referred to a real doctor, rather 
than providing true medical care. 
Therefore, we should only reimburse a 
very small amount for these screening 
services.” In your opinion, is this a legit- 
imate argument? Explain. 

3. Using telehealth systems, patients can 
now interact with multiple health care 
stakeholders and monitor multiple 
aspects of their health generating large 
data sets. In order to inform timely and
tailored interventions based on data
generated both during clinical encounters
and outside the clinical setting,
many propose that telehealth-enabled
patient-generated health data be
directly integrated into the Electronic
Health Record. Discuss both challenges
and opportunities for such an approach.

4. A significant barrier to widespread tele- 
health adoption has been limited valida- 
tion studies demonstrating that its 
diagnostic accuracy is comparable to that 
of traditional in-person medical care. Do 
you feel this is a realistic goal, given the 
extremely large number of potential dis- 
ease states and clinical scenarios that may 
require validation studies? Are there 
alternate scenarios that could lead to tele- 
health becoming accepted as standard 
medical practice? Explain. 

5. Home telehealth often requires 
interpretation of data collected directly 
by patients, which may create 
challenges because of concerns about 
accuracy, as well as challenges from a 
data management perspective because 
of the large volume of incoming data. 
Describe possible approaches toward 
addressing these challenges involving 
accuracy and data management.


Learning Objectives

After reading this chapter, you should be
able to answer these questions:

1. What is patient monitoring and why is it 
used? 

2. What patient parameters do bedside 
physiological monitors track? 

3. What are the major problems with ac- 
quisition and presentation of monitor- 
ing parameters? 

4. In addition to bedside physiologic pa- 
rameters, what other information is 
fundamental to the care of acutely ill 
patients? 

5. Why is real-time computerized deci- 
sion support potentially more benefi- 
cial than monthly or quarterly quality- 
of-care reporting? 

6. What technical and social factors must 
be considered when implementing real- 
time data acquisition and decision sup- 
port systems? 


21.1 What is Patient Monitoring? 
From a physiologic standpoint, life requires
supplying oxygen to tissues to meet the
metabolic needs of cells by fueling
mitochondrial respiration. When this
cycle is broken, humans become critically ill.
That physiologic cycle could be controlled
through oxygenation and perfusion monitoring,
but no direct methods exist
to measure mitochondrial respiration. All
modern monitoring methods are proxies
for such processes. In hospitals, and especially
in the intensive care unit (ICU), patient
monitoring becomes critical for control and
optimization of hemodynamics, ventilation,
temperature, nutrition, and metabolism of the
human body.

Measurement of patient physiologic
parameters such as heart rate (HR), heart
rhythm, arterial blood pressure (ABP), respiratory
rate, and blood-oxygen saturation has
become common during the care of hospitalized
and, especially, critically ill patients.
When accurate and prompt decision making
is crucial for effective patient care, bedside
monitors are used to collect, display, and store
physiologic data. Increasingly, such data are
collected by noninvasive sensors connected to
patients in ICUs, neonatal ICUs, operating
rooms (ORs), labor and delivery suites, emergency
departments, and other hospital care
units where patient acuity is increased.

We often think of a patient monitor as 
something that watches for, and warns about, 
serious or life-threatening events in patients, 
and provides guidance for care of the criti- 
cally ill. Such systems must include continu- 
ous observations of a patient’s physiologic 
measurements and the assessment of the 
function of attached life support equipment. 
Such monitoring is important in detecting 
life-threatening conditions and guiding man- 
agement decision making, including when to 
make therapeutic interventions and to assess 
the effect of those interventions. 

In this chapter, we discuss the use of com- 
puters in collecting, displaying, storing, and 
interpreting clinical data, making therapeutic 
recommendations, and alarming and alerting. 
In the past, most monitoring data (called vital
signs) were in the form of HR, respiratory
rate, blood pressure (BP), and body temperature.
However, today’s ICU monitoring
systems are able to integrate data from bedside
monitors and devices, as well as data from
many sources outside the ICU. Although
the material presented here deals primarily 
with patients who are in ICUs, the general 
principles and techniques are also applicable 
to other hospitalized patients and electronic 
medical records (EMRs). Patient monitor- 
ing is performed extensively for diagnostic 
purposes in the emergency department or for 
therapeutic purposes in the OR. Techniques 
that initially were only used in the ICU such 
as bedside monitors are now used routinely on 
general hospital wards and in some situations 
even by patients in their homes. 


21.1.1 Case Report 


This case report provides a perspective on the 
problems faced by the team caring for a criti- 
cally ill patient. 

A 27-year-old man is injured in an automobile
accident and has multiple chest and
head injuries. His condition is stabilized at
the scene of the accident by skilled paramedics
using a portable computer-based electrocardiogram
(ECG) and pulse oximeter, and
he is quickly transported to a trauma center.
Once in the trauma center, he is connected 
via noninvasive sensors to a computer-based 
bedside monitor that displays physiologic sig- 
nals, including his HR and rhythm, arterial 
oxygen saturation, and BP. Radiographic and 
magnetic resonance imaging provide further 
information for care. 

Because of the severe chest injury, the 
patient has difficulty breathing, so he is 
connected to a computer-controlled venti- 
lator that has both therapeutic and moni- 
toring functions and he is transferred to the 
ICU. Because of the head injury a bolt is 
placed in a hole drilled through his skull and 
a fiber optic sensor is inserted to continuously 
measure intracranial pressure with another 
computer-controlled monitor. Blood is drawn 
and clinical chemistry and blood gas tests are 
promptly performed by the hospital labora- 
tory. Results of those tests are displayed to the 
ICU team as soon as they are available. With 
intensive treatment, the patient survives the 
initial threats to his life and he now begins the 
long recovery process. Figure 21.1 shows a
nurse at the patient’s bedside surrounded by a 
bedside monitor, infusion pumps, a ventilator, 
and other devices. 


Fig. 21.1 Overall view of an ICU patient’s room.
Shown is a nurse standing at the bedside computer
screen (left), a ventilator (center), and a respiratory
therapist suctioning the patient (right). The patient is
connected to the ventilator, bedside monitor (upper right),
and to three IV pumps (lower right). ICU indicates
intensive care unit; IV, intravenous


Unfortunately, a few days later, the patient
is beset with a problem common to multiple
trauma victims: he develops a major nosocomial
(hospital-acquired) infection, sepsis, and
acute respiratory distress syndrome (ARDS).
Multiple organ failure follows. As a result,
antibiotics and electrolytes are required for
treatment and are dispensed via intravenous
(IV) pumps. The quantity of information
required to care for the patient has increased
dramatically; with monitor data excluded,
critically ill patients produce on average 10
times more clinical data than patients on the
typical general hospital unit (Fig. 21.2)
(Herasevich et al. 2012).

Decision making for patients with complex,
rapidly changing acute conditions
requires much data in addition to monitoring
data (Fig. 21.3) (Bradshaw et al. 1984).

Increased data flow, rapid changes in
the patient’s state, and a multidisciplinary
approach involving different care providers
require a next generation of clinical data
management systems that go beyond multiple
distributed, often unconnected, patient monitors
and data from hospital EMRs.

In Sect. 21.6 (“Monitoring and
Advanced Information Management in 
ICUs”) we describe two integrated patient 
monitoring systems: the HELP system used 
by Intermountain Healthcare’s Hospitals and 
the AWARE system developed at Mayo Clinic 
Hospitals. Both of these systems integrate 
diverse clinical data for complex decision 
making. 


21.1.2 Patient Monitoring 


Careful monitoring of a patient’s physiologic
status, and alerting the care team to changes
in it, are a vital part of diagnostic and therapeutic
processes. There are at least three categories
of patients who need physiologic monitoring:
1. Patients with compromised physiologic
regulatory systems; e.g., a patient whose
respiratory system is suppressed by a drug
overdose or during anesthesia
2. Patients who are currently stable but with
a condition that could suddenly change to
become life threatening; e.g., a patient who
has findings indicating an acute myocardial
infarction (heart attack), a patient immediately
after open-heart surgery, or a fetus
during labor and delivery
3. Patients in a critical physiologic state; e.g.,
patients with multiple trauma or septic
shock, as in our case report


Fig. 21.2 Average number of total clinical data points per patient hour, excluding vitals, before and after
admission to the ICU (time zero). Bars in red indicate first hours in ICU. Y-axis indicates number of data points


Fig. 21.3 Six main data categories and their relative
distribution used in clinical decision making in trauma
shock intensive care unit. I/O indicates Intake and Output


Clinical monitoring has evolved over time
and tends to shift from intermittent
to continuous and from invasive to noninvasive
methods. Also, there is a trend for
monitoring systems to become multipurpose
and integrate multiple parameters, including
nonmonitoring data. Current stand-alone
bedside monitors have data storage and are
capable of capturing multiwaveform/multiparameter
information with advanced alerting
functionality.

In general, patient monitoring can be
divided into five groups (Fig. 21.4):
1. Hemodynamic monitoring 
2. Respiratory monitoring 
3. Neuromonitoring 
4. Metabolic monitoring 
5. Specialty monitoring 


Not all monitoring techniques are needed for
every patient, nor are all widely available. Also,
the use of invasive methods should take into
account the risk of complications.

In terms of technologic classification,
monitoring can be divided into three general
categories:

1. Vital signs monitoring 
2. Diagnostic monitoring 
3. Specialized or disease-specific monitoring 


21.2 Historical Perspective and the
Measurement of Vital Signs


The earliest foundations for acquiring
physiologic data occurred at the end of
the Renaissance period. In 1625, Santorio
Santori, who lived in Venice, Italy, published
his methods for measuring body temperature
with the spirit thermometer and for timing the
pulse (HR) with a pendulum (Fig. 21.5).
The principles for both devices had been
established by Galileo Galilei, a close friend.
Galileo worked out the uniform periodicity of
the pendulum by timing the movement of the
swinging chandelier in the Cathedral of Pisa
and comparing that to his own pulse rate. The
results of this early biomedical-engineering
collaboration, however, were ignored. The
first scientific report of the pulse rate did not
appear until English physician Sir John Floyer
published “Pulse-Watch” in 1707. The first
published course of fever for a patient was
plotted by Ludwig Traube in 1852.

Fig. 21.5 Pulsilogium at center to measure HR and
thermoscope (right) to measure body temperature. HR
indicates heart rate. (From Sanctorius (1626))

In 1896, Scipione Riva-Rocci introduced
the sphygmomanometer (BP cuff), which permitted
the fourth vital sign, systolic BP, to
be measured. A Russian physician, Nikolai
Korotkoff, combined Riva-Rocci’s cuff with a
stethoscope, an instrument developed by the
French physician René Laennec, which allowed the
measurement of both systolic and diastolic
arterial pressure. Harvey Cushing, a preemi- 
nent US neurosurgeon of the early 1900s, 
predicted the need for and later insisted on 
routine ABP monitoring in the OR. At the 
same time, Cushing also raised the following 
questions, which are still being asked today: 
1. Are we collecting too much data? 
2. Are the instruments used in clinical medi- 
cine too accurate? 
3. Would not approximated values be just as 
good? 


Cushing (1903) answered his own questions 
by stating that vital sign measurements should 
be made routinely and that their accuracy 
was important. In 1903, Willem Einthoven
devised the string galvanometer for displaying
and quantifying the ECG, for which he
was awarded the 1924 Nobel Prize in Physiology
or Medicine. The ECG has become an important
adjunct to the clinician’s inventory of tests 
for both acutely and chronically ill patients. 
Continuous measurement of physiologic vari- 
ables has become a routine part of the moni- 
toring of critically ill patients. 

At the same time that advances in moni- 
toring were made, major changes in the ther- 
apy of life-threatening disorders were also 
occurring. Prompt quantitative evaluation of 
measured physiologic and biochemical vari- 
ables became essential in the decision-making 
process as physicians applied new therapeutic 
interventions. For example, it is now possible,
and in many cases essential, to use ventilators
when a patient cannot breathe independently,
cardiopulmonary bypass equipment when
a patient undergoes open-heart surgery,
hemodialysis when a patient’s kidneys fail,
and IV nutritional and electrolyte support
when a patient is unable to eat or drink.

Since the 1920s, the four vital signs—tem- 
perature, respiratory rate, HR, and ABP— 
have been recorded in all patient charts and 
became the standard vital signs. In recent 
years a fifth vital sign, oxygen saturation, was 
added as a routine measurement.


21.3 Development of ICUs


Care of critically ill patients requires prompt
and accurate decisions so that life-protecting
and life-saving therapy can be appropriately
applied. Because of these requirements, ICUs
have become widely established in hospitals.
Such units use computers almost universally
for the following purposes:

1. To acquire physiologic data frequently or 
continuously, such as BP 

2. To acquire information from remote
data-producing systems, e.g.,
laboratory and radiology departments

3. To store, organize, and report patient 
information 

4. To integrate, organize and correlate data 
from multiple sources 

5. To provide clinical alerts and advisories 
based on multiple sources of data 

6. To function as an automated decision sup- 
port tool that health professionals may use 
in planning the care of critically ill patients 

7. To measure the severity of illness for 
patient classification purposes 

8. To analyze the outcomes of ICU care in 
terms of clinical effectiveness and cost 
effectiveness 


Until about 1960, if patients had severe cardiac
events, there were few treatment options available
for physicians to provide care for them.
As a consequence, many patients who had
life-threatening acute cardiac or pulmonary
problems died. However, in the early 1960s
two major treatment modalities
were developed that provided treatment
for previously fatal situations. Development
of closed chest cardiopulmonary resuscitation
(CPR) (Kouwenhoven et al. 1960) and
closed chest defibrillation (Zoll et al. 1956;
Lown et al. 1962) provided means for delivering
life-saving treatment. Because of the availability
of these treatments, the demand for
continuous monitoring of high-risk patients
escalated. Hospitals began to cluster patients
with complex disorders together into new
organizational units, called ICUs, beginning
in the early 1960s. Some of the earliest units
were coronary care units where patients were
cared for after myocardial infarctions or other
acute, life-threatening cardiac events.

Surgical ICUs had their beginnings in 
the late 1950s when postoperative patients 
were kept in the recovery rooms for extended 
time periods after cardiac or other high-risk 
surgery for close observation. Initially these 
recovery rooms did not have the benefit of 
cardiac monitoring. However, as more sophis- 
ticated monitoring became available, special 
units were created and designated as surgical 
ICUs or thoracic ICUs. 

ICUs proliferated rapidly during the late 
1960s and early 1970s. The types of units 
included coronary, thoracic surgery, surgi- 
cal, medical, shock-trauma, burn, pediatric, 
neonatal, respiratory, and other multipur- 
pose medical or surgical units. Today there 
are more than six million patients admitted 
each year into adult, pediatric, and neonatal 
ICUs in the United States alone. In the past 
three decades, the demand for ICU services 
in the United States has risen dramatically. 
Because of the complexity of care and the 
increased acuity of these patients, the need for 
specialized nursing care has increased dramatically.
In a typical non-ICU acute patient care
situation, one nurse may be responsible for
the care of up to six patients. However, because
of the observations and care that these acutely
ill ICU patients require, intensive care nurses
typically are assigned one to three patients.

The average life expectancy is rising, and
the US population aged over
65 years (who use ICUs disproportionately
more than the rest of the population) is estimated
to increase by 50% by 2020 and 100%
by 2030, thus continually increasing demand
for ICU-level care (Kelley et al. 2004;
Groves et al. 2008).


21.4 Development of Bedside 
Monitors 


A signature feature of each of these early
ICUs was the bedside monitor (Fig. 21.6).
The original bedside monitors were used primarily
to acquire and display the ECG. The
first modern bedside monitor, ICU 80, was
introduced by the Nihon Kohden company
from Japan in 1967 and was as large as the
patient bed.

Fig. 21.6 Waveforms on typical bedside monitor.
Displays from the monitor show the real-time beat-by-beat
ECG, pulse oximeter, and arterial, pulmonary artery,
and central venous blood pressures and their derived
measures. (Courtesy of Royal Philips, with permission)

As a result of the detailed ECG information
provided by the new patient monitors,
treatment for serious cardiac arrhythmias
(heart rhythm disturbances) and cardiac
arrest (abrupt cessation of heartbeat), both major
causes of death after myocardial infarction
(heart attack), became possible. Mortality
rates from 1960 to 1970 were about 35%,
dropped to about 23% between 1970 and 1980
and to about 20% between 1980 and 1990.
During the 1990s reperfusion of the coronary
arteries became common and the mortality
rate dropped to about 5% (Braunwald 1988;
Rogers et al. 2000).

In the 1960s, bedside monitors were built
using analog computer technologies. These
systems amplified the ECG signal and displayed
the results on an oscilloscope. Such systems
required nurses or technicians to watch
the oscilloscope to determine if there was a
cardiac arrest or other life-threatening cardiac
rhythm. Soon after these analog systems were
developed, methods for generating high- and
low-HR alarm thresholds were included. The
alarms were usually audible and very annoying.
Unfortunately, since the beginning of the
use of these alarms, the false-positive rate has
far exceeded the true-positive rate. As a result,
alarm systems for bedside monitors are often
ignored or turned off. The problem
of alert fatigue in hospitals is still considered
a challenging technological problem and contributes
to poor patient outcomes, including
deaths.
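
High- and low-HR threshold alarms of the kind described above can be sketched in a few lines, which also makes it easy to see why they produce so many false positives: a single noisy sample is enough to trip a bare threshold. The sketch below is illustrative only. The function name `hr_alarms`, the limits of 50 and 120 beats per minute, and the `persist` parameter (requiring several consecutive out-of-range samples before alarming) are assumptions made for this example, not a description of any vendor's monitor.

```python
def hr_alarms(samples, low=50, high=120, persist=3):
    """Return (index, kind) pairs where HR has been outside [low, high]
    for exactly `persist` consecutive samples of the same kind."""
    run_kind, run_len = None, 0
    alarms = []
    for i, hr in enumerate(samples):
        kind = "low" if hr < low else "high" if hr > high else None
        if kind is not None and kind == run_kind:
            run_len += 1
        else:
            # Start a new run (or reset when the value is back in range).
            run_kind, run_len = kind, (1 if kind else 0)
        if kind is not None and run_len == persist:
            alarms.append((i, kind))
    return alarms
```

With `persist=1` the function behaves like a bare threshold and alarms on every brief excursion; a larger value trades a short delay for fewer nuisance alarms, which is one simple design choice among many for reducing alarm burden.
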

Teams from several cities in the United
States in the 1960s introduced computers into
the ICU to assist in physiologic monitoring,
beginning in Los Angeles with Shubin and
Weil (1966) followed by Warner et al. (1968)
in Salt Lake City. These investigators had several
objectives:
1. To increase the availability and accuracy
of the physiologic data

2. To compute derived variables that could 
not be measured directly 

3. To increase patient-care efficacy 

4. To allow display of the time trends of the 
patient’s physiologic data, and 

5. To assist in computer-aided decision 
making 
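
As one illustration of the second objective, a derived variable is a quantity computed from measured ones. A familiar example, not taken from the systems cited above, is the common bedside approximation of mean arterial pressure (MAP) as diastolic pressure plus one third of the pulse pressure. The function names below are chosen for this sketch only.

```python
def pulse_pressure(systolic, diastolic):
    """Pulse pressure in mmHg: systolic minus diastolic."""
    return systolic - diastolic


def mean_arterial_pressure(systolic, diastolic):
    """Approximate MAP in mmHg as diastolic plus one third of pulse pressure."""
    return diastolic + pulse_pressure(systolic, diastolic) / 3.0


def map_trend(pressures):
    """Map a time-ordered list of (systolic, diastolic) pairs to MAP values."""
    return [mean_arterial_pressure(s, d) for s, d in pressures]
```

Applying `map_trend` to a series of cuff or catheter readings gives the kind of time trend named in the fourth objective. The one-third rule is an approximation that assumes a typical resting heart rate; monitors with an arterial waveform can instead average the pressure over each beat.
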


Each of these teams developed their applica- 
tions on large mainframe computer systems, 
which required large computer rooms and 
trained staff to keep the system operational 
24 hours a day. The computers used by these 
developers cost over $200,000 each in 1965 
dollars. During that time, other research- 
ers were attacking more specific challenges 
in patient monitoring. For example, Cox 
(1972) at Barnes Hospital in St. Louis, devel- 
oped algorithms to analyze the ECG for 
heart rhythm disturbances in real-time. The 
arrhythmia-monitoring system, which was 
installed in the Coronary Care Unit (CCU) 
in 1969, ran on a relatively inexpensive mini- 
computer rather than a mainframe computer. 
With the advent of integrated circuits and 
microprocessors, affordable computing power 
increased dramatically. What was considered 
computer-based patient monitoring by these 
pioneers in the late 1960s and early 1970s 
is now entirely built into bedside monitors 
and is considered simply a bedside monitor. 
Clemmer (2004) provides an important over- 
view of “where we started and where we are 
now” to summarize the first four decades fol- 
lowing the initiation of computers in the ICU. 


21.5 Modern Bedside Monitors 


The heart and lungs are crucial to normal 
body function. For example, if the heart 
stops (cardiac arrest) there is a cessation of 
normal circulation of the blood. Likewise, if 
there is a pulmonary arrest there is a cessation 
of breathing. Each of these situations leads 
to a reduced delivery of oxygenated blood 
(hypoxia) to the body, with major physiologic 
hazards. For example, brain injury will occur
if hypoxia is not treated within 5 minutes. As
a consequence, detection of either of these 
situations is required if life-saving treatments 
are to be administered. The treatment for car- 
diac arrest is cardiopulmonary resuscitation 
(CPR), which provides circulatory and pul- 
monary support. Prompt use of a defibrillator 
increases the likelihood of reestablishment of 
a normal rhythm. 

The typical bedside monitor can also 
display the ECG and the arterial waveform 
(Fig. 21.6).


21.5.1 ECG Signal Acquisition
and Processing


The ECG provides a representation of the
electrical activity of the human heart and is
a very important tool for the diagnosis of
disturbances of HR and rhythm. Original
monitors allowed physicians and nurses
to watch the ECG trace on an oscilloscope.
Because the ECG signal measured on the
skin is very small (about 1 mV), it is subject to
artifacts (noise) caused by such things as patient
movement, electrode movement, and electrical
power interference. By using sophisticated
analog and digital techniques and presenting
data from multiple leads, the quality and
reliability of the monitored ECG signals have
improved dramatically (Weinfurt 1990; Gregg
et al. 2008). At the same time, the demand for
improved quality of the ECG signal and for a
greater number and variety of parameters
has increased. Initially, the ECG signal
was processed to obtain HR and basic rhythm
(periodicity of the beat), while today’s monitors
can detect signals from artificial heart
pacemakers, complex arrhythmias, myocardial
ischemia, and disturbances in the conduction
of electrical signals through the heart
muscle.

Two types of computerized ECG analysis 
are in common use today: 

1. The 12-lead ECG is typically performed in 
a physician’s office or in the hospital. 
Usually a technician brings a recording
device to the patient’s bedside, attaches
the leads, and records the signal during a
short interval while the patient is lying quietly
in a supine position. From this 12-lead
ECG, a wide variety of ECG diagnoses are 
made. Computer processing of these ECG 
signals taken at that moment in time has 
become the definitive practical option for 
ECG interpretation. Automated ECG 
analysis has become widespread in clinical 
practice since the mid-1980s although, in 
most hospitals, cardiologists will also read 
them to confirm the automated findings. 
Automated ECG analysis is quite accu- 
rate, especially in normal individuals, but 
disagreements with cardiologists are seen 
and may be clinically important (Guglin 
and Thatai 2006; Bogun et al. 2004). On 
the other hand, cardiologists are not per- 
fect either (Clark et al. 2010)! 

Today, physicians expert in ECG inter- 
pretation from multiple professional orga- 
nizations such as the American Heart 
Association and the Electrocardiographic 
Society have come to consensus and estab- 
lished standards designed to improve 
computerized ECG interpretation. In bio- 
medical informatics terminology, these 
experts have developed the knowledge 
base for diagnostic ECG interpretation. 
The detailed pattern recognition and signal
processing do not need to occur in
real time. Thus 12-lead ECG processing
can be more sophisticated than is feasible
under the constraints of real-time monitoring
(Gregg et al. 2008).

2. Continuous, real-time monitoring is 
required while the patient is in the 
ICU. Because of patient movement, care- 
giver activities such as administering medi- 
cations, bathing and the like, the amount 
of artifact generated poses important 
challenges to real-time monitoring. To 
minimize these effects, filtering of the 
acquired ECG signal is performed. This 
filtering slightly distorts the ECG but at 
the same time makes it possible to process 
the signals on a beat-by-beat basis. 
Although standards for interpretation of 
ECG monitoring are more recent than 
those for 12-lead monitoring, they are now 
becoming more common and sophisticated
(Drew and Funk 2006; Funk et al. 2010).
The clinical experts who are establishing
the knowledge base now include
critical-care nurses, cardiologists, anesthe- 
siologists, and thoracic surgeons (Crossley 
et al. 2011). 

ECG processing in today’s vendor- 
supplied bedside monitors continues to 
improve and become more reliable. 
Sophisticated pattern recognition and sig- 
nal processing techniques are used to allow 
extraction of key parameters in real-time 
while adding the ability to measure the 
utility of new physiologic parameters 
(Crossley et al. 2011). Investigators have
created publicly available databases of
ECG waveforms and other physiologic sig- 
nals as well as other important data from 
actual patients to allow validation of these 
monitoring systems (Saeed et al. 2011; 
Burykin et al. 2011). 
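
The trade-off noted above, that filtering suppresses artifact but slightly distorts the ECG, can be illustrated with a minimal sketch. It assumes a simple centered moving average, which is chosen here only for illustration; the source does not say which filters bedside monitors use, and the toy trace is invented.

```python
def moving_average(signal, window=5):
    """Smooth a sampled signal with a centered moving average.

    Illustrative only: real bedside monitors use more elaborate filters.
    """
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


# Toy trace (hypothetical values): flat baseline, one sharp R-wave-like
# peak, and one single-sample noise spike.
raw = [0.0] * 20
raw[5] = 1.0    # R-wave-like peak
raw[14] = 0.4   # isolated artifact

filtered = moving_average(raw, window=5)

# The artifact is attenuated, but so is the genuine peak: that loss of
# amplitude is the "slight distortion" the filter introduces.
print(filtered[14], filtered[5])
```

A wider window removes more artifact and flattens the true waveform more, which is why filter choice in real-time monitoring is a compromise.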


21.5.2 ABP Signal Acquisition 
and Processing 


Accurate and continuous monitoring of ABP 
requires insertion of a catheter into an artery. 
Once the catheter is successfully inserted into 
an artery, the catheter is connected, via a 
length of sterile fluid-filled tubing, to a stop- 
cock with a continuous flush device and a 
factory calibrated disposable BP transducer 
(Gardner 1996). The BP transducer is then 
connected to an amplifier and the pulsatile 
signal it detects is displayed on the screen of 
the bedside patient monitor. With the advent 
of inexpensive, disposable, accurate pres- 
sure transducers, the quality and accuracy of 
ABP monitoring has improved dramatically. 
However, two sources of inaccuracy of the 
ABP signal still depend on medical staff set- 
up and validation: (1) zeroing is the process 
by which the monitor is informed when a port 
on the stopcock is opened to the atmosphere 
at mid-heart level — thus becoming the point 
from which pressure is measured; and (2) 
since the ABP signal contains pulsatile char- 
acteristics with frequencies up to 20 Hz that 
must be transmitted from the artery through 
the plumbing system to the transducer, the 
dynamic response characteristics must be
optimized. Optimization is typically done with
a fast flush test (pushing sterile saline
through the tubing), which removes blood and
very tiny air bubbles that can dramatically
distort the ABP waveform and result in
erroneous measures of systolic and diastolic
pressure.

At least two types of artifacts in the ABP 
signal are commonly observed. If a patient 
rapidly moves or a care giver bumps the tub- 
ing, a pressure artifact is generated and trans- 
mitted to the transducer and displayed. In 
addition, when the clinical staff draws arterial 
blood for laboratory tests, they typically turn 
off the stopcock connected to the transducer 
and draw blood through the tubing, causing 
an immediate loss of the pulsatile ABP signal. 
The pressure sensed by the transducer then 
typically rises to that found in the pressurized 
flush solution. Thus, continuous vigilance 
on the part of nurses and other care givers is 
needed for the arterial catheter and monitor- 
ing systems to be properly maintained. As a 
historical note, the continuous flush device 
was developed over 50 years ago to prevent 
arterial catheters from clotting and to allow 
one of the pioneering computerized monitor- 
ing systems to become more reliable (Gardner 
et al. 1970). Since that time, investigators have 
developed computerized methods to minimize 
these “human caused artifacts” (Li et al. 2009;
Gorges et al. 2009). Unfortunately, these 
strategies have seldom been implemented into 
commercially available bedside monitors. 

Since the early 1900s, efforts have been 
made to estimate cardiac output from the pul- 
satile pressure in the arterial system by multi- 
plying the HR with estimates of stroke volume 
(the volume of blood ejected from the heart 
during a single contraction) made from the 
pressure waveform. Warner and his colleagues 
at Mayo Clinic published some early work in 
1953 (Warner et al. 1953) on the topic and 
followed up again in 1983 further substanti- 
ating the feasibility of the method. However, 
Cundick and Gardner (1980) showed that the 
widely varying mean BPs found in critically 
ill patients adversely affected the reliability 
of the method. Since that early work, multiple
publications and commercially available
devices using the pulse-pressure method have
appeared. The issue is still active, with such 
publications as Chen et al. (2009), Sun et al. 
(2009), and Gardner and Beale (2009). Other 
estimates of stroke volume and cardiac output 
have been made from determining the bioreac- 
tance — a measure of the degree of phase-shift 
in the electrical signal—across the chest. This 
method shows promise of being a rather sim- 
ple, continuous and noninvasive method for 
measuring cardiac output (Keren et al. 2007). 
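
The arithmetic behind the pulse-pressure approach described above is simply HR multiplied by an estimate of stroke volume. The sketch below assumes, purely for illustration, that stroke volume is proportional to pulse pressure through a per-patient calibration constant; the function names and the constant `k_ml_per_mmhg` are hypothetical and do not represent any published or commercial algorithm.

```python
def stroke_volume_from_pulse_pressure(systolic_mmhg, diastolic_mmhg, k_ml_per_mmhg):
    """Estimate stroke volume (mL) as k * pulse pressure.

    k_ml_per_mmhg is a hypothetical calibration constant; real methods
    analyze the whole pressure waveform rather than two numbers.
    """
    return k_ml_per_mmhg * (systolic_mmhg - diastolic_mmhg)


def cardiac_output_l_per_min(heart_rate_bpm, stroke_volume_ml):
    """Cardiac output (L/min) = HR (beats/min) * stroke volume (mL) / 1000."""
    return heart_rate_bpm * stroke_volume_ml / 1000.0


# Illustrative numbers only.
sv = stroke_volume_from_pulse_pressure(120, 80, k_ml_per_mmhg=1.75)
co = cardiac_output_l_per_min(72, sv)
print(f"stroke volume {sv:.0f} mL, cardiac output {co:.2f} L/min")
```

Because the estimate scales directly with the calibration constant, any change in the pressure-to-volume relationship, as in the widely varying mean BPs noted by Cundick and Gardner (1980), carries straight through to the cardiac output value.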

Investigators have assessed delta pulse
pressure, the variability of the peak-to-peak
arterial pressure pulse across the breathing
cycle, as an estimate of a patient’s fluid
balance. The supposition is that if this
marker shows larger variability, the patient
may require fluid administration (Deflandre
et al. 2008).
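
As a worked example of the delta pulse pressure idea, the sketch below uses the commonly quoted definition: the difference between the largest and smallest pulse pressure over one breathing cycle, divided by their mean. The beat values are invented, and the source does not specify which formula or threshold the cited investigators used.

```python
def pulse_pressure_variation(beats):
    """Return pulse pressure variation (%) over one respiratory cycle.

    beats: list of (systolic_mmhg, diastolic_mmhg), one tuple per beat.
    Uses 100 * (PPmax - PPmin) / mean(PPmax, PPmin), a commonly quoted
    definition that is assumed here rather than taken from the source.
    """
    if len(beats) < 2:
        raise ValueError("need at least two beats in the respiratory cycle")
    pulse_pressures = [systolic - diastolic for systolic, diastolic in beats]
    pp_max = max(pulse_pressures)
    pp_min = min(pulse_pressures)
    return 100.0 * (pp_max - pp_min) / ((pp_max + pp_min) / 2.0)


# Hypothetical beats across one breath.
cycle = [(120, 80), (118, 80), (114, 80), (116, 80)]
ppv = pulse_pressure_variation(cycle)
print(f"delta pulse pressure: {ppv:.1f}%")
```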

It is clear that future methods that process 
available physiologic signals will be applied 
to enhance and improve the availability of 
important measures of cardiac function, a 
key parameter for making treatment decisions 
used by critical care caregivers. 


21.5.3 Pulse Oximeter Signal 
Acquisition and Processing 


One of the most common technologic devices 
used in hospitals today is the pulse oximeter; 
continuous monitoring of oxygen saturation 
became a standard of practice in the early 
1990s (Brown et al. 1990). The pulse oximeter 
sensing device is usually placed on a finger and 
measures oxygen saturation and pulse rate, or 
HR (Clark et al. 2006). The modern device 
works by shining red and infrared light gener- 
ated by two light emitting diodes through the 
tissue. With each arterial pulse there is a varia- 
tion in the light as it passes through the tissue 
and is sensed by a light-sensitive photodiode 
on the opposite side. The more oxygenated 
the blood is, the more red light is transmit- 
ted, with less infrared light passing through. 
By calibrating these devices, reasonably accurate
estimates of oxygen saturation (SpO2) can
be determined. Although the pulse oximeter 
is convenient and easy to use, it has several 
important limitations, including motion
artifact when the patient moves and other
physiologic considerations such as anemia, low
perfusion state, and low peripheral skin
temperature. If blood flow to the hand is
interrupted, for example when the arm is
squeezed by a sphygmomanometer cuff during a
BP measurement, the pulsatile signal required
by the pulse oximeter is no longer available.
The pulse oximeter is one of the ICU monitoring
devices with the most false alarms (Malviya
et al. 2000).
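
The two-wavelength principle described above is often summarized as a "ratio of ratios" of the pulsatile (AC) and steady (DC) light signals. The sketch below assumes that formulation together with a linear calibration, SpO2 of roughly 110 minus 25 times the ratio, which is a textbook approximation and not a value from the source; real devices rely on empirically derived calibration curves, and all signal values here are invented.

```python
def ratio_of_ratios(ac_red, dc_red, ac_ir, dc_ir):
    """Normalize each wavelength's pulsatile component by its steady component."""
    return (ac_red / dc_red) / (ac_ir / dc_ir)


def spo2_estimate(r, a=110.0, b=25.0):
    """Map the ratio to a saturation estimate (%).

    a and b are illustrative calibration constants (assumption);
    the result is clamped to the physically meaningful 0-100% range.
    """
    return max(0.0, min(100.0, a - b * r))


# Hypothetical photodiode readings: a small red pulsation relative to
# infrared corresponds to well-oxygenated blood.
r = ratio_of_ratios(ac_red=0.01, dc_red=1.0, ac_ir=0.02, dc_ir=1.0)
print(f"R = {r:.2f}, estimated SpO2 = {spo2_estimate(r):.1f}%")
```

The sketch also shows why loss of the pulsatile signal defeats the measurement: with no AC component, the ratio is undefined or meaningless.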


21.5.4 Bedside Data Display 
and Signal Integration 


While colorful and dynamic, the displays 
on the bedside monitor can be complex. 
Figure 21.6 shows a typical bedside monitor
display as one example.

In 2018 Medtronic led the US patient
monitoring device market by capturing the
largest segment of pulse oximetry monitoring
after acquiring Covidien, the previous leader in
pulse oximetry monitoring. With one of the most
extensive product lines, Philips Healthcare
leads the multiparameter vital sign monitoring,
wireless telemetry, and fetal/neonatal
monitoring markets.

The other major manufacturers in the 
patient monitoring market are: GE Healthcare, 
Masimo, Edwards Lifesciences, Mindray 
Medical, Natus Medical, Welch Allyn, 
Omron Healthcare, Honeywell Life Sciences, 
Nihon Kohden, Spacelabs Healthcare, St. 
Jude Medical, Nonin Medical and Boston 
Scientific. Each vendor of bedside monitors 
has made a “best effort” at displaying the 
variety of physiologic signals derived. In most 
cases this consists of three channels: ECG, 
ABP, and pulse oximetry. Additional important
physiologic parameters can be derived
from these signals, as noted in Table 21.1.

Table 21.1 Bedside physiologic monitoring capabilities

Modality                                Transducer                            Frequency     Parameters
ECG                                     Chest electrodes                      Continuous    Heart rate; heart rhythm; complete ECG waveforms; pacemaker signal
Arterial blood pressure (invasive)      Catheter & blood pressure transducer  Continuous    Heart rate; systolic, diastolic & mean pressure; estimates of cardiac output; pulse pressure variation & fluid loading
Pulse oximeter                          Finger probe                          Continuous    Arterial oxygen saturation; heart rate
Temperature                             Skin sensor                           Continuous    Temperature
Respiration                             Chest belt                            Continuous    Respiratory rate
Bioreactance                            Electrodes                            Continuous    Cardiac output; heart rate; stroke volume
Arterial blood pressure (non-invasive)  Inflatable cuff                       Intermittent  Heart rate; systolic, diastolic and mean pressure

Abbreviation: ECG electrocardiogram

Today’s bedside monitors still present
both waveforms and derived parameters in
a single-sensor-single-indicator format. That
is, for each individual sensor attached to the
patient there is a single indicator, a waveform
with derived values presented on the screen
(Drews et al. 2008). One of the simplistic
consequences of this display strategy is that
each indicator is treated as if it had come from
a different patient. For example, if ECG, ABP,
and pulse oximeter signals were displayed,
they would each have the capability of
determining HR. Thus, three different HR
measures might be displayed. Although there are
physiologic reasons for such differences, the
most common situation is that the HR should
be an integrated assessment of the three
signals since artifact is a far more common event
than the unusual conditions that would cause
the differences in HR. Studies suggest that
there are better methods for designing
hemodynamic monitoring displays (Doig et al.
2011; Drews et al. 2008).
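
One simple way to picture an "integrated assessment" of HR from the three signals is to take the median of whichever estimates are currently available, so that a single artifact-corrupted source cannot dominate. This is a minimal sketch under that assumption; the source does not prescribe a fusion method, and the function and key names are hypothetical.

```python
from statistics import median


def integrated_heart_rate(estimates):
    """Combine per-sensor HR estimates (beats/min) into one value.

    estimates: dict mapping a source name to its HR, or None when that
    sensor currently has no usable signal. The median is used as an
    illustrative robust combiner (assumption, not the source's method).
    """
    valid = [value for value in estimates.values() if value is not None]
    if not valid:
        return None
    return median(valid)


# Hypothetical moment: ECG artifact doubles the ECG-derived rate while
# the pressure and oximeter pulses agree with each other.
hr = integrated_heart_rate({"ecg": 148, "abp": 74, "pulse_oximeter": 75})
print(hr)
```

With only two usable sources the median is their average, so disagreement between them is better surfaced to the clinician than silently averaged away.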

A more important problem relates to the 
integration of data from multiple bedside 
devices. Two examples will illustrate the prob- 
lem: 

1. The patient’s pulse oximeter has shown a
recent increase of SpO2. However, the bedside
monitor has no knowledge that the
respiratory therapist has increased the
FIO2 from 30% to 40% on the ventilator.

2. The patient’s HR has recently increased 
from a dangerously low value of 45 beats 
per minute to 72 beats per minute. 
Unfortunately, the bedside monitor has no 
way of knowing that a nurse has increased 
the drip rate of a cardioactive medication. 


Patients in today’s ICUs can have 50 or more 
electronic devices attached (Mathews and 
Pronovost 2011). Many of these electronic 
devices were developed by independent
companies and do not easily interface or
communicate with each other. Even though
the larger monitoring companies have
purchased several of the specialty monitoring
companies, problems still exist, although the
need was understood more than three decades
ago and standards for bedside data interchange
(CEN ISO/IEEE 11073) were developed
(Gardner et al. 1989, 1991). The Medical
Information Bus (MIB) is the simple term used
to designate CEN ISO/IEEE 11073. So, why has
the MIB been a commercial failure to this point?
There are multiple reasons; unfortunately, 
the MIB standard was designed during the 
time when serial communications via RS-232 
was the norm; there were no Universal Serial 
Bus (USB) interfaces or convenient wireless 
devices (Wi-Fi or Bluetooth) at the bedside. 
Furthermore, each vendor of bedside devices 
and ICU data management systems would 
like to be the “data integrator” (for a price) 
and thus has little incentive to adhere to 
standards that would allow other vendors to 
compete for the integrator role. The business 
model apparently has not worked (Kennelly 
and Gardner 1997; Mathews and Pronovost 
2011). 

In spite of the lack of interface standards, 
the group at Intermountain Healthcare, LDS 
Hospital (Salt Lake City, Utah) has been 
actively interfacing ventilators, IV pumps, and 
similar devices for over three decades (Dalto 
et al. 1997; Vawdrey et al. 2007). Another 
effort is the Medical Device “Plug-and-Play”
Interoperability Program at Partners
HealthCare (Boston, Massachusetts):
http://www.mdpnp.org. Its open source
result is OpenICE, an implementation
framework of an Integrated Clinical Environment.


21.5.5 Challenges of Bedside 
Monitor Alarms 


Care of the critically ill is complex and chal- 
lenging. Most of these patients have medical 
problems or injuries that are life threatening. 
They might have heart problems that within 
minutes could result in sudden death, or they 
might have breathing problems that require 
mechanical ventilation to maintain life. As a 
consequence, each of these situations requires 
intense minute-by-minute observation with 
real-time, continuous physiologic monitor- 
ing. For those conditions, the requirement for 
record keeping, monitoring, and alarming is 
intense. 

There is a clear expectation that bedside
physiologic monitors, ventilators, IV pumps,
and similar devices attached to patients
should provide true and valid alarms and that
caregivers will be promptly notified and will
provide the needed care immediately for those
patients (Kowalczyk 2011). On the other 
hand, a report from the New England Journal 
of Medicine outlines 24 electronic require- 
ments for classification of a hospital as hav- 
ing a comprehensive electronic record system 
(Jha et al. 2009), yet recording of data from 
bedside physiologic monitoring systems with 
their alarming systems and data gathering 
from other bedside devices such as ventilators 
and IV pumps were not even mentioned. 

So, currently there is a curious and inex- 
plicable set of expectations being generated 
for care of the critically ill patients. As a con- 
sequence, there are niche vendors who have 
built their own data gathering and recording 
systems and nurse charting systems; in some 
cases these systems include simple interfaces 
to allow them to acquire laboratory data and 
perhaps data from the administrative admis- 
sions process. They may even include bedside 
computers or displays to allow care givers to 
have access to such things as radiographic 
images, dictated reports, and others. However, 
these systems are stand-alone devices and do 
not typically provide interfaces to transmit 
their physiologic data to the hospital’s EMR. 

Over the years, the number of physiologic
signals that can be and are being monitored has
grown. With each signal and derived parameter
that is added there is typically a high and
low alarm added to warn the clinical staff of 
actual or impending patient crisis. Alarms may 
be highlighted on the bedside monitor’s screen 
by using a color change or flashing indicators. 
Most alarms also generate a sound. 

Imhoff and Kuhls (2006) noted from 1.6 
to 14.6 alarms for each ICU patient each 
hour; up to 90% of those alarms were false. 
Alarm overload is clearly a significant issue
in ICU monitoring; help from clinical informatics
professionals working in the ICU is needed
to minimize the number of false alarms. Just
noting the titles of several editorials and arti- 
cles should be informative: 

1. Alarms in the intensive care unit: How can 
the number of false alarms be reduced? 

(Chambrin 2001) 
2. Monitoring the monitors - beyond risk
management (Thompson and Mahajan 
2006) 

3. Alarms and human behavior: Implications 
for medical alarms (Edworthy and Hellier 
2006) 

4. Alarms in the intensive care unit: Too 
much of a good thing is dangerous: Is it 
time to add some intelligence to alarms? 
(Blum and Tremper 2010) 

5. Intensive care unit alarms — How many do 
we need? (Siebig et al. 2010) 


Figures 21.7 and 21.8 give examples of the
complexity of determining whether an alarm is
true or false based on two life-threatening
conditions. Alarms for ventricular tachycardia
are shown in Fig. 21.7. Figure 21.7a shows
a true ventricular tachycardia alarm condition
while Fig. 21.7b shows a false ventricular
tachycardia condition. Figure 21.7b
has only a few seconds of ECG artifact, which
causes the bedside monitor’s alarm detection
system to issue an alarm.

Arterial hypotension alarms are shown
in Fig. 21.8. Figure 21.8a shows a true
arterial hypotension alarm condition while
Figure 21.8b shows a false condition. If
the monitor or human observer only watches 
the ABP signal, the two conditions appear 
similar. However, by simultaneously follow- 
ing the ECG signal, the human observer will 
note that for some unknown reason the ABP 
signal displays a false representation of the 
patient’s pulsatile BP. The unknown reason is 
likely related to the catheter and tubing parts 
of the arterial monitoring system. Alerting 
the clinical staff to examine the catheter and 
transducer system is certainly appropriate. 

Biomedical informatics specialists, bio- 
medical engineers, and bedside monitor ven- 
dors have recently renewed their efforts to 
reduce false alarms and improve the relevance 
of existing alarms. Most of the false alarms 
are caused by noise or artifacts in the primary 
signals. To help minimize these problems, two 
examples are used to illustrate the challenges 
and opportunities to improve bedside alarms. 
1. After observing over 200 hours of alarms
from bedside monitors and ventilators in
an adult medical ICU, Gorges and his
colleagues (2009) used the data recorded to
recommend a two-step process that would
dramatically reduce the number of false
alarms. The first step was to add a 19 second
delay into the alarming system. That
step by itself reduced the number of alarms
by 67%. They then noted that by having
some method for automatically detecting
when a patient was being suctioned,
repositioned, given oral care or being washed,
there would be a further 13% reduction of
ineffective alarms. By using just these
two methods, almost 80% of the false
alarms could be eliminated.

2. Using multiple signals to derive identical
measures should be an effective method of
reducing false alarms (Herasevich et al.
2013). As can be seen in Fig. 21.7,
there are five signals that can be used to
derive HR: ECG 1, ECG 2, ECG 3, ABP,
and pulse oximeter. Since the probability
of all those signals having an artifact at the
same time is smaller than that of any single
physiologic signal having one,
smart alarm algorithms that are more
robust should be possible. Two investigators
have developed and tested such algorithms
(Zong et al. 2004; Poon 2005). The
Zong pressure alarm algorithm reduced
false alarms from 26.8% to 0.5%. Poon
found that the usual HR and rhythm alarm
system produced 65.4% false alarms, while
an algorithm that integrated multiple signals
generated only 31.5% false alarms.
Two other findings from the Poon study
were also encouraging. By merely delaying
the alarms by 10 seconds there was a 60%
reduction in false alarms. In addition, he
found that default settings for high and
low HR alarms were not optimized to prevent
false alarms. For example, if a patient
had an average HR of 65 beats per minute
and the default low HR alarm was 60 beats
per minute, there was an increased likelihood
of false low HR alarms. Several bedside
monitor vendors now provide these
more sophisticated alarm algorithms in
their newest monitors.


Still other informatics specialists have found
different strategies to provide more accurate
ABP and cardiac arrhythmia alarm rates
(Aboukhalil et al. 2008; Zhang and Szolovits
2008). Having electronic archives of physiologic
waveforms that are publicly available
should permit development of even better
smart alarm algorithms, which should lead
to a reduced number of false alarms generated
by bedside monitors (Saeed et al. 2011;
Burykin et al. 2011).

Fig. 21.7 Ventricular tachycardia alarm conditions.
a A true alarm; note that the ventricle is still pumping
but that the arterial pulse pressure is dramatically
reduced. b A false alarm caused by artifact in the ECG
signal; note the ABP waveform is stable during the same
time interval. ABP indicates arterial blood pressure,
ECG electrocardiogram, v-tach ventricular tachycardia

Fig. 21.8 Arterial hypotension alarm conditions. a
A true alarm; note the normal ventricular beats followed
by ventricular fibrillation that renders the heart
unable to generate an effective blood pressure. b A false
alarm; note for some nonphysiologic reason the arterial
pressure signal loses its pulsatile characteristics and then
eventually it returns. ABP indicates arterial blood
pressure, ECG electrocardiogram
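
The alarm-delay strategy reported in this section (a 19 second delay in Gorges et al. 2009, a 10 second delay in Poon 2005) can be sketched as a rule that sounds an alarm only when a limit violation persists for the whole delay. The code below is a minimal illustration under that assumption; the function name, the SpO2 limit of 90, and the sample trace are hypothetical and are not the algorithms used in those studies.

```python
def delayed_alarms(samples, low_limit, delay_s):
    """Return the times (s) at which a low-limit alarm would sound.

    samples: list of (time_s, value) in time order.
    An alarm sounds once per violation episode, and only after the value
    has stayed below low_limit for at least delay_s seconds.
    """
    alarm_times = []
    violation_start = None
    announced = False
    for time_s, value in samples:
        if value < low_limit:
            if violation_start is None:
                violation_start = time_s
                announced = False
            if not announced and time_s - violation_start >= delay_s:
                alarm_times.append(time_s)
                announced = True
        else:
            violation_start = None  # violation ended; reset the episode
    return alarm_times


# Hypothetical 1 Hz SpO2 trace: a 5 s dip (for example, motion artifact)
# followed later by a sustained 30 s desaturation.
values = [97] * 5 + [85] * 5 + [97] * 10 + [85] * 30
samples = list(enumerate(values))

print(delayed_alarms(samples, low_limit=90, delay_s=0))   # both episodes alarm
print(delayed_alarms(samples, low_limit=90, delay_s=19))  # only the sustained one
```

The cost of the delay is that a true event is announced later, so the delay length is a clinical judgment rather than a purely technical setting.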


21.5.6 Biomedical Sensors 


One hundred years after the introduction of
traditional vital signs, they are still in use even
though they do not describe exactly what is
happening with patients. Noninvasive technologies
for rapidly assessing physiologic function do
exist, but have not replaced traditional measurements. Heart
function could be measured by stroke volume, 
arterial oxygenation by pulse oximetry, and 
ventilation using capnography. 


21.5.7 Strategies for Incorporating 
Bedside Monitoring Data 
into an Integrated Hospital 
EMR 


Three general strategies are currently used to 
transfer bedside monitoring data into the hos- 
pital’s EMR. 
The first is the simplest and still widely
in use: nurses observe data presented on the
bedside monitor screen and manually key
the observations into an integrated EMR. As
simple as this may be to implement, such a
manual data collection strategy is inefficient,
error prone (error rates of up to 15% when data
are documented on paper and then typed into the
EMR; Wager et al. 2010), and does not capture
data representative of what the bedside monitor
gathers.

The second strategy used by ICU infor- 
mation systems is to acquire vital sign data 
directly from the bedside monitoring system’s 
network by using an HL7 feed (see Chap. 7).
The information is automatically gathered
by the ICU information system and nurses 
have the option of either accepting or modify- 
ing the data. In typical clinical settings, nurses 
perform the selection and transfer of bedside 
monitoring data from the ICU information 
system to the EMR about once an hour. These 
ICU information systems typically retain the 
high frequency bedside monitoring data and 
can achieve near-real-time computerized deci- 
sion support. In many cases, the nurse’s notes 
are also entered into the ICU information 
system — generally once per shift — and some 
summary vital sign information may find its 
way into those notes. Physician progress notes 
are also entered into ICU information sys- 
tems in a similar fashion. Unfortunately, data 
in the ICU information system may never find 
its way into the hospital’s EMR. For these sys- 
tems, the ICU data are usually archived sepa- 
rately. As a consequence, these data cannot 
be used for real-time decision making by the 
hospital’s EMR. 
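
To make the HL7 feed in the second strategy concrete, the sketch below pulls numeric vital signs out of the OBX segments of an HL7 version 2 style message. It assumes the usual pipe-delimited layout, with the observation identifier in OBX-3, the value in OBX-5, and the units in OBX-6. The sample message, patient identifier, and observation codes are invented for illustration, and a production interface would use a full HL7 library rather than string splitting.

```python
def parse_obx_vitals(message):
    """Extract {observation name: (value, units)} from OBX segments.

    Assumes HL7 v2 style delimiters: segments separated by carriage
    returns, fields by '|', components by '^'. Non-numeric results are
    skipped in this simplified sketch.
    """
    vitals = {}
    for segment in message.replace("\n", "\r").split("\r"):
        fields = segment.split("|")
        if fields[0] != "OBX" or len(fields) < 7:
            continue
        identifier = fields[3].split("^")
        name = identifier[1] if len(identifier) > 1 else identifier[0]
        try:
            value = float(fields[5])
        except ValueError:
            continue
        vitals[name] = (value, fields[6])
    return vitals


# Hypothetical message; identifiers and codes are illustrative only.
SAMPLE = "\r".join([
    "MSH|^~\\&|MONITOR|ICU|EMR|HOSP|20240101120000||ORU^R01|MSG0001|P|2.6",
    "PID|1||HYPOTHETICAL123",
    "OBX|1|NM|8867-4^Heart rate^LN||72|/min|||||F",
    "OBX|2|NM|59408-5^SpO2^LN||97|%|||||F",
])

print(parse_obx_vitals(SAMPLE))
```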

The third strategy is to have the ICU infor- 
mation system or the hospital’s EMR system 
automatically transfer vital sign data from the 
bedside monitoring system to the EMR. Most 
systems that automatically gather data with 
this strategy take a median of the vital sign 
data over a 15 minute time interval to smooth 
the data (Warner et al. 1968; Gardner et al. 
1991; Vawdrey et al. 2007). This strategy pro- 
vides real-time data for computations and 
computerized decision support for the hospi- 
tal’s EMR and is the preferred strategy. 
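
The 15-minute median smoothing used by the third strategy can be sketched as follows. The function name and sample values are hypothetical; the point is only that a median over each interval is insensitive to a brief artifact that would badly skew a mean.

```python
from collections import defaultdict
from statistics import median


def interval_medians(samples, interval_s=900):
    """Return {interval start (s): median value} for fixed-width intervals.

    samples: iterable of (time_s, value). The default interval of 900 s
    corresponds to the 15 minute window described in the text.
    """
    buckets = defaultdict(list)
    for time_s, value in samples:
        buckets[int(time_s // interval_s)].append(value)
    return {
        index * interval_s: median(values)
        for index, values in sorted(buckets.items())
    }


# Hypothetical HR samples; the 180 at t=600 s stands in for an artifact.
hr_samples = [(0, 80), (300, 82), (600, 180), (900, 84), (1200, 86)]
print(interval_medians(hr_samples))
```

In the first interval the mean would be 114 beats per minute, while the median stays at 82, which is why a median is a reasonable choice for charted values.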

There are opportunities to improve the
automated data gathering from bedside monitors,
especially if the false alarm rate can be
minimized. In addition to acquiring 15-minute
median data, one may wish to detect bedside 
alarms and record data in the intervals just 
before and just after these alarms. Thus, there 
is still opportunity for informaticians to make 
major improvements in both data recording 
and bedside monitoring alarms. 


21.6 Monitoring and Advanced 
Information Management 
in ICUs 


21.6.1 Early Pioneering in ICU
Systems


As the volume of electronic information gathered
in ICUs and hospitals grew, problems with
bedside monitors, alarms, and data integration
became more apparent. The next step in
information management began 50 years ago 
at LDS Hospital where a team of people devel- 
oped what was known as the HELP (Health 
Evaluation Through Logical Processing) 
System (Pryor et al. 1983; Kuperman et al. 
1991; Gardner et al. 1999). 

HELP was the first hospital informa- 
tion system to collect patient data needed 
for clinical decision making and at the same 
time incorporate a medical knowledge base 
and inference engine to assist the clinician 
in making decisions. The HELP system has 
been operational at LDS Hospital since 1967. 
The system initially supported a heart cath- 
eterization laboratory and a postoperative 
open heart ICU. Initially only physiologic 
data were acquired from the bedside moni- 
tors. Nursing note charting promptly followed 
with ability to chart medications ordered and 
given, including IV drip rates. Soon, it became 
apparent that much of the data needed to care 
for these critically ill patients came from the 
clinical laboratory and other sites such as 
radiology. As a consequence, multiple modules
were added to the HELP system to support
the ICUs.

Clinical decision making in the ICU is
complex. Physicians, nurses, respiratory
therapists, pharmacists, and others evaluate
each patient using different types and modalities
of data. In 1984, a study was performed
to identify what data were used by the critical
care team to make clinical care decisions
(Bradshaw et al. 1984). The investigators were
surprised to find that data from the physiologic
monitor accounted for only 13% of
the data used to make treatment decisions.
Table 21.2 outlines the data types evaluated
with the percentage of time that each type of
data was used to make a care decision. Many
of the data came from automated instruments
in the laboratory, but a large number came
from nurse observations and actions that
were manually charted into the computerized
record.

Table 21.2 Data used for ICU decision making and their sources

Data types                          %    Data source
Clinical & blood-gas laboratories   42   Laboratory interfaces
Drug, I/O, IV                       22   Nurse charting & IV pump interface
Observations                        21   Nurse charting & physician notes
Physiologic data                    13   Bedside monitor interface
Other                               2

Adapted from Bradshaw et al. (1984) (see Fig. 21.3)
Abbreviations: ICU intensive care unit, I/O intake and output, IV intravenous

However, as described earlier, the physi- 
ologic monitor serves a very crucial func- 
tion during life-threatening situations such 
as cardiac arrest. The observations showed 
the crucial need for a fast and reliable labora- 
tory interface and the importance of data that 
came from nurse charting. Knowing what 
drugs the patient was receiving, when those 
drugs were given, and the types and adminis- 
tration rates of IV medications were crucial to 
clinical decision making. 
21.6.2 Recent Advances in ICU
Clinical Management 
Systems 


We are now almost two decades past the
Institute of Medicine’s “To Err is Human”
report (Kohn et al. 2000), and medical
error remains one of the leading causes of poor
outcomes in hospitalized patients (Landrigan
et al. 2010). Despite high hopes, the current
generation of EMRs has made things worse,
particularly in acute settings where infor- 
mation overload poses a major challenge to 
timely, evidence-based patient care (Han et al. 
2005; Pickering et al. 2012). When caring for 
unstable patients, providers often have only a 
few minutes to wade through medical records 
before making critically important decisions. 
Clinical decision-making is often hindered by 
patient information that is difficult to access 
and use in modern EMRs, which increases the
potential for error and delays treatment. 

Beginning in 2009 a group of clinicians, 
clinical informatics specialists, and computer 
programmers at Mayo Clinic (Rochester, 
Minnesota) developed the Ambient Warning
and Response Evaluation System (AWARE)
to address those challenges. AWARE
is a data assimilation, communication,
workflow, and decision support tool that has the
ability to enhance the EMR experience (Ahmed
et al. 2011). The system is configured to allow
a more rapid assessment of a patient’s clinical
data, freeing time to focus on other important
patient needs. AWARE has been designed, 
tested, and validated to foster the best clini- 
cal practice by specifically addressing the 
key clinical components known to improve 
patient outcomes (Olchanski et al. 2017). 

The suite of tools is designed to address
patients’ needs and to focus on acute
patient-centered problems rather than being
organized around specific databases or clinical
services.

The main components of AWARE can
be divided into five domains (Fig. 21.9):

1. Dashboards/viewers. Included are Multi- 
patient (MPV) and Single-patient viewers 

(SPV) to reduce information overload by 
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AWARE 


Dashboards/viewers 


“Sniffers 
or smart alerts” 


Multipatient Single patient 
viewer viewer 


Rounding tool 
(Checklist) 


e Time sensitive e Group level 
clinical population 
surveillance management 


e Pertinent clinical 
information 


e Structured 
clinical 
assessment 


Communication tools 


Administrative 


dashboard 
e Resource + Transfer 
planning, Quality essential 
improvement information at a 
glance focused 
on patient 
problems 


e Links provider 
and patients tasks 


Task list and 
whiteboard 


e Shared list of 


e Outside of 
clinical note 


e One stop 
communication 


O Fig. 21.9 Essential elements and logical structure of the Ambient Warning and Response Evaluation System 


(AWARE) clinical management system 


facilitating real-time access to key informa- 
tion needed for timely medical and interven- 
tional decision making at the point of care. 

2. Sniffers or rule-based smart alerts. 
Continuously survey both the patient con- 
dition and provider actions detecting 
potential mismatches and preventing 
potential errors before they occur. 

3. Communication tools. AWARE white- 
board, task-list, readiness for discharge and 
claim patient functions facilitate communi- 
cation between team members and during 
transitions of care, thereby preventing com- 
mon errors of communication omission. 

4. Checklist/rounding tool. Designed to
assist providers in developing and execut-
ing a coordinated daily plan of care. The
easy-to-use interface minimizes clerical 
burden while simultaneously assuring 
adherence to patient-centered best care 
practices and regulatory requirements. 

5. Administrative dashboard. This feedback and
reporting tool enables easy access to quality
improvement metrics and patient outcomes 
for administrators and oversight groups, 
which facilitates rapid-cycle management 
changes based on continuous feedback. 
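
The sniffer idea in item 2 above, comparing the patient's condition with provider actions and flagging mismatches, can be sketched as a small rule table. This is a minimal illustration only: the class `PatientSnapshot`, the function `sniff`, and the condition/action names are hypothetical and are not taken from AWARE, and the rule pairs are placeholders rather than clinical guidance.

```python
from dataclasses import dataclass, field


@dataclass
class PatientSnapshot:
    """Hypothetical, simplified view of one patient's current state."""
    patient_id: str
    findings: set = field(default_factory=set)          # detected conditions
    provider_actions: set = field(default_factory=set)  # actions already charted


# Hypothetical rule table: condition -> action expected once it is detected.
EXPECTED_ACTIONS = {
    "suspected_sepsis": "antibiotics_ordered",
    "on_ventilator": "sedation_assessment_charted",
}


def sniff(snapshot: PatientSnapshot) -> list:
    """Return one alert string per condition that lacks its expected action."""
    alerts = []
    for condition, action in sorted(EXPECTED_ACTIONS.items()):
        if condition in snapshot.findings and action not in snapshot.provider_actions:
            alerts.append(f"{snapshot.patient_id}: {condition} without {action}")
    return alerts
```

In a running system a function like this would be re-evaluated whenever new data or new charted actions arrive, so that a mismatch is raised before it becomes an error.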


21.6.3 Acquiring the Data: Quality 
and Timeliness 


A fundamental part of any computerized 
decision support system, just as with any 
human clinical decision support system, is 
the acquisition of data. Clinicians develop 
observational, interpersonal, and technical 
skills as they collect accurate patient data. 
Likewise, a computerized decision support 
system depends on high-quality, timely data. 
In many ICUs today, much medical data is still
entered into computerized patient records
as scanned PDF files or in structured and
coded formats, while other data (such as the
progress note) are stored in a free-text
format (either handwritten or typed)
(Celi et al. 2001; Pickering et al. 2010; Ahmed
et al. 2011; Hripcsak et al. 2011). As noted
in Chap. 8, natural language processing
of free text to obtain coded and structured
information has seen great improvement over
the past decades; however, the process is still
far from perfect, and all processing is delayed
until the data are entered into the EMR. Such
delays limit the use of free-text data in
real-time monitoring systems.




As designers of clinical monitoring sys- 
tems look at acquiring and entering clinical 
data they must decide: 

1. Who should enter the data: automated 
acquisition from electronic instruments 
(such as the bedside monitor) versus man- 
ual entry using a keyboard, bar code 
reader, touch screen, voice input, or some 
similar method. 

2. When to enter the data: accurate ICU deci- 
sion making often requires data to be 
acquired in a timely manner, sometimes 
within 1 minute of an event to make a 
timely decision. 

3. Where to enter the data: automated
data will naturally be acquired from the
bedside monitor or instrument located at
the bedside; manual data entry should
optimally occur at the bedside as well.

4. How data should be collected: methods 
should take into account the occurrence of 
artifacts in the patient data; many EMR 
systems allow nurses to review and vali- 
date bedside vital sign data minutes to 
hours after they are collected, although 
this process does not meet the requirement 
for real-time data collection and can lead 
to “human” and computerized decision 
support errors (Nelson et al. 2005; Vawdrey
et al. 2007).

5. How much data to collect: this is particu-
larly an issue with systems such as bedside
monitors that can generate HR and systolic
and diastolic BP values for each heartbeat,
resulting in hundreds of thousands of val-
ues per day; except in special situations,
such as use by automated decision support
systems, storing such high-frequency data
in the regular EMR is inappropriate.
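
To make the data volume in item 5 concrete, the sketch below works through the arithmetic and shows one possible way to reduce beat-to-beat values before storage. The heart rate of 80 beats per minute, the three values per beat, and the choice of a per-minute median are assumptions made for illustration; the text does not prescribe a particular reduction method. A median is used here because it is less affected by an isolated artifact than a mean.

```python
from collections import defaultdict
from statistics import median


def values_per_day(heart_rate_bpm: int, values_per_beat: int = 3) -> int:
    """Beat-to-beat values generated in 24 hours (HR, systolic BP, diastolic BP)."""
    return heart_rate_bpm * 60 * 24 * values_per_beat


def per_minute_median(samples):
    """Collapse (seconds_since_start, value) samples to one median per minute."""
    buckets = defaultdict(list)
    for seconds, value in samples:
        buckets[int(seconds // 60)].append(value)
    return {minute: median(vals) for minute, vals in sorted(buckets.items())}


# At an assumed 80 beats/min: 80 * 60 * 24 * 3 = 345,600 values per day.
print(values_per_day(80))
```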


The process of developing and implementing
systems for acquiring data involves not
only technology but also adapting that
technology to the human users; training those
users to properly use the new system is complex
and difficult. Consequently, developers and
adopters of such systems should plan for
challenges and expect that it may take years
to implement and optimize the system.

Despite modern protocols like HL7 and
FHIR, there are still major problems with
acquiring ICU data into the EMR, either
automatically or manually (Gardner et al. 1989, 1991;
Dalto et al. 1997; Nelson et al. 2005; Vawdrey 
et al. 2007). Data from bedside monitors, 
ventilators and IV pumps should be acquired 
automatically with a real-time technology.
Data thus acquired are timely and can be
validated by appropriate signal processing
methods (Dalto et al. 1997; Vawdrey et al. 2007;
Ahmed et al. 2011; Lilly et al. 2011). Changes
in ventilator settings such as FIO2 may only
be present for a few minutes, but blood-gas
measurements taken during that time interval
will be misinterpreted if only manual
electronic charting is used. Similar interpretation
errors were found to occur with IV pump drip 
rate charting when manual charting methods 
were compared to automated acquisition. 
Gathering accurate, representative, and timely 
computerized ICU data requires attention to 
detail and careful planning to assure its
quality. Bedside charting systems have the ability
to capture near real-time data from bedside
devices, but the presentation layer usually
shows normalized and averaged values.
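
The FIO2 example above can be illustrated with a small lookup that returns the setting in effect when a blood gas was drawn. This is a sketch under stated assumptions: the function `setting_at` and all timestamps and values are hypothetical, chosen only to show how a brief setting change captured by an automated feed can be absent from manually charted data.

```python
from bisect import bisect_right


def setting_at(timestamps, values, when):
    """Return the most recent recorded setting at or before `when`, else None.

    `timestamps` must be sorted ascending and parallel to `values`.
    """
    idx = bisect_right(timestamps, when)
    return values[idx - 1] if idx else None


# Minutes since midnight; illustrative numbers only.
auto_times, auto_fio2 = [480, 600, 605], [0.40, 1.00, 0.40]   # device feed
manual_times, manual_fio2 = [480, 720], [0.40, 0.40]          # charted by hand

blood_gas_drawn_at = 602
print(setting_at(auto_times, auto_fio2, blood_gas_drawn_at))      # automated record
print(setting_at(manual_times, manual_fio2, blood_gas_drawn_at))  # manual record
```

With these made-up numbers the automated record attributes the blood gas to the temporary FIO2 of 1.00, whereas the manual record attributes it to 0.40, which is the kind of misinterpretation the paragraph describes.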


21.6.4 Presentation of Data 


Once data have been collected, their qual- 
ity verified, and the results stored, one must 
decide how the data should be presented. 
Currently, most data are presented on a col-
orful screen. However, some caregivers will
still prefer a paper copy. Still others will prefer 
to view these reports on their smart phones 
or other mobile devices. For ICU patients, 
it is clear that specialized reports must be 
developed. The traditional method of seg- 
mented reporting (separate reports for labo- 
ratory data, vital signs reports, medication 
lists, etc.) has proven inadequate (Clemmer 
2004; Ahmed et al. 2011). The ICU group at 
Mayo Clinic has developed and tested an ICU 
rounding tool (Pickering et al. 2010). Thus,
one can see there is value in the integration
and presentation of data. As of this writing,
there is probably no single ICU summary
report that will satisfy all ICU users.
Thus, such reports will require special effort
for each institution and perhaps even each
ICU within that institution. For example, the
report generated for a thoracic ICU is unlikely
to be identical to that required by the neona-
tal ICU. Accomplishing such tasks typically
requires 6 months or more, with continuous
ongoing effort to update the report as new
data are acquired and caregivers’ needs evolve.


21.6.5 Establishing the Decision 
Rules and Knowledge Base 


Deciding on the decision rules that should 
be installed in a computerized ICU decision 
support system is difficult. Health care is cur- 
rently driven by implementing evidence-based 
protocols. However, few of these protocols 
have been computerized. The long-standing 
work with the HELP system and some excit- 
ing work done at Mayo Clinic and at the 
University of Massachusetts are exceptions 
(Clemmer 2004; Morris 2000; East et al. 1992; 
Ahmed et al. 2011; Lilly et al. 2011). Using a 
consensus process to develop treatment deci- 
sions is essential. However, generating a con- 
sensus is a tedious, difficult, and slow process. 
At the moment, the consensus process involv- 
ing all the clinical caregivers in the ICU is the 
best approach, as rules developed by indi- 
viduals are often not widely accepted or used. 
However, in some departments there may be 
trusted clinical leaders who become the “local 
expert.” Developing the rules for clinical deci- 
sion support is complex and those rules are 
always subject to change. Development of 
appropriate rules can take up to 6 months 
and the rules will need to be continuously 
reviewed and updated (Gardner 2004; Ahmed 
et al. 2011; Lilly et al. 2011). 


21.6.6 Clinical Charting Systems: 
Nurses, Pharmacists, 
Physicians, Therapists 


The major portion (43%) of the data used 
at LDS Hospital for decision-making dur- 
ing ICU rounds came from clinical notes and 
data charted by nurses and other clinicians 
(Bradshaw et al. 1984). In a more recent study in
the ICU at Mayo Clinic, the team found that as
they developed the AWARE system, they required
similar data content (Pickering et al. 2010).

At LDS Hospital, the computerized nurse 
charting module allows nurses to enter patient 
care tasks, qualitative and quantitative data, 
and a patient’s response to therapy (Willson 
1994; Willson et al. 1994; Nelson et al. 2005). In 
addition, nurses interact with a pharmacy mod- 
ule to chart all given medications including IV 
drip rates (Pryor 1989; Kuperman et al. 1991). 

Soon after the nurse charting was imple- 
mented at LDS Hospital, respiratory thera- 
pists chose to enter their qualitative and 
quantitative ventilator data and care given to 
patients (Andrews et al. 1985; Gardner 2004). 
The motivation for the online charting was to 
provide clinicians with access to timely and 
accurate data to make patient care decisions. 
In addition, these data could be used to imple- 
ment protocol-controlled ventilator weaning 
systems (East et al. 1992; Morris 2001). 

To optimize the performance of routine 
care deemed essential for ICU patient recov- 
ery, computerized reminders were generated 
(Oniki et al. 2003). For example, one of the
goals of the reminders was to provide assis-
tance in determining the required level of
sedation while avoiding oversedation. Providing
the computerized reminders to nurses reduced
charting deficiencies by 40% and improved
the number of deficiencies remaining at the
end of the shift. To optimize the care supported
by the reminders, real-time charting was
required. However, during a quality improvement
process, it was determined that 29% of
the medication errors that should have been 
prevented by online nurse charting were still 
present. A careful evaluation revealed that the 
actual nurse charting workflow was different 
than that envisioned by the system planners. 
Instead of charting the given medication using 
a bedside terminal, nurses administered the 
medication and then at some later time, at the 
central nursing station, charted that the medi- 
cation had been given. Consequently, errors 
were occurring. After careful training and 
feedback with the nursing staff, the real-time 
charting rate increased from 40% to 75% and 
remained at that level a year later. This exam-
ple shows that having computerized decision
support systems in place without real-time
data entry is ineffective. Conceptually,
one could make the same logical observation
if the ICU were operating as a tele-ICU, as
discussed later in this chapter.
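
A real-time charting rate such as the one reported above can be computed from pairs of administration and charting times. The sketch below is an assumption-laden illustration: the function name and the 5-minute window are hypothetical, and the cited study's own definition of "real-time" is not given in the text.

```python
from datetime import datetime, timedelta


def real_time_charting_rate(events, window=timedelta(minutes=5)):
    """Fraction of (given_at, charted_at) pairs charted within `window`.

    The 5-minute window is an assumed threshold, not one taken from the study.
    """
    if not events:
        return 0.0
    on_time = sum(1 for given, charted in events if charted - given <= window)
    return on_time / len(events)
```

Tracking this fraction over time is one way to verify that training and feedback actually change the charting workflow, as opposed to assuming that the system is used as its planners envisioned.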

For generations, nurses and other caregiv- 
ers who have used conventional paper records 
have had the notion that if their paper chart 
was up-to-date at the end of the shift then 
they had met their requirements for good 
patient care. Clearly, the above example shows 
that such a strategy is flawed. However, it is 
interesting that even today reports are being 
made about charting and use of data for end- 
of-shift nursing care exchanges and patient 
handovers, suggesting that the EMR still may 
not be real-time (Hripcsak et al. 2011; Collins 
2011). Collins (2011) found that clinicians
preferred oral communication to EMR
documentation and stated that the
perception that the EMR was a shift behind
might have been only a manifestation of
the lack of real-time charting by nurses and
of real-time data acquisition from bedside
monitors in their ICU.

An early survey of nurses’ and physicians’
use of the HELP clinical expert system was 
conducted in 1994 (Gardner and Lundsgaarde 
1994). The investigators were encouraged by a 
positive response from both nurse and physi- 
cian users who appreciated having the data 
available with interpretation and alerting fea- 
tures provided by the HELP system. At the 
time the survey was conducted, ICU charting 
and decision support was a major feature of 
the HELP system. It is exciting to note that 
other institutions have begun to assess factors 
related to acceptance of an EMR in critical 
care (Carayon et al. 2011). The Carayon study 
showed that ease of use, as well as data pre- 
sentation strategies, were major determinants 
of acceptability of their system. 


21.6.7 Automated Data Acquisition 
From All Bedside Devices 


Computer systems that support ICU patients
are tightly integrated, and data are auto-
matically gathered and stored, primarily in a
coded format so that real-time computerized
decision support can be used. Figure 21.10
shows a schematic of the HELP system at LDS
Hospital and Fig. 21.11 shows the AWARE
system at Mayo Clinic as examples of such
systems. Based on the data available to an ICU
management system from these multiple
data sources, its computerized decision sup-
port system makes and displays suggestions
for optimum care for specific problems
such as sepsis and acute respiratory distress
syndrome. The system provides audible and
visual alerts for life-threatening situations. In 
addition, the system organizes and reports the 
large amount of data so that the medical team 
can make prompt and reliable treatment deci- 
sions. The patient’s physicians are automati- 
cally alerted about life-threatening laboratory 
and other findings. 

Much of the information required for 
ICU patient care comes from underlying 
laboratories and devices that automatically 
acquire data. In the upper right hand cor- 
ner of the HELP system diagram, data from 
the ventilator, IV pumps, and the bedside 
monitor are noted. While most of the physi- 
ologic bedside monitor vendors now acquire 
ECG, BP, and pulse oximetry data, they do 
not provide access to data from ventilators 
or information from IV pumps. As a con- 
sequence, data from these devices must be 
obtained by developing hardware and soft- 
ware interfaces (Gardner et al. 1991; Dalto 
et al. 1997; Kennelly and Gardner 1997; 
Vawdrey et al. 2007). Based on those studies, 
it is clear that automatically collecting data 
from all of these devices in real-time is more 
timely and accurate than manually charted 
data collected by nurses or respiratory thera- 
pists. Although data from these devices can 
contain artifacts, methods for minimizing 
those artifacts have been implemented in 
operational systems. 

Initially the MIB standard CEN ISO/
IEEE 11073 was designed to help gather data
from bedside devices, but it has not been widely
implemented (Mathews and Pronovost 2011).
Fortunately, battery power and wireless com-
munications with IV pumps are now widely
available. By using wireless technology, inter-
faces with the IV pumps are fast, mobile, and
easy for nurses to implement, and tangled wires
are no longer an issue. In addition, communi-
cation with devices can be maintained throughout
the hospital: in the OR, while on transport,
and in the ICU.

Although early studies of nurses and
therapists showed that computerized charting
took longer than manual charting, it is almost
certain that, with the automated acquisition
available today, charting takes less time and is
more accurate. As a consequence, in institutions
that have historically collected IV pump and
ventilator data automatically, there is a com-
mitment to collect data from every bedside
monitoring device. These include measures
of urine output, fluid drainage, and similar
measures.

Fig. 21.10 Diagram of the HELP system used by Intermountain Healthcare’s hospitals. At the center is the database for the electronic medical record (EMR). Data from a wide variety of clinical and administrative sources flow into the EMR. As the data flow into the EMR, the Data Driver capabilities of the HELP Decision Support Engine (red circle) are activated. In addition, Time Driver decisions are also made. Shown schematically in the upper right-hand corner of the diagram are blocks representing ICU bedside devices including the physiologic monitor, ventilator, IV pumps, and barcode scanner. ECG indicates electrocardiogram, HELP Health Evaluation Through Logical Processing, ICU intensive care unit, IV intravenous



21.6.8 Rounding Process: 
Single Patient Viewer 


The care of critically ill patients in the ICU 
requires collaboration among a diverse team 
of very competent caregivers to achieve the 
best care (Clemmer et al. 1998). Teamwork
and communication are required in this
complex care process.

The rounds activity at LDS Hospital
is an example of the collaborative process.
Figure 21.12 shows the clinical care team
during rounds. There are physicians, house
officers, advanced practice clinicians, nurses,
pharmacists, respiratory therapists, dieticians,
case managers, and others who gather each
day to assess each patient and make key care
decisions. The rounds leader is usually a phy-
sician, but each team member is considered
an equal partner, providing key information
(most of it stored in the computer record) and
given the opportunity to discuss their inter-
pretation and make recommendations about
the patient’s care.

Over decades, the social process of con-
ducting these rounds has created a very open
and cooperative environment. The purpose of
rounds is to reduce errors from human fac-
tors, to give structure to the evaluation, and
to make sure all sides of the decision process
are considered as each member considers the
decisions from their point of view.

Fig. 21.11 Diagram of the AWARE system developed and used at Mayo Clinic. The core component is a middleware database that caches data from hospital EMR systems (left). It uses a number of approaches, such as DB stored procedures and HL7. Synthesis is a Mayo Clinic home-grown API layer and DB. Data from discharged patients are archived and stored in the Datamart repository. Historical data are used for administrative reporting and cohort identification for research. The alerting module (right) is used for clinical sniffers and research enrollment in clinical trials. Shown schematically at the top are clinical applications or viewers that use data from the middleware to generate user interfaces. ADT indicates; APL; apps applications, AWARE Ambient Warning and Response Evaluation System, EMR DB Electronic Medical Record Database, HL7 Health Level Seven, ICU intensive care unit, LIS Laboratory Information System, MCHS Mayo Clinic Health System, OR operating room, PACU Post Anesthesia Care Unit, SIRS Surgical Information System



The information from the computer sys- 
tem is organized to support the process. The 
computerized record is the patient record. 
Information from other sources such as radio- 
graphic images and free-text reports are also 
readily available (Gurses and Xiao 2006). 

As a single patient viewer, AWARE
extracts data relevant to the treatment of
ICU patients and presents it to the provider
in a systems-based information package
(Fig. 21.13). AWARE content has been
selected through the systematic observation
of frontline provider information needs and
profiling of provider data utilization patterns.
The user interface has been optimized to fit a
single screen without scrollbars and can be used
as an enhancement to the bedside monitor
while treating critically ill patients.

The user interface and the organization of data
elements on the screen were determined by
considering how experts incorporate infor-
mation into decision-making mental models.
Reference ranges for laboratory abnormali-
ties in AWARE are adjusted for critically ill
patients based on expert consensus. All this
information about reference ranges, alerts,
and the type of information represented on the
interface is embedded in AWARE rules. These
rules are part of the DB and do not rely on any
third-party rules engine. This approach decreased
false-positive alerts without affecting the frac-
tion of false-negative alerts (unchanged sensi-
tivity and negative predictive value) (Kilickaya
et al.).

Fig. 21.12 ICU Rounds room at LDS Hospital in Salt Lake City. The computerized ICU rounds report is displayed by a projector on the wall to physicians, a nurse practitioner, medical students, a respiratory therapist, a pharmacist, and a patient’s family member. An important laboratory result is highlighted in red by the rounds director. Note several laptops and paper notes used by each of the participants. ICU indicates intensive care unit

Fig. 21.13 Overview of data on the AWARE single patient viewer organized around human organs and systems. Each block represents four elements. Reading from the left, the key organizational elements are clinical context pulled from the patient problem list, procedure, medications, and consults lists prior to the current hospitalization; system-identifying icons with color-coded status (red, requires urgent intervention; yellow, abnormality; white, normal physiology and investigations); middle top, the physiologic status of the organ, which displays current values for key variables; middle bottom, organ supports, which displays the critical care interventions that are supporting the current physiologic status; and, on the right part of the block, the status of relevant investigations
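
The color-coded organ status described for the single patient viewer (red, yellow, white) can be sketched as a comparison of each value against two bands. Everything below is hypothetical: the function names, the idea of summarizing an organ by its most severe color, and the numeric bands used in the usage note are illustrative assumptions, not AWARE's actual rules or its ICU-adjusted reference ranges.

```python
def organ_status(value, normal, urgent):
    """Map a value to the viewer colors: white, yellow, or red.

    `normal` and `urgent` are (low, high) tuples; `urgent` is the wider band
    outside of which urgent intervention is assumed to be required.
    """
    if not (urgent[0] <= value <= urgent[1]):
        return "red"
    if not (normal[0] <= value <= normal[1]):
        return "yellow"
    return "white"


def worst(statuses):
    """Summarize several variables for one organ by the most severe color."""
    order = {"white": 0, "yellow": 1, "red": 2}
    return max(statuses, key=order.__getitem__, default="white")
```

For example, with a made-up normal band of (3.5, 5.0) and urgent band of (3.0, 6.0), `organ_status(5.5, (3.5, 5.0), (3.0, 6.0))` falls outside the normal band but inside the urgent band and is therefore classified as yellow. Widening the bands for critically ill patients, as the text describes, is what reduces false-positive alerts.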

The popularization of checklist use in
clinical medicine began with the land-
mark publication by Dr. Atul Gawande’s team
(Haynes et al. 2009). Checklists have been shown to
reduce errors and health care costs, increase
compliance with evidence-based practice, and
ultimately improve outcomes in critically ill
patients (Weiss et al. 2011).

The smart checklist is a component of
AWARE (Fig. 21.14). The system auto-
matically detects patient characteristics
to configure a patient-specific checklist.
For example, if a patient is not on the ven-
tilator, no ventilator-related questions are
asked. In a simulation study, the checklist sig-
nificantly reduced provider workload and
errors (Thongprayoon et al. 2016). Also, use
of the checklist in the ICU was associated
with an increased number of occupational ther-
apy/physical therapy consults in critically ill
patients (Ali et al. 2017).
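
The patient-specific configuration described above, in which ventilator questions are asked only of ventilated patients, can be sketched as a list of questions paired with applicability predicates. The question wording, the attribute names (`on_ventilator`, `has_central_line`), and the function `build_checklist` are hypothetical and are not taken from the AWARE checklist.

```python
# Hypothetical checklist items, each with a predicate over patient attributes.
CHECKLIST_ITEMS = [
    ("Head of bed elevated?", lambda p: p.get("on_ventilator", False)),
    ("Ready for spontaneous breathing trial?", lambda p: p.get("on_ventilator", False)),
    ("Central line still needed?", lambda p: p.get("has_central_line", False)),
    ("Pain and sedation goals reviewed?", lambda p: True),
]


def build_checklist(patient: dict) -> list:
    """Return only the questions whose predicate matches this patient."""
    return [question for question, applies in CHECKLIST_ITEMS if applies(patient)]
```

Keeping the applicability logic next to each question means a new item can be added without touching the code that renders the checklist.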


21.6.9 Collaborative Process, ICU
Change-of-Shift,
and Handover Issues


Cognitive scientists have taken an interest in, 
and have studied, the dynamic and distrib- 
uted work environment in critical care medi- 
cine (Patel and Cohen 2008; Patel et al. 2008; 
Ahmed et al. 2011). They have studied issues 
such as provider task load, errors of cogni- 
tion, and performance of clinicians involved 
in these complex tasks. The change-of-shift 
and handover times are especially critical and 
require complex exchanges of information 
that must occur rapidly and efficiently. These 
investigators have found that errors can occur 
during this time because of corruption of 
information and a failure to transfer crucial 
care facts. Having the majority of the patient
record in electronic form and having that data
timely and accurate should allow optimiza-
tion of computerized decision making tools
and methods for sharing the patient data. The
Rounds Report developed at LDS Hospital
three decades ago and recent developments
at Mayo Clinic provide laboratory models for
better understanding the issues, improv-
ing efficiency, and eliminating medical errors
for ICU patients during shift changes and
patient handover times. The AWARE system
has incorporated a number of tools that sup-
port ICU change-of-shift and handover pro-
cesses, including shared task list, claim the
patient, and handover modules. Each tool
is designed to decrease the number of errors,
including errors of omission, as well as the
cognitive load of providers.

Fig. 21.14 Overview of the AWARE checklist. It is a tool to aid during critical care rounds by helping apply best evidence care for every patient, generate meaningful clinical notes, and collect data for precise administrative and quality reports. AWARE indicates Ambient Warning and Response Evaluation System

One of the core components of AWARE
is the multipatient viewer, a population
management tool (Fig. 21.15). The multi-
patient viewer shows the census and the geo-
graphic layout of patient rooms in a specific
ICU. By hovering over or clicking on the icons,
clinical information can be accessed quickly.
Each patient box has four distinct areas
(Fig. 21.16). The workflow or administrative
area is on the top and includes the Task
list and Checklist icons as well as the room
number and readiness for discharge to the
floor, in addition to problem list and current
medications icons. The second row is the alerting
area, where a purple circle with a line across it
gives providers immediate knowledge that a
patient’s code status is Do Not Resuscitate/
Do Not Intubate. The middle section of the
box contains patient information such as the
patient’s name, medical record number, age,
number of days in the ICU, and teams car-
ing for the patient. The lower section of the
box contains the clinical information with the
organ icons. These icons exist for the central
nervous, cardiovascular, respiratory, gastro-
intestinal, renal, and hematologic systems.
There is also an icon representing infectious
diseases with relevant data including white
blood cell counts and microbiologic specimen
results. Organ icons are color coded the same
as on the single patient viewer. The second row
from the bottom has organ support icons. All icons
were tested for recognition and recall by clini-
cal providers (Litell et al. 2012). Clicking
on a patient box launches the
Single Patient Viewer.

The AWARE system was extensively
tested. In cluster randomized trials the time
spent on preround data gathering decreased
from 12 minutes per patient before AWARE
implementation to 9 minutes after (Pickering
et al. 2015). In another study, tool usage was
associated with a 50% decrease in ICU length
of stay, a 37% decrease in hospital length of
stay, and a 30% decrease in total charges for
the hospital stay in a post-AWARE cohort (by
$43,745 after adjusting for patient acuity and
demographics) (Olchanski et al. 2017).

Fig. 21.15 Overview of the multipatient viewer. It is a population management tool where boxes represent a geographical view of patient rooms in the care unit

Fig. 21.16 Patient box on the multipatient viewer includes four groups of icons described in the text



21.7 Computerized Decision 
Support and Alerting 


In addition to the alarms from bedside moni- 
tors, there are many other types of alerts and 
decision support tools that can be helpful for 
the care of hospitalized patients. A sampling
of the types of decision support mechanisms
that have been reported is provided below
to give the reader a sense for the breadth of 
capabilities that have been applied in intensive 
care as well as other care settings of hospitals. 
Key to the application of such computerized 
decision support tools is having access to an 
integrated, real-time, accurate, and coded 
EMR. Most of the examples noted are from 
the HELP system (Gardner et al. 1999). A
key function of the HELP system is that the
computerized decision support system is acti-
vated when new patient data are added to the
patient’s database; this process is called data-
driven decision making. An example would be
that when the PO2 is put into the medical record,
an instruction is given to the respiratory ther-
apist to modify the FIO2 or PEEP (positive
end-expiratory pressure) accordingly. Some
functions of the HELP system, such as alerts,
require that computerized decision support be
activated at specific times; that process is
called time-driven decision making. An exam-
ple would be to remind the nurse that the next
glucose check is due when the patient is on
an insulin drip, or to instruct the computer
to automatically calculate today’s APACHE
(Acute Physiology and Chronic Health
Evaluation) score and update all the reports
at 06:00 AM.
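
The distinction between data-driven and time-driven activation can be sketched with a toy engine in which some rules fire when a data item is stored and others fire at a scheduled time of day. This is an illustration only and does not represent how the HELP system is implemented; the class `DecisionEngine`, its method names, and any thresholds used with it are hypothetical.

```python
from collections import defaultdict


class DecisionEngine:
    """Toy engine with data-driven and time-driven rule activation."""

    def __init__(self):
        self.data_rules = defaultdict(list)   # data item name -> callbacks
        self.time_rules = []                  # (hour, minute, callback)
        self.messages = []

    def on_data(self, item, rule):
        """Register a rule that runs whenever `item` is stored."""
        self.data_rules[item].append(rule)

    def at_time(self, hour, minute, rule):
        """Register a rule that runs at a given time of day."""
        self.time_rules.append((hour, minute, rule))

    def store(self, patient_id, item, value):
        """Data-driven: storing a result fires every rule for that item."""
        for rule in self.data_rules[item]:
            msg = rule(patient_id, value)
            if msg:
                self.messages.append(msg)

    def tick(self, hour, minute):
        """Time-driven: a clock calls this; scheduled rules fire on a match."""
        for h, m, rule in self.time_rules:
            if (h, m) == (hour, minute):
                self.messages.append(rule())
```

As a usage illustration, a rule registered with `on_data("PO2", ...)` would run each time a PO2 result is stored, while a rule registered with `at_time(6, 0, ...)` would run only when the clock calls `tick(6, 0)`, mirroring the daily APACHE recalculation described above.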

In the past, commercially available EMR/ 
ICU systems did not have convenient methods 
for programming and execution of computer- 
ized decision support rules. However, recent 
surveys by Sittig and Wright have shown
that more and more commercial vendor sys- 
tems have improved capability for providing 
clinical computerized decision support (Sittig 
et al. 2011; Wright et al. 2011). Once comput- 
erized decisions are made, they must be used 
to notify clinicians so that the feedback can be 
used to more effectively care for patients. The 
most common notification method is presen- 
tation on the computer screen when a clini- 
cian is interacting with the computer in some 
task such as order entry or charting. However, 
the issues of how to notify and who to notify 
are much more challenging (Tate et al. 1995; 
Shabot 1995). Further, verifying that such 
feedback results in the appropriate care is 
becoming ever more important. 

Research continues on identifying the most 
efficient and effective notification methods. 
Just as with the false alarms generated by bed-
side monitors, alarm feedback from computer
systems must present timely and accurate rec- 
ommendations with a minimum number of 
false alarms. 


21.7.1 Laboratory Alerts 


During the developmental period of the 
HELP system in the 1980s, it became appar- 
ent that on occasion life-threatening labo- 
ratory results were not being acted upon 
promptly. On acute care nursing floors, the 
initial average alert response time ranged from 5.1
to 58.2 hours (Bradshaw et al. 1989). By post- 
ing alerts on computer terminals on nursing 
floors, the average response time was reduced 
to 3.6 hours. Then a flashing light, similar to 
those found on road maintenance vehicles, 
was installed on each nursing floor. The aver- 
age response time then decreased to 6 min- 
utes but the light was very annoying to the 
nursing staff (Bradshaw et al. 1989). When a
sophisticated nurse paging system was set up
that paged the particular nurse caring for the
patient with the laboratory alert and required
nurses to acknowledge the alerts, the new pager
system was equally effective and less annoying
to other patients and staff (Tate et al. 1995).
Similar work was done by Shabot (1995) at
Cedars-Sinai Hospital in Los Angeles using
a Blackberry pager. Since that time, wireless
communications technology has improved
dramatically and a variety of even better feed-
back mechanisms are now available. However,
in a study by Harrison et al. (2017), the alert
acknowledgement rate from the severe sepsis
alert system was significantly better with a
traditional paging system.


21.7.2 Ventilator Weaning 
Management and Alarm 
System 


Weaning patients from ventilators was one 
of the first applications of a computerized 
expert system to routine patient care at LDS 
Hospital. As a result of the nurse and
respiratory therapist charting described earlier, it was
possible to develop and test computerized ven- 
tilator management protocols. Patient therapy 
was controlled by protocol 95% of the time 
and 90% of the protocol instructions were 
followed by clinicians. Several of the comput- 
erized instructions not followed were due to 
ventilator charting errors. Patients cared for
with the computerized protocol required
less positive pressure in the ventilator system,
and physiologic measures were disturbed less.
The investigators concluded that such proto- 
cols could make the ventilator weaning pro- 
cesses “less mystifying, simpler, and more 
systematic” (East et al. 1992). Since that early 
work, several other investigators have imple- 
mented similar ventilator weaning algorithms. 
In the process of implementing automated 
charting of ventilator parameters at LDS 
Hospital (Vawdrey et al. 2007), it became 
clear that critical ventilator alarms were being 
missed. As discussed earlier, alarm sounds 
emitted from ventilators were blended with 
bedside monitor alarm sounds. As a conse- 
quence, when a patient became disconnected 
from a ventilator the alarms could be missed 
(Evans et al. 2005). Once this situation was
recognized, an enhanced notification system
was implemented. Figure 21.17 illustrates a
ventilator disconnect alarm presented on the
patient’s bedside display and on every other
computer display in the same ICU. The new alarm
system has proven effective and well accepted by
users; it has enhanced patient safety and allowed
documentation of this important clinical event.


Fig. 21.17 Ventilator disconnect alarm. This alert is
for the patient in Room E645 but it is displayed on every
computer screen in the intensive care unit


21.7.3 Adverse Drug Event
Detection and Prevention


Detection and prevention of adverse drug
events (ADEs) has been a long-term goal of
caregivers, the World Health Organization,
and the U.S. Food and Drug Administration
(Classen et al. 1991). Physicians, pharmacists,
and informatics specialists at LDS Hospital
developed a computer-based ADE monitor
that detected a variety of triggers in the EMR
that could indicate potential ADEs, such as
sudden medication stop orders, medication
antidote ordering, and specific abnormal
laboratory and physiologic results. Pharmacists
followed up on each ADE alert, and each was
verified and categorized. During an 18-month
period, 36,653 hospitalized patients were
monitored and 731 true ADEs occurred in
648 patients; 701 were classified as moderate
or severe. Only 92 of the ADEs were identified
by traditional voluntary reporting methods.
Using this knowledge, the investigators
developed methods for preventing ADEs. An
example is the nurse charting work of Nelson
et al. (2005).

Classen and colleagues (1997) followed
up on their earlier surveillance system for ADEs.
They found that the attributable length of
stay and costs of hospitalization for ADEs
were substantial. If a patient had an ADE,
there was an increased length of stay of
1.74 days, an increased cost of $2,013, and
an increased risk of death of 1.88 (Classen
et al. 1997).

Even with the enhanced computerized
methods for detecting, preventing, and
monitoring ADEs, there is still room for
improvement (Petratos et al. 2010). A study of over
200,000 medication alerts in an electronic
prescribing system found that more than 90% of
drug alerts were overridden by physicians
(Isaac et al. 2009). Critically ill patients are
particularly susceptible to ADEs due to their
unstable physiology, complex therapeutic
medications, and the large percentage of IV
medications (Hassan et al. 2010). Better systems
must be developed and implemented to
prevent ADEs.


21.7.4 IV Pump and Medications
Monitoring


IV medication administration occurs in 90%
of hospitalized patients; virtually every ICU 
patient is connected to an IV pump to receive 
fluids, nutrients, and medications. Although 
so-called smart pumps have been developed 
to minimize errors, those pumps are not yet 
integrated with the EMR and, as a result, are 
not capable of helping to prevent IV admin- 
istration errors. Evans and associates (2010) 
at LDS Hospital have used cabled or wireless 
IV pumps integrated with the HELP system 
to enhance notification of IV pump program- 
ming errors. The medication charting sys- 
tem can detect and provide real-time alerts 
whenever an initial or potential pump rate 
programming error occurs. A set of 23 high-risk
medications is monitored by the HELP
system. Whenever the IV pump flow rate for one
of these medications is outside the acceptable
range, a visual alert such as that shown in
Fig. 21.18 is presented on the bedside display
and on all other computer displays in the
same ICU. Over a 2-year period, they found
that there were alerts on 4% of the initial or
dose rate changes, or about 1.4 alerts per day.
Of those alerts, 14% were found to have
prevented potential patient harm.


Fig. 21.18 Intravenous pump alert. This alert is for
pump 305 located in Room E601, but it is displayed on
every computer screen in the intensive care unit

Clearly the monitoring and alerting sys- 
tem for ICU patients involves quite a different 
process and strategy than the usual bedside 
monitoring alarms. However, by having the 
integrated clinical record and the computer- 
ized decision support system available, these 
investigators have made major advances in 
minimizing ADEs and providing higher qual- 
ity patient care. 
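
The flow-rate range check described for the monitored IV medications can be sketched as follows. This is only an illustration under stated assumptions: the medication names, the numeric limits, and the `check_pump_rate` function are hypothetical placeholders, not the rules used in the HELP system and not clinical guidance.

```python
from typing import Optional

# Hypothetical acceptable flow-rate ranges (low, high) for monitored medications.
# Names and numbers are placeholders for illustration, not clinical guidance.
ACCEPTABLE_RATE_RANGES = {
    "drug_a": (1.0, 10.0),
    "drug_b": (0.5, 4.0),
}


def check_pump_rate(medication: str, rate: float) -> Optional[str]:
    """Return an alert message if the programmed rate is out of range, else None.

    Medications that are not on the monitored list never generate an alert.
    """
    limits = ACCEPTABLE_RATE_RANGES.get(medication)
    if limits is None:
        return None
    low, high = limits
    if low <= rate <= high:
        return None
    return f"ALERT: {medication} rate {rate} outside acceptable range {low}-{high}"
```

A check like this is only possible when the pump reports its programmed rate to the charting system, which is why integration with the clinical record matters in the passage above.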


21.8 Remote Monitoring 
and Tele-ICU 


Tele-ICU is defined as the provision of care 
to critically ill patients by health care profes- 
sionals located remotely. Tele-ICU clinicians 
use audio, video, and electronic links to assist 
the bedside caregivers in monitoring patients 
to help provide best practice and to help with 
the execution of optimized patient care plans. 
These types of systems have the potential of 
improving patient outcomes by having shorter 
response times to bedside monitor alarms 
and to abnormal laboratory values, initiating 
life-saving therapies, providing best practice 
more frequently, and providing expertise to 
smaller or remote ICUs where subspecialists 
are not readily available (Lilly et al. 2011). 
Historically, tele-ICU concepts date back to 
the mid-1980s, but it was not until the early 
2000s that there was a dramatic increase in the 
use of such systems (Breslow 2007). 
Tele-ICU has built on the concepts of com- 
puterized patient monitoring discussed earlier 
in this chapter. The real-time EMR is
fundamental to making tele-ICU care practical.
The clinical information system is one of the
keys to allowing clinicians not physically present
in the ICU to suggest appropriate
care. Enhanced bedside data acquisition and
alarm systems, as well as clinical decision sup- 
port systems (such as those described above) 
are required if remote clinicians are to pro- 
vide practical and effective care for patients 
located in multiple remote ICUs (Rosenfeld
et al. 2000; Celi et al. 2001; Breslow 2007;
Lilly et al. 2011). Table 21.3 gives an
overview of the differences between a typical
ICU with no electronic record compared with
a tele-ICU.


Table 21.3 Comparison of typical ICU care
processes with Tele-ICU care processes

Typical no EMR ICU        Tele-ICU

Bedside monitor alarms    Physiologic trend alerts
                          Abnormal laboratory value alerts
                          Review of response to alerts
                          Off-site team rounds

Daily goal sheet          Electronic detection of nonadherence
                          Real-time auditing
                          Nurse manager audits
                          Team audits

Telephone case review     Workstation review initiated by
initiated by house staff  intensivists including EMR, imaging
or affiliate              studies, interactive audio and video
practitioner              of patient, integrated with nurse and
                          respiratory therapist, and assessment
                          of responses to therapy

Adapted from Lilly et al. (2011)
Abbreviations: EMR electronic medical record,
ICU intensive care unit

Recent findings of the impact of tele- 
ICU are encouraging and exciting. Patients 
receiving such care have lower hospital and 
ICU mortality and shorter hospital and ICU 
lengths of stay. Measures of adherence to 
best care practices are increased and compli- 
cation rates are decreased (Lilly et al. 2011). 
However, the investigators pointed out that 
they had to implement major process and cul- 
ture changes in their reengineering activities to 
make their system work (Lilly et al. 2011). An
editorial accompanying the Lilly article
outlines challenges still to be studied and
understood about tele-ICU (Kahn 2011a). Since
many changes were made from the typical
ICU to the tele-ICU intervention, simply adding
better electronic data recording, electronic
physiologic surveillance, and computerized
decision support may have provided the same
benefit, independent of the telemedicine
feature. Informatics specialists clearly have
exciting opportunities to improve care of critically
ill patients and answer important process and
intervention questions.


21.9 Predictive Alarms 
and Syndrome Surveillance 


One key factor in medical errors is
information overload (Kohn et al. 2000). False
alerts from electronic and monitoring systems
continue to contribute to the problem. Ideal
alerts should only be issued when events are 
clinically significant, undetected by provid- 
ers, and there is opportunity for correction 
of the underlying problem (actionable alert) 
(Norris and Dawant 2001). There are a num- 
ber of approaches to reduce false alerts. 
Machine learning and surveillance methods
have been developed to assist clinician
decision makers in complex care
situations (Lee and Mark 2010). The work of Lee
and colleagues at Harvard/Massachusetts 
Institute of Technology presents a methodol- 
ogy that has great promise (Lee et al. 2010). 
These investigators used machine learning 
to see if they could use pattern recognition 
approaches to predict impending hypoten- 
sion in ICU patients. Using the high-reso- 
lution vital sign trends from the MIMIC II 
(Multiparameter Intelligent Monitoring in 
Intensive Care) Database, they trained their 
system to predict impending hypotension. 
Although the results were not perfect, they 
were able to identify patients at higher risk 
for developing hypotensive episodes within 
the subsequent 2 hours, thus alerting busy 
clinicians to be vigilant to impending events. 
Intelligent rule-based alerts, or “sniffers,”
developed at Mayo Clinic by Herasevich and his
associates (2009, 2010, 2011), used the near
real-time Multidisciplinary Epidemiology and
Translational Research in Intensive Care
Data Mart to detect high-risk syndromes
in patients and to alert clinicians if therapy
had not yet been started. These investigators
have provided excellent recommendations 
for development and use of large databases 
to allow better understanding of the com- 
plexities of patients who are critically ill. 
Advances in machine learning techniques
have shown promising results in predicting
life-threatening situations (Evans et al. 2015) and
death (Johnson and Mark 2018).
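
The core logic of a rule-based sniffer, as described above, is to alert only when a syndrome is detected and therapy has not yet been started. A minimal sketch follows; the `PatientSnapshot` fields and the `sniffer_alerts` function are hypothetical names chosen for illustration, and the syndrome criteria themselves are assumed to have been evaluated elsewhere.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PatientSnapshot:
    """Near real-time summary of one patient (hypothetical structure)."""
    patient_id: str
    syndrome_criteria_met: bool   # result of the syndrome detection rules
    therapy_started: bool         # whether the indicated therapy is already ordered


def sniffer_alerts(snapshots: Iterable[PatientSnapshot]) -> List[str]:
    """Return IDs of patients who meet syndrome criteria but have no therapy yet.

    Patients already on therapy are skipped, which keeps the alert actionable
    and avoids notifying clinicians about problems they have already addressed.
    """
    return [
        s.patient_id
        for s in snapshots
        if s.syndrome_criteria_met and not s.therapy_started
    ]
```

Suppressing alerts for patients already being treated is one simple way to meet the "undetected by providers" condition for an ideal alert given earlier in this section.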


21.10 Opportunities for Future 
Development 


Throughout this chapter, we have discussed 
many challenges and opportunities that 
remain in the field of patient monitoring sys- 
tems. There are still important possibilities in 
the development of better and more effective 
bedside monitoring systems, especially in the 
area of maximizing true alarms and mini- 
mizing false alarms. Integrating clinical data 
from a broad variety of hospital and personal 
records is still challenging and important. 
Being able to apply computerized decision 
support systems to warn of life-threatening 
situations or advise care givers about opti- 
mum patient treatment strategies is still a rela- 
tively new aspect of health care. Development 
of patient care protocols and then having 
them be executable by computers, especially 
for ICU patients, is also an exciting field of 
endeavor. 

Since the early 1950s, when physicians 
began to understand control system theory, 
there has been a fascination with having con- 
trol systems that closed the loop without the 
need for any human intervention. Implantable
defibrillators and pacemakers are examples
of closed-loop devices. A 1957 publication
introduced the idea of automating mechanical
ventilation (Saxton and Myers 1957).

We still believe that applying informatics 
in the ICU is a “contact and team sport,” that 
you must be involved at the patient care level 
and work with the incredibly talented clinical 
teams to maximize the benefits that biomedi- 
cal informatics specialists can provide. 
21.10.1 Value of Computerized ICU
Care Processes


Challenges and opportunities lie in proving 
the value of health information systems. There
have been dramatic improvements in the adoption
of EMRs in recent years (Gardner 2016).
A review by Chaudhry and associates
(2006) assessing the impact of health information
technology on the quality, efficiency, and
cost of medical care is illustrative of the
challenge. An even more sobering report was
presented by Karsh and associates (2010). These
investigators suggest that not only is the rate of
adoption of health information technology
low, but such technology may not improve
quality of care or reduce costs. Introducing a
new EMR system can lead to changes in workflow.
There are documented increases in revenue
after EMR implementation despite a
productivity loss (fewer patient visits) (Howley
et al. 2015). In their systematic review, “Impact of
the Electronic Medical Record on Mortality,
Length of Stay, and Cost in the Hospital and ICU,”
Thompson and colleagues (2015) did not find
that the EMR had a substantial effect on those
metrics. However, there was a significant effect of
surveillance systems on in-hospital mortality.


21.11 Clinical Control Tower 
and Population 
Management 


The Clinical Control Tower is a newly developed
central alert-screening clinical surveillance
system from Mayo Clinic. The concept
behind the Clinical Control Tower is to
serve as a centralized non-life-threatening
alert and prediction “cockpit.”

This unified screening system is managed 
by a designated capsule communicator or 
“CapCom,” analogous to the US National 
Aeronautics and Space Administration 
ground-based astronaut who maintains con- 
tact with astronauts during space missions. 
The CapCom in the healthcare context is the 
clinician responsible for screening incoming 
alerts and notifications. As no alerts have 100%
accuracy, it is essential to perform initial
validation of notifications before activating
specific workflows with bedside providers. When
the CapCom decides that an alert is valid, he 
or she communicates “down to the ground” to 
a bedside clinician and guides them through 
necessary and recommended tasks. Each step 
may be captured electronically in the control 
tower application. Workflow and actions are 
captured and analyzed using a feedback loop 
tool. Deviations from intended care processes 
may be identified. Control Tower is a tool 
designed to minimize errors and information 
overload in hospital practice. 
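
The screening and feedback steps described above can be sketched in Python. This is a hedged illustration only: `ControlTowerLog`, `screen`, and `find_deviations` are hypothetical names, and the validity check is passed in as a function because the actual validation is a clinician's judgment rather than code.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ControlTowerLog:
    """Records each screening decision so it can be reviewed later."""
    entries: List[Tuple[str, str]] = field(default_factory=list)

    def screen(self, alert: str, is_valid: Callable[[str], bool]) -> bool:
        """Validate an incoming alert and log whether it was forwarded."""
        valid = is_valid(alert)
        self.entries.append((alert, "forwarded" if valid else "dismissed"))
        return valid


def find_deviations(intended_steps: List[str], completed_steps: List[str]) -> List[str]:
    """List intended care-process steps that were not captured as completed."""
    done = set(completed_steps)
    return [step for step in intended_steps if step not in done]
```

Only alerts for which `screen` returns True would be communicated to the bedside clinician; comparing captured steps with the intended process is one simple form of the feedback loop mentioned above.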

As the COVID-19 pandemic hit the
US healthcare system, the Control Tower
platform was modified and expanded to
address the surveillance needs of hospitalized
COVID-19 patients (Fig. 21.19). The system
identifies the status of COVID lab tests,
COVID results, patient isolation information,
and the Modified Early Warning Score (MEWS).

The features required to facilitate reli- 
able monitoring and management of acutely 
ill patient populations differ significantly 
from those required to manage single patient 
encounters. These demands are not easily met 
by the most common commercially available 
comprehensive electronic medical records 
and often require complementary alternative 
approaches such as the one illustrated in the 
Control Tower example above. 


Suggested Readings

Clemmer, T. P. (2004). Computers in the ICU: 
Where we started and where we are now. 
Journal of Critical Care, 19, 204-207. PMID 
15648035. 

Gardner, R. M. (2009). Clinical decision support 
systems: The fascination with closed-loop 
control. IMIA Yearbook of Medical 
Informatics, 12-21. PMID 19855866. 

Greenes, R. A. (Ed.). (2007). Clinical decision 
support: The road ahead. Burlington: Elsevier 
Inc. 

Harrison, A. M., Park, J. G., & Herasevich, V. 
(2015). Septic shock electronic surveillance. In 
Septic shock: Risk factors, management and 
prognosis (pp. 1-25). New York: Nova 
Biomedical. 

Herasevich, V., Kor, D. J., Subramanian, A., &
Pickering, B. W. (2013). Connecting the dots:
Rule-based decision support systems in the
modern EMR era. Journal of Clinical
Monitoring and Computing, 27(4), 443-448.
PMID: 23456293.

Fig. 21.19 Control Tower for COVID-19

Kuperman, G. J., Gardner, R. M., & Pryor, T. A. 
(1991). HELP: A dynamic hospital informa- 
tion system. New York: Springer. 

Morris, A. H. (2001). Rational use of computer- 
ized protocols in the intensive care unit. 
Critical Care, 5, 249-254. PMID 11737899. 

Pickering, B. W., Litell, J. M., Herasevich, V., & 
Gajic, O. (2012). Clinical review: The hospital 
of the future - building intelligent environments 
to facilitate safe and effective acute care deliv- 
ery. Critical Care, 16(2), 220. PMID: 22546172. 

Shabot, M. M., & Gardner, R. M. (Eds.). (1994). 
Decision support systems in critical care. 
Boston: Springer. 


Questions for Discussion
1. Describe how the integration of
information from multiple bedside
monitoring signals, the pharmacy, and
clinical laboratory data can help
improve alarm systems used in an ICU.

2. How would you decide whether to buy a
standalone ICU patient monitoring
system versus an integrated EMR system?

3. How do care providers impact the
installation and optimization of real-time
data collection and real-time decision
support?

4. Perhaps real-time data collection and
computerized decision support are not
necessary. How would you assess these
issues? Is there sufficient literature to
validate or disprove your supposition? If
not, what is missing?

5. How would you go about selecting the
optimum data for monitoring and
improving the care of a critically ill
patient?

6. How would you optimize a patient
monitoring system that you were building or
buying to provide the most accurate,
timely, and helpful computerized
decision support capabilities? Be specific
and give literature references to support
your optimization plan.

7. If you were the Chief Clinical
Information Officer of a large hospital
without data from the ICU integrated
into your EMR system, what factors
would you have to consider to
implement such a system and to
apply computerized clinical decision
support to optimize such a system?
How long do you think it would take
to implement such a system?
Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• What are the key components needed
for Radiological Image interpretation?

• What are the roles of the Radiology
Information System (RIS), Picture
Archiving and Communication System
(PACS), Computer-Aided Diagnosis
(CAD), Vendor Neutral Archives
(VNAs), and Advanced Visualization
Systems (AVS) in a typical medical
imaging department?

• How does the DICOM standard differ
from HL-7 in the structure of its
information model?


22.1 Introduction 


In Chap. 10, we introduced the concept
of digital images as a fundamental data type 
that, because of its ubiquity, must be consid- 
ered in many applications. We furthermore 
defined biomedical imaging informatics as the 
study of methods for generating, manipulat- 
ing, managing, extracting, and representing 
imaging information. 

In this chapter, we continue the study
of imaging informatics, begun in Chap.
10, by describing many of the methods for
generating and manipulating images and
discussing the relationship of these methods to
structural informatics. We emphasize methods
for managing and integrating images,
focusing on how images are acquired from 
imaging equipment, stored, transmitted, 
and presented for interpretation. We also 
focus on how these processes and the image 
information are integrated with other clini- 
cal information and used in the health care 
enterprise, so as to have an optimal impact 
on patient care. 

We discuss these issues in the context of 
Radiology, since imaging is the primary focus
of that field.¹ Yet imaging is an important
part of many other fields as well, includ- 
ing Pathology, Hematology, Dermatology, 
Ophthalmology, Gastroenterology, Cardio- 
logy, Surgery (for minimally invasive proce- 
dures especially) and Obstetrics, which often 
do their own imaging procedures; most other 
fields that use imaging rely on Radiology and 
Pathology for their imaging needs. 

The distribution of imaging responsibil- 
ity has given rise to the need of many depart- 
ments to address issues of image acquisition, 
storage, transmission, and interpretation. As
these modalities have become increasingly
digital in form, electronic systems have been
needed to support these tasks.

We begin by describing some of the roles 
of imaging across all of biomedicine, then 
concentrate on image management and inte- 
gration in radiology systems, bringing in illus- 
trative examples from other disciplines where 
appropriate. Many Radiology departments 
are becoming highly distributed enterprises, 
with acquisition sites in intensive care units, 
regular patient floors, emergency departments, 
vascular services, screening centers, ambula- 
tory clinics, and in affiliated community-based 
practice settings. Interpretation of images may
be performed in those locations when dedicated onsite
radiologists are needed, such as for local
interventional procedures. Increasingly, however,
high-speed networks are enabling interpreta- 
tion at sites far from acquisition, either in a 
central location or in widely distributed loca- 
tions according to the different capabilities or 
time zones of the organization. This is possible
because image acquisition and interpretation
can be effectively decoupled. Independent
imaging centers in a community face some of
the same issues and opportunities, although to
a lesser degree, so we focus primarily on the
distributed medical center-based Radiology
department in this chapter.

¹ The name Radiology is itself a misnomer, since the
field is involved in using ultrasound, magnetic
resonance, optical, thermal, and other non-radiation
imaging modalities when appropriate. Radiology
departments in some institutions are thus referred
to alternatively as Departments of Medical Imaging
or Diagnostic Imaging.


22.2 Basic Concepts and Issues 


22.2.1 Roles for Imaging
in Biomedicine


Imaging is a central part of the healthcare pro- 
cess for diagnosis, treatment planning, image- 
guided interventions, assessment of response 
to treatment, and prediction of outcome. In 
addition, it plays important roles in medical 
communication and education, as well as in 
research. 


22.2.1.1 Detection and Diagnosis 
The primary uses of images are for the detec- 
tion of medical abnormalities, for diagnosing 
the nature of those abnormalities, and for 
planning and guiding therapeutic interven- 
tions. Detection focuses on identifying the 
presence of an abnormality, but in the case 
where the findings are not sufficiently spe- 
cific to be characteristic of a particular dis- 
ease, other information is required for actual 
diagnosis. This is the case, for example, with 
mammograms, which are often used to screen 
for breast cancer; once a suspicious lesion 
is detected, a biopsy procedure is usually 
required for diagnosis. In other circumstances, 
the image finding is diagnostic: for example, 
the finding of focal stenosis or obstruction 
of an artery during angiography. Most often 
there is a continuum between detection and 
diagnosis, with imaging detecting a lesion 
with some range of confidence, and suggesting
some possibilities, known as the differential
diagnosis (see Chap. 2).

Diagnosis and detection can be done with 
a wide variety of imaging procedures. Images 
produced by visible light can be used by oph- 
thalmologists for retinal photography, but also 
by dermatologists to view skin lesions, or by 
pathologists for light microscopy. The
visible-light spectrum is also responsible for producing
images seen endoscopically, typically captured
as video images or sequences (movies). Sound
energy, in the form of echoes from internal 
structures, is used to form images in ultra- 
sound, a modality used heavily in cardiac, 
abdominal, pelvic, breast, thyroid, testes, and 
obstetrical imaging. In addition, Doppler shifts 
of sound frequency can measure blood flow 
velocity in both arteries and veins and newer 
microvascular imaging methods can measure 
perfusion in capillary beds. X-ray energy pro- 
duces radiographic and computed-tomography 
(CT) images of most parts of the body: the dif- 
ferential absorption of X-rays by various tis- 
sues produces the varying densities that enable 
radiographic images to portray normal and 
abnormal structures. More recent techniques 
that separate the various energies of the X-ray 
beam allow for more precise characterization of 
tissue composition, a simple form being known 
as dual-energy CT, and more advanced forms 
known as spectral CT. Emission of radioactive
particles by isotopes that are incorporated into
various types of molecules is used to produce
nuclear-medicine images, which reflect the
differential concentration of those molecules in
various tissues. Magnetic-resonance imaging
(MRI) depicts energy fluctuations of certain 
atomic nuclei—usually hydrogen—when they 
are aligned in a magnetic field and then per- 
turbed by a radiofrequency pulse. Parameters 
such as proton density, the rate at which the 
nuclei return to alignment (T1), the rate of loss 
of phase coherence after the pulse (T2), diffu- 
sion of water, and even the concentration of 
certain chemicals (MR Spectroscopy) can be 
measured. These quantities differ in various 
tissues under normal conditions, with more 
variations due to disease, thus enabling MRI to 
distinguish among them. Figure 22.1 shows
some example images. It is also possible to use
MRI methods to accentuate and measure the
flow of fluids like blood or CSF; when applied
to blood vessels, this is known as
Magnetic Resonance Angiography (MRA).
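
The Doppler measurement of blood flow velocity mentioned above rests on the standard Doppler equation for reflected ultrasound. The worked example below is illustrative only; the transmit frequency, insonation angle, and measured shift are assumed values, and 1540 m/s is the conventional average speed of sound in soft tissue.

```latex
% Doppler shift for ultrasound reflected from blood moving at speed v,
% with transmit frequency f_0, sound speed c, and beam-to-flow angle \theta:
\Delta f = \frac{2 f_0 \, v \cos\theta}{c}
\qquad\Longrightarrow\qquad
v = \frac{c \,\Delta f}{2 f_0 \cos\theta}

% Assumed example values: f_0 = 5\,\mathrm{MHz},\; c = 1540\,\mathrm{m/s},\;
% \theta = 60^\circ \ (\cos\theta = 0.5),\; \Delta f = 1.3\,\mathrm{kHz}
v = \frac{1540 \times 1300}{2 \times 5\times 10^{6} \times 0.5}
  \approx 0.40\ \mathrm{m/s}
```

The dependence on the cosine of the angle is why velocity estimates become unreliable as the beam approaches perpendicular to the direction of flow.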


Fig. 22.1 Examples of the types of images discussed in the text. (a) is a microscopic image of tissue stained with
hematoxylin and eosin; (b) is an ultrasound image of the thyroid; (c) is a contrast-enhanced CT image of the
abdomen; (d) is contrast-enhanced MRI of the brain; (e) is an MR angiogram of the cervical vessels; (f) is an FDG
PET-CT image of the upper body; (g) is a photograph of an eye (Dermatopath, US, CDUS, CT, MRI, MRI-DWI)


22.2.1.2 Assessment and Planning

In addition to being used for detection and
diagnosis, imaging is often used to assess a
patient’s health status in terms of progression
of a disease process (such as determination of
tumor stage), response to treatment, and
estimation of prognosis. One can analyze cardiac
status by assessing the heart’s size and motion 
echocardiographically. Similarly, one can use 
ultrasound to assess fetal size and growth. 
Computed tomography is used frequently to 
determine approaches for surgery or for radia- 
tion therapy. In the latter case, precise calcula- 
tions of radiation-beam configuration can be 
optimized to maximize dose to the tumor while 
minimizing absorption of radiation by sur- 
rounding tissues. This calculation is often per- 
formed by simulating multiple radiation-beam 
configurations and iterating to a best treatment 
plan. For surgical planning, three-dimensional 
volumes of CT or MRI data can be constructed 
and presented for viewing from different per- 
spectives to facilitate determination of the most 
appropriate surgical approach. More recently,
the creation of 3D-printed models
has become very popular for surgical planning
and patient education. In some cases, the
printed object is even used to guide incisions or 
implanted as a scaffold for tissue ingrowth. 
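
The idea of simulating multiple radiation-beam configurations and keeping the best one can be sketched as a simple search. This is a toy illustration under stated assumptions, not a treatment-planning algorithm: the scoring rule (tumor dose minus weighted surrounding-tissue dose), the `best_plan` function, and the example dose table are all hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Config = TypeVar("Config")


def best_plan(
    candidates: Iterable[Config],
    simulate: Callable[[Config], Tuple[float, float]],
    tissue_weight: float = 1.0,
) -> Tuple[Optional[Config], float]:
    """Simulate each candidate configuration and keep the highest-scoring one.

    `simulate` returns (tumor_dose, surrounding_tissue_dose) for a configuration.
    The score rewards tumor dose and penalizes dose to surrounding tissue.
    """
    best_config, best_score = None, float("-inf")
    for config in candidates:
        tumor_dose, tissue_dose = simulate(config)
        score = tumor_dose - tissue_weight * tissue_dose
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Toy stand-in for a dose simulation, keyed by beam angle in degrees (made-up numbers).
TOY_DOSES = {0: (60.0, 30.0), 45: (58.0, 12.0), 90: (50.0, 20.0)}


def toy_simulate(angle: int) -> Tuple[float, float]:
    return TOY_DOSES[angle]
```

With the toy table, `best_plan([0, 45, 90], toy_simulate)` selects the 45-degree configuration, because it spares surrounding tissue at a small cost in tumor dose. Real planning systems search a far larger space with physically accurate dose models.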


22.2.1.3 Image-Guided Procedures 
Images can provide real-time guidance when
virtual-reality (VR) images are superimposed
on a surgeon’s visual perspective, in the
projection that demonstrates the abnormality,
a technique known as augmented reality (AR). With
endoscopic and minimally invasive surgery, 
this kind of imaging can provide a localiz- 
ing context for visualizing and orienting the 
endoscopic findings, and can enable monitor- 
ing of results of interventions such as focused 
ultrasound, cryosurgery, or thermal ablation. 
It is also possible to use intra-operative imag- 
ing to update the position and appearance of 
pre-operative imaging used for procedural 
planning. Figure 22.2 shows an example of
a CT-guided biopsy of a lesion in the neck. 
High quality imaging allows precise targeting 
of small targets such as diseased lymph nodes 
with little risk of damaging important nearby 
structures like the aorta or carotid artery. 
Improvements in robotics technology and 
wide-area network capability have enabled 
minimally invasive procedures to be conducted 
at a distance (see Chap. 20), although it is
still practical to do so only in limited settings.
Because the abnormality is viewed through a
video display, the image source can be physically
remote, a technique called telepresence.
Similarly, the manipulation of the endoscope
itself can be controlled by a robotic device
that reproduces the hand movements of a
remote operator, and can provide haptic feedback
reproducing the sensations of tissue
textures, margins, and resistance. This technology
is not too different from the robotic
surgery methods that have become quite common
today, though the practical limits noted
above have restricted its use.


Fig. 22.2 Example of a CT-guided biopsy of a
lesion in the neck. High quality imaging allows precise
targeting of small targets even near important structures
like the carotid artery


22.2.1.4 Communication 

Medical decision-making, including diagno- 
sis and treatment planning, is often aided by 
allowing clinicians to visualize images concur- 
rently with textual reports and discussions of 
interpretations. Thus, we consider imaging 
to be an important adjunct to communica- 
tion and images to be a desirable component 
of a multimedia electronic medical record. 
Because medical imaging is an essential ele- 
ment of the practice of medicine, support 
for transmission and remote image viewing 
is also a critical component of telemedicine
(Chap. 20). Medical images can also be
helpful in doctor-patient communication, to
enable the provider to illustrate an abnormality
or explain a surgical procedure to a patient
(Chap. 11).


22.2.1.5 Education and Training 
Images, whether 2D, 3D (either 3 spatial 
dimensions or 2D plus time), or 4D (3 spa- 
tial dimensions plus time) are an essential part 
of medical education and training because 
so much of medical diagnosis and treatment 
depends on imaging and on the skills needed 
to interpret such images (see Chap. 24).
Case libraries, tutorials, atlases, quiz libraries, 
and other resources using images can provide 
this kind of educational support. Three- 
dimensional printed models of both normal 
structures and various pathologic conditions 
are now being routinely created for patient 
and physician education. The ability to hold
the structure in one’s hand and manipulate
it has proven very valuable, particularly
in unusual cases such as congenital
deformities.

Taking a history, performing a physical 
examination, and conducting medical proce- 
dures also demand appropriate visualization 
and observation skills. Training in these skills 
can be augmented by viewing images and 
video sequences, as well as through practice 
in simulated situations. An example of the lat- 
ter is an approach to training individuals in 
endoscopy techniques by using a mannequin 
and video images in conjunction with tactile 
and visual feedback that correlate with the 
manipulations being carried out. 

As noted in the previous section, patients
increasingly expect to understand more
about their disease, and communication with
patients can be made more effective by including
relevant images. Imaging also has a consumer/
patient education benefit: appropriate images
can accompany the instructions and educational
materials given to patients, whether these
concern their disease, the procedures to be
performed, required follow-up care, or
healthy lifestyles.


Fig. 22.3 Example of a detailed segmentation of the
brain into various anatomic structures by the FreeSurfer 
package. It uses a combination of image intensities and 
expected shapes for the brain and substructures to pro- 
duce its output 


22.2.1.6 Research 

Imaging is also a critical component of many
aspects of research. An example is structural
modeling of DNA and proteins, including
their 3D and 4D configurations (see Chap.
9). Images obtained in molecular or cellular
biology can show the distributions of fluorescent
or radioactively tagged molecules
through time or space. The study of morphometrics,
which is literally the measurement
of shape, depends on the use of imaging
methods. Figure 22.3 shows an example
of a detailed segmentation of the brain into
various anatomic structures by the FreeSurfer
package.² It uses a combination of image
intensities and expected shapes for the brain
and substructures to produce its output.
Functional mapping (for example, of the
human brain) relates specific sites on images
to particular functions. While such quantitative
imaging efforts often begin in the laboratory,
translation of such quantitative methods
is increasingly important to the practice of
medicine. Figure 22.4 provides an example
of functional mapping of a patient with
a brain tumor, where functional mapping is
used to identify critical structures, and thus to
guide surgical therapy.


² http://surfer.nmr.mgh.harvard.edu/ (accessed 4/27/2018).

Fig. 22.4 Example of functional mapping of a patient with a brain tumor, where functional mapping is used to
identify critical structures, and thus guide surgical therapy


22.2.2 The Radiologic Process and its Interactions


As noted in the introduction, we concentrate in 
this chapter on the subset of imaging that falls 
under the purview of Radiology. Radiology 
departments are engaged in all aspects of the 
healthcare process, from detection and diag- 
nosis to treatment, follow-up and prognosis 
assessment. Radiology also illustrates well the
many issues involved in acquiring and managing
images, interpreting them, and communicating
those interpretations. Space does not
permit us to discuss the other disciplines that
utilize imaging, but the processes and issues
that we discuss in the context of radiology
pertain to those disciplines as well. Additional
examples are provided in Chap. 10. Occasionally,
we intersperse examples from other areas where
we wish to emphasize a particular point. Imaging
for educational purposes is discussed at length in
Chap. 25.

The primary function of a Radiology
department is the acquisition, analysis, and
interpretation of medical images, but it also
increasingly includes the conduct of minimally
invasive image-guided procedures, an area usually
referred to as Interventional Radiology.
Through imaging, healthcare personnel 
obtain information that can help them to 
establish diagnoses, to plan or administer 
therapy, and to follow the courses of diseases 
or therapies. 

Diagnostic studies in the Radiology 
department are typically performed at the 
request of referring clinicians, who then use 
the information for subsequent decision- 
making. The Radiology department produces 
the images, and the radiologist provides the 
primary analysis and interpretation of the 
radiologic findings. Thus, radiologists play 
a direct role in clinical problem-solving and 
in diagnostic-work-up planning for many 
patients. Interventional radiology and image- 
guided surgery (if done by the radiologist) are 
activities in which the radiologist plays a pri- 
mary role in treatment. 

Fig. 22.5 The radiologic interpretation process. [Figure labels: Assess clinical
problem; Request & schedule exam; Perform exam; Analyze & interpret findings;
Communicate results & recommendations; Assess quality/monitor performance;
Educate/train, provide feedback]

The complete radiologic process (Greenes
1989) is characterized by seven kinds of tasks,
each of which involves information exchange,
which may be augmented by information
technology, as illustrated in Fig. 22.5. The
first five tasks occur in sequence, whereas the
final two are done in parallel and are ongoing
and support the other five.

1. The process begins with an evaluation by a
clinician of a clinical problem and determination
of the need for an imaging procedure.
Decision-support tools (Chap. 24)
are commonly used to help determine whether
testing should be performed and, if so, what type.

2. The procedure is requested and scheduled,
the indication for the procedure is stated,
and relevant clinical history is made
available.

3. The imaging procedure is carried out, and
images are acquired. An important step
for many types of examinations is ‘protocoling’,
in which the radiologist specifies the
precise way that the images are acquired:
for example, whether oral or IV contrast is
administered, the imaging plane(s), the slice
thickness, and the contrast properties.

4. The radiologist reviews the images in the
context of the clinical history and indications
for the examination and may measure
structures in the images, segment
components of the image (e.g. measure
volumes such as the left ventricle), or
manipulate the images (e.g. create 3D renderings
or perform processing such as conversion
of a series of images into a new
parametric image like a blood volume
image). This task actually involves two
interrelated subtasks: (a) detection of the relevant
findings and (b) interpretation of
those findings in terms of clinical meaning
and significance.

5. The radiologist creates a report and may
also directly communicate the results to
the referring clinician, as well as making
suggestions for further evaluation as
needed. In the past, the report was free text, but
there is increasing use of templates that
result in a consistent pattern in the report,
or structured reports in which the textual
report also exists in a form with codes for
all concepts in the textual report. The
annotation and markup of images can be
very helpful in communicating locations
of findings and serve as helpful landmarks
for subsequent exams, and for surgical
or radiation procedures.

6. Quality control and monitoring are carried
out throughout the process, with the
aim of improving the foregoing processes.
Factors such as patient waiting times,
workloads, numbers of exposures obtained
per procedure, quality of images (such as
ones degraded by patient motion or incorrect
acquisition parameters), radiation
dose, yields of procedures, incidence of
complications, and quality of reports are
measured, reported, and adjusted to optimize
individual and overall quality.

7. Continuing education and training are carried
out through a variety of methods,
including access to atlases, review materials,
teaching-file cases, and feedback of subsequently
confirmed diagnoses to interpreting
radiologists. Peer review of previously
reported cases is now a common expectation
or requirement in most radiology practices,
as well as of medical board agencies like
the American Board of Radiology.


All these tasks are now, in a growing number of 
departments, computer-assisted or automated, 
and most of them involve images in some way. 
In fact, radiology is one branch of medicine 
in which even the basic data are usually pro- 
duced by computers and stored directly in 
computer memory. Radiology has also contrib- 
uted strongly to advances in computer-aided 
instruction (see Chap. 25), in technology
assessment (see Chap. 13), and in clinical
decision support (see Chap. 24). Speech recognition
is commonly used for report creation.


22.2.3 Electronic Imaging Systems 


22.2.3.1 Image Acquisition 


The first radiographs used an integrated
detection, recording, and display system:
the glass plate (and later, plastic film)
served to detect the X-ray photons, to
record them in a permanent form, and
to display the data (with the aid of a light
box). This integrated arrangement existed
for about a century. Today, most radiographs
are either (a) recorded in a latent form (i.e.,
they are not directly visible, but are acquired
as an electronic signal on a charged plate)
that a ‘reader’ then scans to create
a digital image (known as computed radiography
or CR), or (b) acquired by converting the
photons directly to digital images (known as
digital radiography or DR). The digital image can
then be transmitted and stored like any digital
data, using conventional networks and storage
systems. The matrix size of the images is
variable, ranging from as low as 64 × 64 for
some nuclear medicine images up to 5000 ×
4000 picture elements (pixels) for mammograms.
The size of typical radiology images
and examinations is shown in Table 22.1.
CT, MR, US, NM, and PET all use computers
to convert the acquired raw signal to a digital
image, and thus they too are direct or
nearly direct digital modalities.
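
The exam sizes listed in Table 22.1 can be reproduced with simple arithmetic. The sketch below is illustrative only: it assumes 2 bytes of storage per pixel and 1 MB = 2^20 bytes, neither of which is stated in the text, although both are consistent with the table's rounded values.

```python
# Uncompressed exam size = pixels/image x images/exam x bytes/pixel.
BYTES_PER_PIXEL = 2          # assumption: 16-bit storage for every pixel
BYTES_PER_MB = 1024 * 1024   # assumption: 1 MB = 2**20 bytes

# (pixels per image, images per exam), copied from Table 22.1
TABLE_22_1 = {
    "CR/DR": (5_000_000, 3),
    "CT": (262_144, 500),
    "MRI": (65_536, 500),
    "US": (262_144, 50),
    "Mammography": (20_000_000, 4),
    "Interventional/fluoro": (1_048_576, 50),
    "Nuclear medicine": (16_384, 25),
}


def exam_size_mb(pixels_per_image, images_per_exam,
                 bytes_per_pixel=BYTES_PER_PIXEL):
    """Return the uncompressed size of one examination in MB."""
    total_bytes = pixels_per_image * images_per_exam * bytes_per_pixel
    return total_bytes / BYTES_PER_MB


for modality, (pixels, images) in TABLE_22_1.items():
    print(f"{modality:22s} {exam_size_mb(pixels, images):7.1f} MB")
```

Under these assumptions a 500-image CT examination of 512 × 512 images (262,144 pixels each) comes to exactly 250 MB, matching the table.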


22.2.3.2 DICOM 


The first medical devices to produce digital
images routinely were CT scanners, and soon
after, MRI scanners. The availability of digital
data that represented a three-dimensional
image stimulated the field of medical image
processing and 3D rendering. An early challenge
to such investigations was that the medical
device vendors used half-inch tape media
for storing the data, but each vendor (and
usually each model of scanner) had its own
format. Such formats were proprietary, and
required each investigator to reverse engineer
the format of the tape just to gain access to
the data. Although computer networks were
used in hospitals at that time, few if any scanners
supported network connections.

Table 22.1 Typical sizes for radiology examinations

Modality*               Image size (pixels/image)   Images/exam   Exam size (MB)
CR/DR                   5,000,000                   3             29
CT                      262,144                     500           250
MRI                     65,536                      500           63
US                      262,144                     50            25
Mammography             20,000,000                  4             153
Interventional/fluoro   1,048,576                   50            100
Nuclear medicine        16,384                      25            1

Note that there is variability in image size and images per examination, and these numbers should be viewed
as very rough estimates. Furthermore, there is a strong trend for both increased image resolution (increasing
image size) and more images per examination since the emergence of digital imaging

*CR computed radiography, DR digital radiography, CT computed tomography, MRI magnetic resonance
imaging, US ultrasound

The need to write all data to tape, and then
read it into a different computer using software
unique to each scanner, resulted in significant
unnecessary effort. The need to exchange
images efficiently demanded that they be
represented in a standard fashion. This need
was recognized by the American College of
Radiology (ACR) and the National Electrical
Manufacturers Association (NEMA), and
led to the development of the ACR/NEMA
standard for medical images in 1985. As other
imaging devices started to produce digital
images, and as the information about the
images became richer, the second version was
published in 1989. That standard both defined
a model for describing the data and prescribed
a special connector for transferring
image data between devices. This was demonstrated
at the 1990 Radiological Society
of North America (RSNA) conference. Soon
after, TCP/IP became a widely accepted network
standard, and while the ACR/NEMA
standard did not describe a method for transferring
data over TCP/IP, investigators fairly
quickly implemented this, and it worked well.
Continued improvements in the information
model, as well as extension to medical specialties
other than radiology, and standards
for storage on physical media like compact
disks demanded further revisions. The addition
of non-radiology images also demanded
a name change, and thus ACR/NEMA 3.0
was rebranded as DICOM, which stands for
Digital Imaging and Communications in Medicine.

To promote adoption, the RSNA commissioned
the creation of Central Test Node
(CTN) software, for demonstrating use of
the standard for transmitting images over a
local area network at RSNA 1992, followed
by increasingly sophisticated versions over the
next 2 years. The RSNA also made that software
freely available to the public as a model
for understanding the standard and for the design
of utilities and tools by developers. During
the mid-1990s, the RSNA annual meetings
hosted a major digital image interoperability
demonstration that became progressively
more sophisticated and demanding. RSNA
and its meetings accordingly facilitated demonstration
of the interconnection of vendor
products through the Internet, promoted
DICOM compatibility as a feature that could
be visualized at participating vendor exhibits,
and created a model Request For Proposals
(RFP) for radiology practices and hospitals
to use to craft a DICOM requirement as part
of the procurement of imaging systems. Later
this interoperability testing was done separately
from the RSNA annual meeting, and became
known as the ‘Connectathon’. These efforts
turned out to be extremely successful in transforming
the marketplace from one that was
dominated by proprietary formats to one that
was standards-based and interoperable.

DICOM continues to be updated and 
improved through an international committee 
process. While it is hard for any standard to 
be both widely accepted and perfectly up-to- 
date, the DICOM governance has done a 
remarkable job of adapting to rapid advances 
in imaging technology. The governance con- 
tinues to reflect its roots of combining indus- 
try and medical experts who are interested in 
providing the best technology that can be put 
into commercial products. 


22.2.3.3 Image Transmission, Storage, and Display

Digital image capture provides the opportu- 
nity to display and store the images in digital 
form. In the early days, the size of the images 
represented a challenge—the amount of data 
was quite large relative to the capacity of 
storage devices. As a consequence, there was 
intense interest in using compression methods 
that could reduce the amount of storage that 
was required—as well as increase the speed of 
network transmission of images. Even with
compression, the amount of storage used for
images is quite large relative to non-image data 
stored in a hospital. A hospital must there- 
fore carefully consider how images are stored. 
In the early days of Picture Archiving and 
Communications Systems (PACS), there was 
little choice about how and where images were 
stored, because the storage system was tightly 
integrated with the display and transmission. 
This was done because the high demands on 
storage, transmission, and display all required 
special hardware. As computer technology 
caught up with medical image sizes, there was 
less need for specialized versions of networks, 
archives, and displays. 

The military was an early adopter/driver of
PACS technology, and released an RFP in the
1990s requiring that ‘any image be displayed
anywhere on the network within 2 seconds.’
To address this requirement, early PACS utilized
either uncommon standard technologies
or proprietary networking methods to provide
high bandwidth transmission. An example of
proprietary transmission technology is the
PACS developed by LORAL, which leveraged
technology developed for its defense applications.
Its network was a hybrid of (standard)
10 Mbps Ethernet, which provided control
signaling, and a (proprietary) unidirectional
hub-and-spoke optical network that had lossless
compression built into the network card. The
optical network signaled at 100 Mbps, and
because it was unidirectional, it routinely realized
its nominal speed. Other vendors utilized
FDDI, which was also an optical fiber network
that signaled at 100 Mbps. However, its handling
of contention was much less effective,
and its performance suffered. Today, standard
Ethernet signaling at either 100 Mbps
or 1 Gbps can provide adequate performance,
as long as reasonable attention is paid to network
layout and implementation.
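
To see why the 2-second display requirement pushed early PACS toward 100 Mbps links, it helps to estimate idealized transfer times. The following is a minimal sketch: the function name and the `efficiency` parameter (a stand-in for protocol overhead and contention) are illustrative assumptions, and the image size assumes a 5,000,000-pixel radiograph stored at 2 bytes per pixel.

```python
def transfer_seconds(num_bytes, link_mbps, efficiency=1.0):
    """Idealized time to send num_bytes over a link signaling at link_mbps.

    efficiency is the assumed fraction of the nominal rate actually
    achieved (1.0 means the link realizes its nominal speed).
    """
    bits = num_bytes * 8
    return bits / (link_mbps * 1_000_000 * efficiency)


# One uncompressed radiograph: 5,000,000 pixels x 2 bytes (assumed).
radiograph_bytes = 5_000_000 * 2

for mbps in (10, 100, 1000):
    seconds = transfer_seconds(radiograph_bytes, mbps)
    print(f"{mbps:5d} Mbps: {seconds:6.2f} s")
```

With these assumptions a single radiograph needs 8 seconds at 10 Mbps but 0.8 seconds at 100 Mbps, so only the faster link can meet a 2-second target, and only if it comes close to its nominal rate.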

While the display component of PACS
drove networking advances in the 1990s, the
migration of PACS data from one system to
another during an upgrade drove the next
major change: the Vendor Neutral Archive,
or VNA. In the early days of PACS, updating
the system to take advantage of new workstation
or network technology meant that the
whole system, including the archive, needed
to be updated or migrated. Because the data
were not stored in a standard format, it was
necessary to get the cooperation of the vendor
to migrate the data to the new system (which
might be from another vendor!). Because
workstations change rapidly, but archive
contents do not, there was perceived value in
separating these two functions (Erickson and
Hangiandreou 1998). Today, several companies
sell VNAs that
leverage the DICOM standard. These allow a
wide variety of image-producing and image-consuming
systems to access the archive in
a standard fashion. They also enable storing
images from outside the radiology department
on the same infrastructure. In fact, the
use of a VNA is now the norm rather than
the exception at any major hospital system
because it allows sharing of resources across
departments.

Because the image datasets are quite large,
there is interest in finding ways to reduce storage
requirements. Image compression does
exactly this, in one of two ways. Lossless
compression methods encode
redundancy in the image in a way that allows
the original to be exactly reproduced. Lossy
(or irreversible) compression produces an
image that is visually similar to the original.
Exactly how similar depends on the algorithm
and user-selectable settings that reflect the
trade-off between fidelity and compression
ratio (the ratio of original size to compressed
size). The major challenge is that one cannot
select a single setting that reliably yields images
that are not visibly altered and also achieves
a good compression ratio. Lossless compression
methods typically achieve compression
ratios of about 2.5:1, while lossy compression
can achieve as much as 40:1 compression
without a perceptible or diagnostic loss,
for certain types of images. While the size of
imaging examinations continues to increase,
the decrease in storage cost is more rapid,
lessening the demand for lossy compression.
The use of lossy compression is more widely
accepted in non-radiology specialties, such as
cardiology and pathology, in part because of
the greater uniformity of image characteristics,
allowing easy specification of acceptable
ratios. A key goal of lossy compression is that
it not have an adverse impact on diagnostic
value to the human or to computer-aided diagnostic
(CAD) algorithms (Zheng et al. 2000). Thus,
lossy methods are usually tuned to the specific
diagnostic task so as not to have adverse
impacts. In fact, monitoring the performance
of CAD as the ratio is changed is one method
to select the optimal compression ratio.
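
The distinction between lossless and lossy compression, and the definition of compression ratio given above, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the image is synthetic, the general-purpose `zlib` codec stands in for the image-specific codecs used in practice, and the "lossy" step is a crude quantization (discarding the 4 low-order bits of each pixel), not a real lossy image codec. The ratios it prints say nothing about the 2.5:1 and 40:1 figures quoted in the text.

```python
import array
import random
import zlib


def synthetic_image(width=256, height=256, seed=0):
    """Return 16-bit pixels: a smooth gradient plus a little noise."""
    rng = random.Random(seed)
    return array.array(
        "H",
        (x * 8 + y * 4 + rng.randint(0, 15)
         for y in range(height) for x in range(width)),
    )


def compression_ratio(original, compressed):
    """Ratio of original size to compressed size, as defined in the text."""
    return len(original) / len(compressed)


pixels = synthetic_image()
raw = pixels.tobytes()

# Lossless: the original bytes are recovered exactly.
lossless = zlib.compress(raw, 9)
assert zlib.decompress(lossless) == raw

# Lossy (illustrative only): clear the 4 low-order bits, then compress.
quantized = array.array("H", (v & ~0xF for v in pixels))
lossy = zlib.compress(quantized.tobytes(), 9)
max_error = max(a - b for a, b in zip(pixels, quantized))

ratio_lossless = compression_ratio(raw, lossless)
ratio_lossy = compression_ratio(raw, lossy)
print(f"lossless ratio {ratio_lossless:.2f}:1")
print(f"lossy ratio    {ratio_lossy:.2f}:1 (max pixel error {max_error})")
```

The quantization step makes the trade-off explicit: discarding more bits raises the ratio but also raises the maximum pixel error, which is the fidelity setting a user would have to choose for a given diagnostic task.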

Early PACS also required specialized display
devices. At the time, standard computer
displays were often 640 × 480 pixels. Medical images
were often more than 2048 pixels in each
direction. Liquid crystal technology for large
displays had not yet been developed, meaning
that the displays were large cathode ray tubes.
These displays were heavy, produced
much heat, and degraded rather rapidly.
Nearly all were monochrome. Imaging also
required a higher luminance for detection of
subtle gray level distinctions than was available
with consumer-grade displays. Today, flat
panel technology that meets the demands of
radiological interpretation is widely available
at reasonable prices. We note here that while
consumer-grade displays can be used, it is
important that quality displays are used with
appropriate calibration. A DICOM committee
has established display requirements
for medical purposes (ACR-NEMA 2006).
Medical grade monitors typically have hardware
built into the display to perform such
calibration in an automated way.


22.2.4 Integration with Other Healthcare Information

22.2.4.1 Radiology Information Systems (RIS)


A Radiology Information System, or RIS, is
responsible for much of the text information
in a radiology department. Core functions of
a RIS include capturing the interpretation
for a given examination and recording the status
of an imaging examination. A RIS may do
more, depending on what other systems are
available and preferred for a given situation:
functions that might also be performed
by a RIS include ordering, scheduling, and
billing. These functions are usually performed
by the HIS or EHR system when one is available,
such as in a hospital or large outpatient practice;
they are usually performed by the RIS
where there is no HIS or EHR (such as
many outpatient imaging centers).


22.2.4.2 Speech Recognition 


In the past, the RIS would provide a means
for a transcriptionist to type the text of the
report into the RIS as the radiologist dictated
it (either live or via a dictation system). Today,
the vast majority of radiologists use speech
recognition to convert their speech into text.
In some cases, the text is immediately reviewed
by the radiologist and approved as final. This
model has the advantage of rapid turn-around
time (the time from when the examination
is ready to be reported to the time it
has a final report available). In this model, a
separate application (the speech recognition
system) converts the audio to text and sends it
to the RIS as an HL7 message, along with the
final (or other appropriate) status. In other
cases, a ‘correctionist’ reviews the text created
by the speech recognition system, and corrects
it based on listening to the audio file. In
this case, the radiologist must then review the
text again to make it final, which lengthens
the turn-around time.
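
As a rough illustration of how a report and its status might travel from the speech recognition system to the RIS, the sketch below assembles a pipe-delimited HL7 version 2 result message. It is an assumption-laden sketch, not a description of any particular product: the choice of an ORU^R01 message, the use of OBR-25 and OBX-11 to carry the status ("F" for final, "P" for preliminary), and every application name, identifier, and timestamp are hypothetical. A real interface follows a site-specific HL7 profile.

```python
FIELD_SEP = "|"


def escape(text):
    """Escape HL7 v2 delimiter characters that occur in free text."""
    for char, code in (("\\", "\\E\\"), ("|", "\\F\\"), ("^", "\\S\\"),
                       ("&", "\\T\\"), ("~", "\\R\\")):
        text = text.replace(char, code)
    return text


def segment(name, fields):
    """Build one segment from {1-based field number: value}."""
    last = max(fields)
    values = [fields.get(i, "") for i in range(1, last + 1)]
    return FIELD_SEP.join([name] + values)


def report_message(accession, patient_id, report_text, status="F",
                   timestamp="20240101120000"):
    """Return a sketch of an ORU^R01 message carrying one report."""
    # MSH is written out by hand because MSH-1 is the separator itself.
    msh = FIELD_SEP.join([
        "MSH", "^~\\&", "SPEECH_APP", "RAD", "RIS", "RAD",
        timestamp, "", "ORU^R01", "MSG0001", "P", "2.3",
    ])
    pid = segment("PID", {1: "1", 3: patient_id})
    obr = segment("OBR", {1: "1", 3: accession, 25: status})
    obx = segment("OBX", {1: "1", 2: "TX", 3: "REPORT^Radiology report",
                          5: escape(report_text), 11: status})
    return "\r".join([msh, pid, obr, obx])


message = report_message("ACC123", "PAT42", "No acute findings.")
print(message.replace("\r", "\n"))
```

In the correctionist workflow described above, the same message could first be sent with a preliminary status and then resent as final once the radiologist has approved the corrected text.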

There are two major advantages to using
speech recognition. First, it enables rapid
turn-around time. Before speech recognition,
turn-around times of 1 week were common,
but now, turn-around times of less than 1 hour
are common (Hart et al. 2010; Krishnaraj et al.
2010; Mattern et al. 1999). This improvement
in turn-around time undoubtedly improves the
quality of care provided to patients. Second,
it reduces staffing needs for radiology departments
or hospitals by reducing the number
of transcriptionists/correctionists needed. Of
course, some decrease in productivity is commonly
observed for radiologists at the time of
implementation, which reduces the economic
benefit (Langer 2002; Strahan and Schneider-Kolsky
2010).

Over the years, many efforts have attempted
to enable radiologists to generate templated
or structured reports (SR) from a selection
of choices in forms, drop-down entries in the text,
macros that produce predetermined text phrases,
and other techniques. Some of these are now used in specific
situations, especially where reports have a
largely anticipated format and structure, e.g.,
mammography and obstetrical ultrasound,
and macros are used in conjunction with
speech recognition approaches for certain
“canned” sections or reports. Such reports
enable efficient capture of the information for
later data analysis, and there are also several
published reports showing that most referring
physicians prefer templated reports because of the
consistency in where key information can be
found. In fact, some ultrasound devices will
send key measurements made during image
acquisition to the reporting system to
‘pre-populate’ a structured report.


22.2.4.3 Computer-Aided Diagnosis (CAD)

We have described the interpretation task as
consisting of detection, description, and diagnosis.
In some cases, the detection task can
be quite challenging, particularly for screening
tasks involving mammography and chest
X-rays, because the incidence of disease is rather
low and the volume is high. Particularly at the end of
a long shift, human observers probably have
decreased performance due to fatigue. For
these cases, computer algorithms that highlight
suspicious regions of an image may be
useful to assure that important findings are not
missed. Some have called this role ‘computer-aided
diligence’.

During the first decade of computer-aided
diagnosis (CAD), the algorithms searched
for specific features that radiologists thought
were important, and therefore the value was
either minimal (diligence) or limited to helping
readers less familiar with the importance of
various features (Gur and Sumkin 2006). In
the case of mammography, the lack of a clear
benefit for experienced readers, combined
with the reduction in productivity, has resulted
in its near complete abandonment, and also
its loss of reimbursement. This is a sobering
example that computer assistance must be
implemented in ways that truly add value.

Another role for CAD is in assisting with
diagnosis (the lesion is detected, but it is unclear
whether it is cancer or infection). A common
application is determining the nature of lesions
on high resolution CT images of the chest.
Figure 22.6 shows an image of the output
of the experimental algorithm CALIPER,
rendered as a 3D image, to show the distribution
and change of different degrees of interstitial
lung disease in a patient.

More recently, a machine learning technique
referred to as ‘Deep Learning’ has
become popular. This technique uses neural
networks and derives its ‘deep’ name from
the many (50+) layers typically used in the
network. In addition to having many more layers
than traditional neural networks, some forms
include several convolutional layers (hence
the name convolutional neural networks or
CNNs) at the input that learn the features
that produce the best output. This means that
the network learns the best features to use, rather than
requiring a human to pre-compute them.
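
The operation performed by a single convolutional filter can be sketched in a few lines. The example below is a simplification for illustration: the 3 × 3 kernel is chosen by hand to respond to vertical edges, which is exactly the kind of pre-computed feature that older CAD relied on. In a CNN the same sliding-window arithmetic is used, but the kernel weights are learned during training rather than specified by a person. The function name and the toy image are assumptions.

```python
def correlate2d_valid(image, kernel):
    """Slide kernel over image (no padding) and sum elementwise products.

    This is the sliding-window operation that deep learning libraries
    call a 2D convolution.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(k_rows) for j in range(k_cols))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# Toy 5 x 5 image: dark on the left, bright on the right.
image = [[0, 0, 10, 10, 10] for _ in range(5)]

# Hand-chosen kernel that responds to left-to-right intensity increases.
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

feature_map = correlate2d_valid(image, vertical_edge)
for row in feature_map:
    print(row)  # each row is [30, 30, 0]
```

The feature map is large where the window straddles the edge and zero over the uniform bright region, which is the sense in which a filter "detects" a feature.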

Deep Learning methods have proven very 
effective for many of the traditional CAD 
tasks (mammography and chest CT lesion 
detection) and for classification of lesions. 
They also excel at automated organ segmen- 
tation (U-Nets) that can be useful for many 
tasks, and this easy access to quantitative data 
will likely increase the quantitative content in 
radiology reports. 

Perhaps more interesting is that CNNs 
have also proven effective at predicting 
molecular properties of tissues using routine 
images, even in cases where humans are not 
able to identify any differentiating features. 
Examples include the prediction of IDH-1 
mutation, 1p19q chromosomal status, and 
MGMT methylation in brain tumors using 
routine T2-weighted MR. 


22.2.4.4 Advanced Visualization 


Fig. 22.6 Image of the output of the experimental algorithm CALIPER, rendered as a 3D image, to show the
distribution and change of different degrees of interstitial lung disease in a patient. (a) shows the overall disease
burden of the lungs; (b) shows the disease category as a 3D rendering of the lungs; (c) shows the disease category on
a coronal section of the lungs; (d) shows the disease category labels

CT and MR scanners provide images that can
be thought of as 3D images, even if they are
not always truly acquired as 3D, but rather, as
a series or ‘stack’ of 2D images. Some imaging
devices can acquire a 3D image directly and
repeatedly, thus producing a 4D image (time
is the fourth dimension). In particular, cardiac
imaging benefits from 4D capability so that 
the beating heart can be examined throughout 
the cardiac cycle. Such 3D and 4D data sets 
are large, and proper demonstration of the 
important findings requires visualization of 
the data. For instance, if one wishes to see a 
skeletal finding using CT, one can set a thresh- 
old to select bony structures, and then render 
it using traditional computer rendering meth- 
ods. This can be done at multiple time points 
to produce movies of moving structures. 
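
The thresholding step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the volume is a tiny hand-made array of CT values in Hounsfield units, and the threshold of 300 is an assumed cut-off for bone chosen for the example, not a value given in the text.

```python
BONE_THRESHOLD_HU = 300  # assumed cut-off; chosen only for illustration


def threshold_mask(volume, threshold=BONE_THRESHOLD_HU):
    """Return a binary mask (1 = at or above threshold) for a 3D volume.

    volume is a list of slices, each slice a list of rows of CT values.
    """
    return [
        [[1 if value >= threshold else 0 for value in row] for row in slice_]
        for slice_ in volume
    ]


# Toy volume: 2 slices x 2 rows x 3 columns of CT values.
volume = [
    [[-1000, 40, 700], [30, 1200, 60]],
    [[-50, 350, 20], [900, 10, -1000]],
]

mask = threshold_mask(volume)
bone_voxels = sum(v for slice_ in mask for row in slice_ for v in row)
print(mask)
print("voxels selected:", bone_voxels)  # 4

# For 4D data, the same function is applied to each time point.
frames = [volume, volume]
masks_over_time = [threshold_mask(frame) for frame in frames]
```

The resulting mask is what a rendering method would then display. Applying the function to each time point yields the sequence of masks from which a movie can be produced.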

The great challenge in medical visualization
is segmentation: deciding whether a
volume element (voxel) is a part of the structure
of interest or not. In the case of a CT
image of bone, segmentation is rather easy.
If intravascular contrast is administered during
the examination, it becomes fairly
straightforward to select vessels (arteries and/
or veins depending on the timing). Soft tissue
organs like livers, kidneys, and muscles have
been more challenging, but are now much
more feasible with deep learning techniques.
A description of the rendering algorithms and
their trade-offs is provided in Chap. 10.

A recent advance in visualization tools is
to have the computation done on a central
server, with interactive segmentation and
rendering viewed using web browsers. This
allows a much larger population of physicians
to have access, and can be valuable for
surgeons contemplating surgery, as well as for
patient education. Figure 22.7 provides an
example advanced visualization of the spine,
distributed via a web client, which allows
surgeons to better plan and review treatment
options in their own office, with the patient.

Fig. 22.7 Example advanced visualization of the abdominal aorta, distributed via a web client, which allows
vascular surgeons to better plan surgical options in their own office


22.2.4.5 Advanced Reporting 

While textual reports have served medical practice
well for the past century, there are opportunities
to improve reporting. Multimedia
reports provide a richer representation of the 
information present in the examination, and 
might include links from portions of the text 
report to specific images and locations on the 
images, moving images (‘video’), or audio files 
such as heart sounds. In some diseases, it
can be important to have specific measure- 
ments made, and possibly tracked over time. 
If these measurements are encoded in a spe- 
cific way (using Structured Reporting or SR), 
it will be easier to extract and use that infor- 
mation elsewhere in the medical record, and 
for other purposes like research. Lexicons, 
such as RadLex, can be helpful in conveying 
some of the information. There is great interest
in routinely collecting more quantitative
information from images, because quantitation
is receiving increased clinical attention
for a growing number of diseases.

An SR is produced when the concepts of 
a report are represented using coded termi- 
nology. There is a DICOM specification for 
SR, though adoption has been limited. This 
is because there are currently no efficient user
interfaces for creation of structured reports 
in most areas of radiology. BIRADS is a 
standardized way to report breast imaging, 
with an accepted scale for findings. However, 
those findings are usually not also stored in 
an encoded format, but rather with highly 
consistent text. Because BIRADS has truly 
enabled much better care of patients and 
research, other areas of imaging are adopt- 
ing standardized reporting (e.g. TIRADS 
for thyroid, LIRADS for liver, PIRADS for 
prostate, just to name a few). The most widely 
adopted example of SR is probably the send- 
ing of specific measurements from ultrasound 
scanners to reporting systems. Of course, this 
is not the entire content of the report, and the 
radiologist usually adds and may edit the SR 
content. Another example where DICOM SR 
may be used is for reporting radiation dose, 
but again, that is not even the main content
of the radiological report. Some would also
argue that DICOM SR is more correctly
thought of as structured results rather than
structured reports.
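The idea of encoding measurements with coded terminology so that they can be extracted and tracked over time can be sketched as follows. This is only an illustration of the concept and is not the DICOM SR encoding; the class names, the lexicon name `EXAMPLE-LEXICON`, and the code `X0001` are hypothetical placeholders rather than real codes.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class CodedConcept:
    scheme: str   # name of the terminology (placeholder here)
    code: str     # placeholder identifier, NOT a real code
    meaning: str  # human-readable label


@dataclass(frozen=True)
class Measurement:
    concept: CodedConcept
    value: float
    unit: str
    study_date: str  # ISO 8601 date, so string order equals date order


def track(measurements, code):
    """Return (date, value) pairs for one coded concept, ordered by date."""
    hits = [m for m in measurements if m.concept.code == code]
    return [(m.study_date, m.value)
            for m in sorted(hits, key=lambda m: m.study_date)]


LESION_DIAMETER = CodedConcept("EXAMPLE-LEXICON", "X0001",
                               "Lesion long-axis diameter")

# Made-up measurements from two examinations of the same patient.
report = [
    Measurement(LESION_DIAMETER, 14.0, "mm", "2021-06-01"),
    Measurement(LESION_DIAMETER, 11.5, "mm", "2021-01-15"),
]

print(track(report, "X0001"))
print(json.dumps([asdict(m) for m in report], indent=2))
```

Because each value carries a code rather than free text, a downstream system can select it by exact match instead of parsing report prose.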


22.2.4.6 Workflow Management 
(Including Dashboards) 

The ability to monitor and control events in 
an imaging department is critical to efficient 
and effective operation. Dashboards have been 
applied in many business arenas to give quick 
visual displays of Key Performance Indicators 
(KPIs) for that business. Such dashboard 
technology is now becoming widely used 
within imaging departments for monitoring 
such KPIs as report turn-around time, patient 
waiting time, number of days out to schedule 
examinations, and revenue days-outstanding. 
The dashboards give people a quick view 
of what is happening, and can alert them to 
problem areas. Figure 22.8 is an example
of a radiology dashboard that shows important
departmental metrics, including report
turn-around time, compliance with notification
requirements, and patient waiting times.

Most dashboards provide a mechanism 
to ‘drill down’ on a particular measurement. 
For instance, if the patient waiting time moni- 
tor goes ‘red’, clicking on that indicator light 
might show the waiting time by location 
(maybe just one facility is causing the prob- 
lem), total patient volume (maybe the site is 
experiencing a spike in patient volume), or 
examination time (maybe the complexity of 
examinations is going up). Such informa- 
tion is critical to enabling a timely and effec- 
tive response to performance that is outside 
expected service levels. 
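The 'drill down' behavior described above can be illustrated with a small sketch. The visit data, location names, and the 20-minute target `WAIT_TARGET_MIN` are made-up assumptions; a real dashboard would draw these from departmental systems.

```python
from collections import defaultdict
from statistics import mean

WAIT_TARGET_MIN = 20  # assumed service level, in minutes

# (location, minutes waited) for each patient visit; made-up sample data
visits = [
    ("Main Campus", 12), ("Main Campus", 18), ("Main Campus", 15),
    ("North Clinic", 35), ("North Clinic", 41),
    ("South Clinic", 10), ("South Clinic", 14),
]


def status(avg_wait, target=WAIT_TARGET_MIN):
    """Map an average waiting time to an indicator color."""
    return "red" if avg_wait > target else "green"


def overall_indicator(visits):
    """Top-level KPI: average wait across all locations, plus its color."""
    avg = mean(wait for _, wait in visits)
    return avg, status(avg)


def drill_down_by_location(visits):
    """Second-level view: the same KPI broken out per location."""
    by_location = defaultdict(list)
    for location, wait in visits:
        by_location[location].append(wait)
    return {loc: (mean(waits), status(mean(waits)))
            for loc, waits in by_location.items()}


print(overall_indicator(visits))
for location, (avg, color) in drill_down_by_location(visits).items():
    print(location, avg, color)
```

In this sample the overall indicator is red, and the per-location view shows that a single facility accounts for it, which is the situation the text describes.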

The popularity of deep-learning-based 
artificial intelligence (AI) tools for image 
interpretation has also driven the need for 
workflow orchestration: the application of 
workflow management technology to collect 
data needed by an AI algorithm, assure its 
execution on the images and other data, and 
then collect and distribute the results. There
are many reports showing that AI can improve
efficiency (such as for quantitation of organ
or tumor size) and the quality of
care in radiology departments.

Fig. 22.8 Example of a radiology dashboard that shows important departmental metrics, including report
turn-around time, compliance with notification requirements, and patient waiting times
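The three orchestration steps named above (collect the needed data, assure execution, then collect and distribute results) can be sketched as a simple pipeline. All function names, the study structure, and the destination names are hypothetical, and `run_algorithm` is a stand-in that performs no image analysis.

```python
def collect_inputs(study):
    """Step 1: gather the images and other data the algorithm needs."""
    return {"images": study["images"], "age": study["patient_age"]}


def run_algorithm(inputs):
    """Step 2: stand-in for an AI tool; it only counts the images."""
    return {"finding": "placeholder result",
            "images_processed": len(inputs["images"])}


def distribute(result, destinations):
    """Step 3: deliver the result to each downstream system."""
    return {name: deliver(result) for name, deliver in destinations.items()}


def orchestrate(study, destinations):
    """Run the three steps in order and return a receipt per destination."""
    inputs = collect_inputs(study)
    result = run_algorithm(inputs)
    return distribute(result, destinations)


# Two in-memory 'systems' standing in for real result consumers.
pacs_inbox, report_inbox = [], []


def send_to_pacs(result):
    pacs_inbox.append(result)
    return "stored"


def send_to_reporting(result):
    report_inbox.append(result)
    return "queued"


study = {"images": ["slice-001", "slice-002", "slice-003"], "patient_age": 57}
receipts = orchestrate(study, {"PACS": send_to_pacs,
                               "reporting": send_to_reporting})
print(receipts)
```

Keeping the steps as separate functions means a different algorithm or an additional destination can be substituted without touching the rest of the workflow.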


22.2.4.7 Teleradiology 

Teleradiology is the practice of interpreting 
images at a location that is physically distant 
from the place where the images are collected. 
Initially, this referred to transmitting images 
from the hospital to the radiologist’s home in 
the middle of the night so that the physician 
did not need to drive in to see the image onsite. 
While this is still done, it is now common for a 
hospital to contract with a ‘nighthawk’ service 
that will provide these night-time interpreta- 
tions. A nighthawk service contracts with 
many hospitals—enough to keep a team of 
radiologists busy during the night. Having a 
team continuously operating is usually more 
efficient, and allows for specialization of 
image interpretation. Teleradiology is now 
also practiced during the day to balance clini- 
cal workload and to provide specialized inter- 
pretation on a routine basis. The technology to 
rapidly transmit images across large distances 
is widely available and inexpensive. The great- 
est challenges to teleradiology are licensing/ 
credentialing issues, especially if films are read 
across state lines or internationally (radiolo- 
gists may be licensed to practice where they 
review the films but not in the location where 
the patients were located when the images 
were acquired). The term teleradiology is now
heard less often, mostly because transmission
of images to the best site for interpretation
has become routine and is implemented as
a part of the PACS network.


22.2.4.8 Enterprise Integration 
(Including HL7, Decision 
Support) 
Medicine is an information-rich business, 
and providing access to the relevant informa- 
tion in a timely fashion is critical to success. 
Integration of systems with the relevant pieces 
of information is necessary, and in hospitals 
this is generally done with HL7 messages (see
Chap. 7). Both RIS and PACS will typically
be able to use HL7 messages, and an increasing
number of EHRs are ‘image enabled’, meaning
that they can receive DICOM images and
display them for clinicians.
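To make the message-based integration above more concrete, the sketch below parses a small HL7 version 2 style order message, in which segments are separated by carriage returns, fields by `|`, and components by `^`. The message content is entirely fictitious, and the parser is a deliberate simplification (one segment of each type, no escape handling), not a substitute for a real interface engine.

```python
# A made-up HL7 version 2 style message; all identifiers and names are fictitious.
MESSAGE = "\r".join([
    "MSH|^~\\&|RIS|HOSP|PACS|HOSP|202101151030||ORM^O01|MSG0001|P|2.3",
    "PID|1||123456^^^HOSP||DOE^JANE",
    "OBR|1|ORD789||CTCHEST^CT Chest",
])


def parse_segments(message):
    """Split a message into {segment name: list of fields}.

    Simplification: assumes each segment type appears only once.
    """
    segments = {}
    for line in message.split("\r"):
        if line:
            fields = line.split("|")
            segments[fields[0]] = fields
    return segments


def summarize_order(message):
    """Pull out the patient identifier, patient name, and ordered procedure."""
    seg = parse_segments(message)
    family, given = seg["PID"][5].split("^")[:2]   # PID-5: patient name
    return {
        "patient_id": seg["PID"][3].split("^")[0],  # PID-3: patient identifier
        "patient_name": f"{given} {family}",
        "procedure": seg["OBR"][4].split("^")[1],   # OBR-4: service identifier
    }


print(summarize_order(MESSAGE))
```

A RIS or PACS receiving such a message can use these few fields to match incoming images to the correct patient and order.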


22.2.4.9 Decision Support 

Because medical imaging has been a major 
component of increases in total healthcare 
costs, there has been much attention paid to 
assuring that only necessary examinations 
are performed. To help assure this, decision-support
systems that guide proper ordering
have been developed and have been shown
to affect utilization of imaging
studies (Sistrom et al. 2009). Such systems
have been shown to decrease the total number
of examinations performed, and in particular
the number of examinations that
appear to be unnecessary. In addition to alerting
the user to a potentially improper order,
such systems often provide educational materials
to help the ordering physician understand
when and what imaging examinations might
be appropriate for the given indication. They
can also provide management
reports to improve the understanding of
ordering practices.


22.3 Imaging in Other 
Departments 


22.3.1 Cardiology 


Cardiac imaging has many similarities to 
radiological imaging, and in many cases is 
performed either by radiology departments 
or in conjunction with radiology. The pri- 
mary imaging modalities for cardiology 
include echocardiography (ultrasound), cath- 
eterization (interventional/vascular, involv- 
ing fluoroscopy and angiography, i.e., vessel 
visualization via contrast dye administration),
MR, CT, and PET. The workflow can
be similar, but can be different in those cases 
where the imaging is performed by the same 
department and even by the same individual 
as the person who ordered it. In such cases, 
there can be less formal ordering, schedul- 
ing, and reporting. However, as the imaging 
is increasingly a part of the general enter- 
prise, such informality will become a greater 
challenge. 

Cardiology has been aggressive in its use of
lossy compression. This is primarily because 
the nature of cardiac imaging is much more 
stereotyped (allowing better prediction of 
appropriate compression). The primary focus 
is the heart, whereas in radiology many dif- 
ferent organs with very different appearances 
can challenge compression methods. The 
echocardiographic and interventional images 
are also much more like video—usually being 
motion-oriented rather than focused on static 
capture, and that redundancy in content 
enables effective compression. In fact, the 
major cardiovascular societies have published 
and supported the use of specific compression 
technologies and settings for cardiac imaging 
(Simon et al. 1994; ACC/ACR/NEMA Ad 
hoc Group 1995). 
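The role of frame-to-frame redundancy can be illustrated with a small experiment. Note the assumptions: the sketch uses lossless `zlib` compression and simple frame differencing purely to expose the redundancy, whereas the clinical codecs discussed above are lossy and far more sophisticated; the synthetic frames are random bytes with a small moving block, not real images.

```python
import random
import zlib

FRAME_SIZE = 64 * 1024  # bytes per synthetic frame
N_FRAMES = 10

rng = random.Random(0)
background = bytes(rng.randrange(256) for _ in range(FRAME_SIZE))


def make_frame(i):
    """Background plus a small bright block whose position depends on i."""
    frame = bytearray(background)
    start = 1000 * i
    frame[start:start + 200] = bytes([255]) * 200
    return bytes(frame)


def delta(prev, cur):
    """Byte-wise change from the previous frame (mostly zeros here)."""
    return bytes((c - p) % 256 for p, c in zip(prev, cur))


frames = [make_frame(i) for i in range(N_FRAMES)]

# Option A: compress every frame independently, ignoring its neighbors.
independent = sum(len(zlib.compress(f)) for f in frames)

# Option B: store the first frame, then only the change between frames.
temporal = len(zlib.compress(frames[0]))
temporal += sum(len(zlib.compress(delta(p, c)))
                for p, c in zip(frames, frames[1:]))

print("independent:", independent, "bytes")
print("with frame differencing:", temporal, "bytes")
```

Because consecutive frames are nearly identical, the differences are almost all zeros and compress far better than the frames themselves.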

Because the focus is primarily on the heart 
and its function, cardiac imaging is more 
advanced in structured reporting. The primary
variables in cardiac imaging (left
ventricular volumes, stroke volumes)
are of interest in most cases, and are
routinely measured and reported. This has 
driven the acceptance of structured report- 
ing for the common measurements in cardiac 
imaging—particularly echocardiography. 
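As a small worked example of such routinely reported measurements, stroke volume is the difference between the end-diastolic and end-systolic left ventricular volumes. The sketch below also derives the ejection fraction, a commonly reported ratio that the text does not name; the input values are illustrative only.

```python
def stroke_volume(edv_ml, esv_ml):
    """Stroke volume (mL) = end-diastolic volume - end-systolic volume."""
    if edv_ml <= 0 or not 0 <= esv_ml <= edv_ml:
        raise ValueError("expected 0 <= ESV <= EDV and EDV > 0")
    return edv_ml - esv_ml


def ejection_fraction(edv_ml, esv_ml):
    """Fraction of the end-diastolic volume ejected in one beat."""
    return stroke_volume(edv_ml, esv_ml) / edv_ml


# Illustrative values only, not taken from a patient.
edv, esv = 120.0, 50.0
print(f"SV = {stroke_volume(edv, esv):.0f} mL")
print(f"EF = {ejection_fraction(edv, esv):.1%}")
```

Because the same few quantities are computed for nearly every examination, they map naturally onto fixed fields in a structured report.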


22.3.2 Obstetrics and Gynecology 


Obstetrical and gynecological imaging is 
rather like cardiac imaging, except that it is 
nearly always ultrasound imaging. Much like 
cardiac imaging, there is a well-defined and
accepted set of measurements and observa- 
tions expected for the routine obstetric exam, 
and as such, structured reports for these find- 
ings are widely used. Estimates of fetal ges- 
tational age and development are typically 
automatically computed, on the basis of mea- 
surements using well-tested prediction models. 


22.3.3 Intraoperative/Endoscopic
Visible Light


It is now common to capture still images and 
video of endoscopic procedures as well as 
traditional open surgical procedures. These
images are valuable for documenting the
important findings (or lack thereof) during 
a procedure. They are also useful for edu- 
cational purposes, including informing the 
patient of the findings and procedures carried 
out. Some surgeons have suggested that medi- 
colegal demands will require routine capture 
of entire surgical procedures. 

Such images can be more graphic and 
revealing than radiological images, and in 
some cases have created an expectation of
an additional level of privacy protection.
In one institution, for instance, all
photographic images from the Plastic Surgery 
department are protected from access—only 
physician members of the Plastic Surgery 
department can routinely view those images, 
with a process for granting temporary access 
to other care providers (Erickson et al. 2007). 
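An access policy of this kind, with department members allowed by default and other providers given time-limited access, might be sketched as follows. The class name, user names, and the 24-hour default duration are assumptions made for illustration and do not describe the cited institution's actual system.

```python
from datetime import datetime, timedelta


class ProtectedImageStore:
    """Restrict viewing to department members plus temporary grantees."""

    def __init__(self, department_members):
        self._members = set(department_members)
        self._temporary = {}  # user -> time at which access expires

    def grant_temporary_access(self, user, now, hours=24):
        """Give a non-member access for a limited period (assumed 24 h)."""
        self._temporary[user] = now + timedelta(hours=hours)

    def can_view(self, user, now):
        """Members may always view; others only before their grant expires."""
        if user in self._members:
            return True
        expiry = self._temporary.get(user)
        return expiry is not None and now < expiry


store = ProtectedImageStore({"dr_plastics"})
t0 = datetime(2021, 1, 15, 9, 0)

print(store.can_view("dr_plastics", t0))    # department member
print(store.can_view("dr_consult", t0))     # no grant yet
store.grant_temporary_access("dr_consult", t0)
print(store.can_view("dr_consult", t0 + timedelta(hours=1)))
print(store.can_view("dr_consult", t0 + timedelta(hours=25)))
```

Passing the current time in explicitly, rather than reading the clock inside the class, keeps the expiry rule easy to test.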


22.3.4 Pathology and Dermatology 


Pathology and dermatology have similar 
needs, except that dermatology includes pho- 
tographic visible light images of skin lesions. 
For these purposes, consumer-grade pho- 
tographs can be sufficient, but transport- 
ing those images (usually in JPEG form) to 
a medical-grade imaging system will usually 
require an import process. This process will 
require confidence about the accuracy of 
patient and site location information. Often, 
the JPEG images are then ‘wrapped’ with 
DICOM information to assure that the con- 
nection between a photograph and a patient 
exists at the file level, rather than via a link 
to a filename. Image viewers require color 
capability, but otherwise are not substantially 
different from what is provided in most radio- 
logical image viewers. If images are converted 
to DICOM, the archive system is usually able 
to store them without difficulty. 

Microscopic images represent a bigger 
challenge. At this point, there are two strategies
for the capture of microscope images: the first
is whole-slide digitization, in which the entire
specimen is digitized; the other is capture of
specific views that are of interest. In both
cases there is a need for multi-resolution viewing,
though more so in whole-slide imaging,
because the workflows are very different.
Whole-slide scanning is usually done prior to 
the pathologist reviewing the images, while the 
spot-capture is done by the pathologist at the 
time of viewing. In the former case, the com- 
puter performs the pan-zoom function, while 
in the latter case the optical microscope per- 
forms that function. In the former, the com- 
puter is a diagnostic device, while in the latter, 
it is used mostly for documentation purposes. 

Whole-slide scanning is of greater interest, 
because it has greater possibilities for improv- 
ing healthcare delivery by allowing the slides 
to go to the pathologist. It also represents a 
much greater challenge, because much larger 
amounts of data must be stored and the asso- 
ciated computer-based viewing application 
requires the ability to display low-resolution
and high-resolution images. Such images are
typically 1 GB in size. Computing low-resolution
images from high-resolution images can
be computationally expensive.
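One common way to support multi-resolution viewing, offered here as an assumption rather than as the approach of any particular product, is to precompute a 'pyramid' of successively halved images so that the viewer never has to derive a low-resolution view on demand. The sketch below uses 2 x 2 block averaging on a tiny grayscale image stored as nested lists.

```python
def downsample_2x(image):
    """Halve width and height by averaging each 2 x 2 block of pixels."""
    height, width = len(image), len(image[0])
    return [[(image[y][x] + image[y][x + 1]
              + image[y + 1][x] + image[y + 1][x + 1]) // 4
             for x in range(0, width - 1, 2)]
            for y in range(0, height - 1, 2)]


def build_pyramid(image):
    """Return [full resolution, 1/2, 1/4, ...] down to a single row or column."""
    levels = [image]
    while len(levels[-1]) > 1 and len(levels[-1][0]) > 1:
        levels.append(downsample_2x(levels[-1]))
    return levels


def pixel_count(image):
    return len(image) * len(image[0])


# Tiny 4 x 4 stand-in for a slide image.
slide = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 12, 12],
    [4, 4, 12, 12],
]

pyramid = build_pyramid(slide)
for level in pyramid:
    print(level)
print([pixel_count(level) for level in pyramid])
```

Each level has one quarter of the pixels of the level before it, so the whole pyramid costs roughly one third more storage than the full-resolution image alone, in exchange for avoiding repeated downsampling at viewing time.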

Another important issue differentiating 
pathology imaging from radiological imag- 
ing is that retrieval of old images is much less 
common in pathology. Follow-up of disease by
biopsy is much less common than with radiological
imaging, so comparison with priors
is much less common. Recall would usually 
only occur when there is a medicolegal issue, 
or possibly in case of disease recurrence or 
metastasis, where there is a need to compare 
the older sample with a newer one. 


22.4 Cross-Enterprise Imaging 


22.4.1 CD Image Exchange 


When images were stored on film, sharing 
images with another hospital required that the 
films be either physically transferred or cop- 
ied. Copying a film was labor-intensive and 
expensive. Therefore, it was standard practice 
to ‘loan’ films to other facilities when needed. 

With digital images, it is much easier to 
copy the digital data onto media like a com- 
pact disc (CD) and give that to the patient. 
There are well-accepted standards (DICOM) 
for how to store the images on a CD. However, 
it is challenging for most hospitals to use the
images on CDs. In some cases, the images 
can be imported into the PACS, but that can 
cause confusion about where the study was 
done, and can be challenging for the RIS to 
represent this (who ordered the CD exam, and 
where the report is located). In addition, there 
are important data integrity issues—as much 
as 0.1% of CDs have been shown to include 
images for patients other than the intended 
patient (Erickson 2011), leading to important 
health delivery and legal risks. 
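A basic safeguard against the data integrity problem described above is to compare the patient identifier carried by each image with the identifier of the intended patient before import. In the sketch below the header dictionaries stand in for metadata that would be read from the DICOM files on the CD; the file names and identifiers are made up.

```python
def find_mismatched_images(expected_patient_id, image_headers):
    """Return the file names whose patient ID differs from the expected one.

    A missing patient ID is treated as a mismatch.
    """
    return [header["file"] for header in image_headers
            if header.get("patient_id") != expected_patient_id]


# Made-up metadata for the images found on one CD.
cd_contents = [
    {"file": "IMG0001", "patient_id": "123456"},
    {"file": "IMG0002", "patient_id": "123456"},
    {"file": "IMG0003", "patient_id": "654321"},  # belongs to someone else
    {"file": "IMG0004"},                          # no identifier at all
]

problems = find_mismatched_images("123456", cd_contents)
if problems:
    print("Do not import automatically; review:", problems)
else:
    print("All images match the intended patient")
```

A check like this cannot catch images that were mislabeled at the source, so it complements rather than replaces confident patient identification.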


22.4.2 Direct Network Image 
Exchange 


The problems with CD image exchange noted 
above, as well as the time delays and costs, have 
driven many institutions to use internet trans- 
fer mechanisms. In cases where there is a high 
volume and a high level of trust, one can estab- 
lish virtual private networks (VPNs) that allow 
secure transfer between two institutions. While 
this allows rapid and low-cost transfer, it still 
requires confident patient identification, and 
a method for importing the images into some 
form of clinical viewer. It also requires that a hospital
set up an extensive network of VPNs, which
can be difficult to secure.

This has led to the creation of the ‘image 
sharing’ industry. This industry has developed 
internet tools that allow images to be securely 
transferred from one hospital to another. The 
transport mechanism is usually proprietary, 
so transfer between hospitals is most efficient 
if both use the same vendor system. However, 
nearly all provide a ‘Dropbox’-like method 
where a weblink can be used to either send 
or receive images. There are also efforts to 
develop standard interchange methods that 
are both efficient and secure. 


22.5 Future Directions for Imaging 
Systems 


The increasing capabilities of mobile devices 
and the increasing expectations of ready 
access to medical professionals have driven 
imaging onto mobile devices. At present, the
FDA has limited the use of such devices for 
diagnosis. On the other hand, these devices 
can be extremely useful for consultation on 
specific areas of an image, or when therapeu- 
tic options are being considered, or for patient 
communication. As the bandwidth and dis- 
play qualities improve, these devices will likely 
play an increasing role in both diagnosis and 
therapy planning. 

Cloud technology is becoming an impor- 
tant technology for image archival and trans- 
fer. The ability to leverage efficiencies of scale 
is an important economic driver that is push- 
ing many smaller imaging providers to use 
cloud-based storage. The use of cloud meth- 
ods for image exchange was also described 
above. The increasing use of
computation-intensive diagnostic aids has also driven the
use of cloud-based CAD tools. 

Phenome characterization (see Chap.
26) is becoming an important aspect of the
move to individualizing medicine, sometimes
referred to as ‘precision medicine’. Deep
learning is a technology that will likely cause
rapid advancement, as it allows both rapid
automated segmentation of all major organs
in an image and characterization of tissue
textures. The ability to predict genomic
information from such images will likely pro- 
duce a new generation of ‘precision imaging’ 
applications. 


Suggested Readings

Birkfellner, W. (2010). Applied medical image
processing: A basic course. CRC Press. As the
title says, this is an introductory book, with
many excellent explanations and example
code (mostly MATLAB).

Branstetter, B. F. (2009). Practical imaging infor- 
matics. New York: Springer. As its title implies, 
this book is a practice-oriented book primarily 
aimed at those responsible for implementing 
and maintaining a digital imaging practice. The 
format of the book is an outline with many 
practical tips from a wide variety of experts. 

Dougherty, G. (2009). Digital image processing 
for medical applications. Cambridge 
University Press. This is an excellent, practical 
book on concepts of image processing algo- 
rithms used in medical imaging. 


Dreyer, K. J. (2006). PACS: A guide to the digital 
revolution. New York: Springer. This is also a 
book focused on the practical aspects of 
implementing and maintaining a digital imag- 
ing department. Its format is that of a tradi- 
tional textbook, and covers a broad range of 
topics. 

Liu, Y., & Wang, J. (2011). PACS and Digital 
Medicine. Boca Raton: CRC Press. This book 
goes into greater detail of the technology of 
PACS, and to a lesser degree RIS and 
EMR. This is a very good resource for those 
interested in more details of DICOM and how 
a PACS can be configured to address specific 
needs. 


Questions for Discussion

1. What are the pros and cons of a highly
structured technology like DICOM? 
DICOM has been highly successful in 
terms of adoption as a standard, and 
virtually all image communication 
utilizes it. This differs markedly from 
some other standards. What are factors 
that have contributed to this success, 
and what lessons can be drawn from 
this in terms of how to promote 
adoption of standards in the future? 

2. If one were to design medical imaging 
systems today, would the optimal design 
continue to have PACS and RIS as sepa- 
rate systems, or would they be combined 
into one system? Should these be sepa- 
rate from the EHR? 

3. What are the ways in which radiology 
reports of examination interpretations 
can be generated, and what are the 
advantages and disadvantages of each 
approach, in terms of ease and efficiency 
of report creation, timeliness of availability
of report to clinicians, and usefulness
for retrieval of cases for research and
education? 

4. In these days of high bandwidth and low 
storage costs, is there still a good reason 
to use lossy compression in medical 
imaging? What kinds of trends are likely 
to affect image growth, as part of the 
patient’s medical record? 

5. What are the arguments for maintaining 
raw rather than compressed data (not 
only for imaging data but for compression
or summarization of other types of
data)? 

6. Describe a classification of ways in 
which image data are used in medical 
decision making. 

7. What are the data management 
implications of using a separate 
advanced visualization system for 
clinicians that is distinct from the
PACS used by radiologists for
interpretations? What if the 
radiologists use that system in addition 
to the PACS? 
Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• What types of online knowledge-based
  information are available and useful to
  clinicians, biomedical researchers, and
  consumers?
• What are the major components of the
  information retrieval process?
• What are the major categories of available
  knowledge-based information?
• How do techniques differ for indexing
  various types of knowledge-based
  biomedical information?
• What are the major approaches to
  retrieval of knowledge-based biomedical
  information?
• How effectively do searchers use information
  retrieval systems?
• What are the important research directions
  in information retrieval?
• What are the major challenges to making
  digital libraries effective for health
  and biomedical users?


23.1 Introduction 
Information retrieval (IR), sometimes called 
search, is the field concerned with the acquisi- 
tion, organization, and searching of 
knowledge-based information (Hersh 2020). 
Although biomedical IR has traditionally 
concentrated on the retrieval of text from the 
biomedical literature, the domain over which 
IR can be effectively applied has broadened 
considerably with the advent of multimedia 
publishing and vast storehouses of images, 
video, chemical structures, gene and protein 
sequences, and a wide range of other digital 
material and artifacts of relevance to biomed- 
ical education, research, and patient care. 
With the proliferation of IR systems and 
online content, the notion of the library has 
changed substantially, and new digital librar- 
ies have emerged (Lindberg and Humphreys 
2005). 

IR systems and digital libraries histori- 
cally existed to store and disseminate 
knowledge-based information. What exactly 
does that mean? Although there are many
ways to classify biomedical information and 
the informatics applications that use them, in 
this chapter we will broadly divide them into 
two categories. Patient-specific information 
applies to individual patients. Its purpose is to
document, and increasingly to support analysis
of, the health and disease of a patient for health
care providers, administrators, and researchers.
This information historically came from the 
patient’s medical record but now can come 
from many different sources, including mobile 
and wearable devices. The second category of 
biomedical information is knowledge-based 
information. This is information that has been 
derived and organized from observational or 
experimental research. In the case of clinical 
research, this information provides clinicians, 
administrators, and researchers with knowl- 
edge derived from experiments and observa- 
tions, which can then be applied to individual 
patients. This information has historically 
been provided in books and journals but can 
take a wide variety of other forms, including 
clinical practice guidelines, consumer health 
literature, Web sites, and so forth. The distinc- 
tion between these two types of information is 
blurred by the growing amount of data that 
comes from people and is used to derive 
knowledge. 

A basic overview of the IR process is
shown in Fig. 23.1 and forms the basis for
most of this chapter. The overall goal of IR or
search is to find content that meets a person’s
information needs. This is done by posing a
query to the IR system. A search engine
matches the query to content items through
metadata, which is “data about data” that
describes the content items (Foulonneau and
Riley 2008). There are two intellectual processes
of IR. Indexing is the process of assigning
metadata to content items, while retrieval
is the process of the user entering his or her
query and retrieving content items.

[Figure 23.1: diagram with elements labeled Metadata, Content, and Search engine]
Fig. 23.1 Basic overview of the information retrieval
process. Retrieval is made possible via metadata, which
is produced via indexing and applied in queries by users.
The metadata is used by the search engine, which directs
the user to the content
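The two processes just described, indexing and retrieval, can be illustrated with a very small inverted index. This is a sketch under simplifying assumptions: the metadata are just the lowercase words of each title, every query term must match, and the three titles are invented for the example; production search engines do considerably more.

```python
from collections import defaultdict

# Invented content items: {identifier: title}
documents = {
    1: "Aspirin for prevention of myocardial infarction",
    2: "Imaging of myocardial perfusion",
    3: "Aspirin and stroke prevention",
}


def terms(text):
    """Very simple metadata: the lowercase words of the text."""
    return text.lower().split()


def build_index(docs):
    """Indexing: map each term to the set of items it describes."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in terms(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Retrieval: return IDs of items matching every term in the query."""
    postings = [index.get(term, set()) for term in terms(query)]
    return sorted(set.intersection(*postings)) if postings else []


index = build_index(documents)
print(search(index, "aspirin prevention"))
print(search(index, "myocardial"))
print(search(index, "aspirin imaging"))
```

The index is built once, ahead of time, so answering a query requires only looking up and intersecting a few sets rather than scanning every content item.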


23.2 Evolution of Biomedical 
Information Retrieval 


As with many chapters in this volume, IR has 
changed substantially over the five editions of 
this book. In the first edition, this chapter was 
titled “Bibliographic-Retrieval Systems,”
reflecting the predominant type of knowledge 
that was accessible at the time. The second edi- 
tion saw the emergence of the World Wide 
Web (WWW or Web) as a delivery mechanism 
for knowledge-based information. In the third 
edition, “Digital Libraries” was added to the 
chapter name, reflecting that the entire bio- 
medical library and beyond was now part of 
available online knowledge. The fourth edition 
reflected the ubiquitous nature of information 
on computers, smartphones, tablets, and other 
devices. This fifth edition recognizes that digital
has become the primary medium for data,
information, and knowledge, i.e., even
source of most information in modern times is 
digital. Essentially all articles, books, patient 
records, etc. are primarily in digital form and 
mainly printed for the convenience of reading. 

Although this chapter focuses on the use 
of computers to facilitate IR, methods for 
finding and retrieving information from medi- 
cal sources have been in existence for nearly a 
century and a half. In 1879 Dr. John Shaw 
Billings created Index Medicus to help medi- 
cal professionals find relevant journal articles 
(DeBakey 1991). Journal article citations were 
indexed by author name(s) and subject 
heading(s) and then aggregated in bound vol- 
umes. A scientist or practitioner seeking an 
article on a topic could manually search the
index for the single best-matching subject
heading and then be directed to citations of 
published articles. 

The printed Index Medicus served as the 
main biomedical IR source until 1971, when 
the National Library of Medicine (NLM)1 
unveiled an electronic version, the Medical 
Literature Analysis and Retrieval System 
(MEDLARS), which had been cataloging 
bibliographic records since 1966 (Miles 1982). 
Because computing power and disk storage 
were very limited, MEDLARS and its follow-on,
MEDLARS Online (MEDLINE), stored
only limited information for each article, such
as author name(s), article title, journal source, 
and publication date. In addition, the NLM 
assigned to each article a number of terms 
from its Medical Subject Headings (MeSH) 
vocabulary. To search, users had
to mail a paper search form to the NLM
and received results a few weeks later.
Only librarians who had completed a special- 
ized course were allowed to submit searches. 

As computing power grew and disk stor- 
age became more plentiful in the 1980s, full- 
text databases began to emerge. These new 
databases allowed searching of the entire text 
of medical documents. Although lacking 
graphics, images, and tables from the original 
source, these databases made it possible to 
retrieve the full text of important documents 
quickly from remote locations. Likewise, with 
the growth of computer networks, end users 
were now allowed to search the databases 
directly, though at a substantial cost. 

In the early 1990s, the pace of change in 
the IR field quickened. The advent of the 
Web and the exponentially increasing power 
of computers and networks brought a world 
where vast quantities of medical information 
from multiple sources with various media 
extensions were now available over the global 
Internet (Berners-Lee et al. 1994). In the late 
1990s, the NLM made all of its databases 
available to the entire world for free. Also 
during this time, the notion of digital librar- 
ies developed, with the recognition that the 
entire array of knowledge-based information
could be accessed using this technology 
(Borgman 1999). 

Into the twenty-first century, use of IR 
systems and digital libraries has become ubiq- 
uitous. Estimates vary, but among individuals 
who use the Internet in the United States, over 
80% have used it to search for information 
relevant to their own health or that of an 
acquaintance (Fox 2011; Taylor 2010). 
Virtually all physicians use the Internet 
(Anonymous 2012). Furthermore, access to 
systems has gone beyond the traditional per- 
sonal computer and extended to new devices, 
such as smartphones and tablet devices. 


23.3 Knowledge-Based 
Information in Health 
and Biomedicine 


Knowledge-based information can be subdi- 
vided into two categories. Primary knowl- 
edge-based information (also called primary 
literature) is original research that appears in 
journals, books, reports, and other sources. 
This type of information reports the initial 
discovery of health knowledge, usually with 
either original data or reanalysis of data (e.g., 
systematic reviews, sometimes with meta- 
analysis). Secondary knowledge-based infor- 
mation consists of the writing that reviews, 
condenses, and/or synthesizes the primary lit- 
erature. The most common examples of this 
type of literature are books, monographs, and 
review articles. Secondary literature includes 
the growing quantity of patient/consumer-oriented
health information that is increasingly
available via the Web. It also encompasses
opinion-based writing (such as editorials and 
position or policy papers), clinical practice 
guidelines, narrative reviews, and health infor- 
mation on Web pages. In addition, it includes 
the plethora of pocket-sized manuals that 
were formerly a staple for practitioners in 
many professional fields. As will be seen later, 
secondary literature is the most common type 
of literature used by physicians. 

Libraries have been the historical place 
where knowledge-based information has been 
stored. Libraries actually perform a variety of 
functions, including the following: 

• Acquisition and maintenance of collections 
• Cataloging and classification of items in 
  collections to make them more accessible 
  to users 
• Serving as a place where individuals can 
  get assistance with seeking information, 
  including information on computers 
• Providing work or study space (particularly 
  in universities) 


Digital libraries provide some of the same ser- 
vices, but their focus tends to be on the digital 
aspects of content. 


23.3.1 Information Needs 
and Information Seeking 


Different users of knowledge-based information 
have differing needs based on the nature 
of their information need and available 
resources. The information needs and information 
seeking of physicians have been most 
extensively studied. Gorman defined four 
states of information need in the clinical context 
(Gorman and Helfand 1995): 

• Unrecognized need: clinician unaware of 
  information need or knowledge deficit 
• Recognized need: clinician aware of need 
  but may or may not pursue it 
• Pursued need: information seeking 
  occurs but may or may not be successful 
• Satisfied need: information seeking 
  successful 


There is a great deal of evidence that the 
majority of information needs are not being 
satisfied and that IR applications may help. 
Among the reasons that physicians do not 
adhere to the most up-to-date clinical prac- 
tices is that they often do not recognize that 
their knowledge is incomplete. While this is 
not the only reason for such practices, the evidence 
is compelling. For example, physicians 
do not always provide patients with the most 
up-to-date care (McGlynn et al. 2003), do not 
adhere to established guidelines (Diamond 
and Kaul 2008), and vary widely in how they 
provide care (Wennberg 2010). 

Studies from the late twentieth century 
found that when physicians recognized an 
information need, they were likely to pursue 
only a minority of unanswered questions. 
These studies found that physicians in prac- 
tice had unmet information needs on the 
order of two questions for every three patients 
seen and only pursued answers for about 30% 
of these questions (Covell et al. 1985; Ely 
et al. 1999; Gorman and Helfand 1995). When 
answers to questions were actually pursued, 
these studies showed that the most frequent 
source for answers to questions was col- 
leagues, followed by paper-based textbooks. 
Therefore, it is not surprising that barriers to 
satisfying information needs remain (Ely et al. 
2002). Physicians now use electronic sources 
more than was measured in these earlier studies, 
given the widespread use of the electronic 
health record (EHR) and the ubiquity of 
smartphones and tablets, although fewer recent 
studies have assessed their information needs. 
Another approach to facilitating access to 
knowledge-based information has been to link 
it more directly with the context of the patient 
in the EHR (Cimino and Del Fiol 2007). 
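
To give a sense of scale, the rates reported in the studies above can be 
worked through for a single day of practice. The following is only an 
illustrative sketch: the 30-patient day is a hypothetical figure chosen 
here, and the two rates are the approximate values quoted above (two 
questions per three patients, about 30% pursued). 

```python
# Illustrative arithmetic only; the two rates are the approximate
# figures quoted in the text, and the patient count is hypothetical.
QUESTIONS_PER_PATIENT = 2 / 3   # about two questions for every three patients
PURSUIT_RATE = 0.30             # about 30% of questions were pursued


def unmet_questions(patients_seen: int) -> tuple[float, float, float]:
    """Return (questions raised, questions pursued, questions never pursued)."""
    raised = patients_seen * QUESTIONS_PER_PATIENT
    pursued = raised * PURSUIT_RATE
    return raised, pursued, raised - pursued


raised, pursued, unpursued = unmet_questions(30)  # hypothetical clinic day
print(f"{raised:.0f} raised, {pursued:.0f} pursued, {unpursued:.0f} never pursued")
```

Under these assumptions, roughly 14 of the 20 questions arising in such a 
day would never be pursued at all. 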

The information needs of other users have 
been less well-studied. For consumers, surveys 
found about 80% of all Internet users searched 
for personal health information (Fox and 
Duggan 2013). The most common type of 
search focuses on a specific disease or medical 
problem (66% of all who have searched), fol- 
lowed by a specific medical treatment or pro- 
cedure (56%). Consumers also use the Web to 
search for physicians, health care institutions, 
and health insurance. Even less studied have 
been the information needs of researchers, but 
one recurrent finding is the idiosyncratic 
nature of their use of IR and other systems 
(Bartlett and Toms 2005). 


23.3.2 Changes in Publishing 


The Internet and the Web have had a profound 
impact on the publishing of knowledge-based 
information. The technical impediments 
to electronic publishing of journals have been 
overcome, such that virtually all scientific 
journals are published electronically now. A 
modern Internet connection is sufficient to 
deliver most of the content of journals. When 
available in electronic form, journal content is 
easier and more convenient to access. 
Furthermore, since most scientists have the 
desire for widespread dissemination of their 
work, they have incentive for their papers to 
be available electronically. 

The technical challenges to electronic 
scholarly publication have been replaced by 
economic and political ones (Hersh and 
Rindfleisch 2000; Sox 2009). Printing and 
mailing, tasks no longer needed in electronic 
publishing, comprised a significant part of the 
“added value” from publishers of journals. 
There is still, however, value added by publishers, 
such as hiring and managing editorial 
staff to produce the journals, and managing 
the peer review process. Even if publishing 
companies, as they currently exist today, were 
to vanish, there would still be some cost to the 
production of journals. Thus, while the cost 
of producing journals electronically is likely 
to be less, it is not zero, and even if journal 
content is distributed “free,” someone has to 
pay the production costs. The economic issue 
in electronic publishing, then, is who is going 
to pay for the production of journals (Sox 
2009). This introduces some political issues as 
well. One of them centers around the concern 
that much research is publicly funded through 
grants from federal agencies such as the 
National Institutes of Health (NIH) and the 
National Science Foundation (NSF). In the 
current system, especially in the biomedical 
sciences (and to a lesser extent in other sci- 
ences), researchers turn over the copyright of 
their publications to journal publishers. The 
political concern is that the public funds the 
research and the universities carry it out, but 
individuals and libraries then must buy it back 
from the publishers to whom they willingly 
cede the copyright. This problem is exacer- 
bated by the general decline in funding for 
libraries. 

One solution to this problem has been the 
emergence of open-access (OA) publishing. 
The premise of the OA model is that the 
content is made freely available electronically, 
with the costs of production covered by other 
funding sources (Frank 2013). The most com- 
mon funding source is the research funder, 
and most funders consider OA publishing 
costs to be an allowable expense on research 
grants. While some have expressed concern 
that OA may give rise to financial incentives 
for excess publishing, many OA journals have 
over a decade of experience demonstrating 
traditional peer review is compatible with OA 
publishing. Another concern has been the 
ability of resource-poor scientists to afford 
OA publishing, although most OA journals 
have “hardship” policies that allow waiver of 
the publication fee. 

Two models of OA publishing have 
emerged (Frank 2013). One is OA 
Gold, where the author (usually through the 
research funding) pays the cost of production. 
Publishing charges are typically only a tiny 
fraction of the overall cost of research, esti- 
mated to be about 0.3% (Zerhouni 2004). Two 
early publishers operating under this model 
were BioMed Central (BMC, now owned by 
Springer)² and the Public Library of Science 
(PLoS).³ Many commercial publishers now 
offer authors the ability to publish under OA, 
and some journals have developed “Open” sis- 
ter journals, such as JAMA, BMJ, and 
JAMIA. 

A second model is OA Green, where 
authors are required to deposit the manuscript, 
either the published manuscript or the 
last draft of the manuscript prior to typesetting, 
into public repositories such as PubMed 
Central (PMC).⁴ PMC is a repository of life 
science research that provides free access while 
allowing publishers to maintain copyright 
and even optionally keep the papers housed 
on their own servers. A lag time of up to 
6 months is allowed so that journals can reap 
the revenue that comes with initial publication. 
The National Institutes of Health 
(NIH)⁵ now requires all research funded by 


2 https://www.biomedcentral.com/ 
3 https://www.plos.org/ 
4 https://www.ncbi.nlm.nih.gov/pmc/ 
5 https://www.nih.gov/ 

its grants to be submitted to PMC, either in 
the form published by publishers or as a PDF 
of the last manuscript prior to journal acceptance. 
Publishers have countered that holding 
copyright gives journals more control over the 
integrity of the papers they publish (Drazen 
and Curfman 2004). An alternative approach, 
advocated by non-commercial (professional 
society) publishers, is the Washington DC 
Principles for Free Access to Science,⁷ which 
advocates: 
• Reinvestment of revenues in support of 
  science. 
• Use of open archives such as PMC as 
  allowed by business constraints. 
• Commitment to some free publication, 
  access by low-income countries, and no 
  charges to publish. 


One adverse outcome of OA publishing has 
been the emergence of so-called predatory 
journals (Haug 2013). These journals exist 
mainly to make money, allowing authors to 
publish in impressive-sounding titles with lit- 
tle or no peer review. Predatory journals offer 
inexpensive publishing and send massive 
emails to academic faculty members offering 
to publish or even serve on editorial boards. 
Some authors have exposed the process by 
writing clearly fake papers (McCool 2017), 
while others have issued calls for efforts to 
stop it (Moher and Moher 2016). 

Some consider OA to be part of a larger 
“open science” movement, consisting of 
(Anonymous 2018b): 
• Open data: all data collected in research 
• Open source: all software code developed 
  and used 
• Open methodology: clear and detailed 
  description; availability of all surveys and 
  other tools 
• Open peer review: all comments of peer 
  reviewers 


The move to predominant electronic publica- 
tion of science has led many to advocate for 
growing access to the underlying data from 


6 https://publicaccess.nih.gov/ 
7 http://www.dcprinciples.org/ 

research studies. While this has been a standard 
practice in genomics and related areas 
for many years (e.g., depositing genome 
sequences in GenBank as a condition of publication), 
there have been concerns that have 
impeded this approach in clinical studies. 

The case for requiring publication of scien- 
tific data is strong. As most research is taxpayer- 
funded, it only seems fair that those who paid 
are entitled to all the data for which they paid. 
Likewise, all of the subjects were real people 
who potentially took risks to participate in the 
research, and their data should be used for dis- 
covery of knowledge to the fullest extent pos- 
sible. In addition, new discoveries may emerge 
from re-analysis of data. For these reasons, the 
International Committee of Medical Journal 
Editors (ICMJE) called for de-identified data 
from randomized controlled trials to be shared 
as a condition of publication (Taichman et al. 
2017). Others have called for data access to be 
FAIR (findable, accessible, interoperable, and 
reusable) (Wilkinson et al. 2016). 

Some researchers, however, have pushed 
back on this notion. They argue that those 
who carry out the work of designing, imple- 
menting, and evaluating experiments certainly 
have some exclusive rights to the data gener- 
ated by their work. Some also question 
whether the cost is a good expenditure of lim- 
ited research dollars, especially since the 
demand for such data sets may be modest and 
the benefit is not clear. One group of 282 
researchers in 33 countries, the International 
Consortium of Investigators for Fairness in 
Trial Data Sharing, notes that there are risks, 
such as misleading or inaccurate analyses as 
well as efforts aimed at discrediting or under- 
mining the original research (Anonymous 
2016). They also express concern about the 
costs, given that there are over 27,000 RCTs 
performed each year. As such, this group calls 
for an embargo on reuse of data for 2 years 
plus another half-year for each year of the 
length of the RCT. Even those who support 
data sharing point out the requirement for 
proper curation, wide availability to all 
researchers, and appropriate credit to and 
involvement of those who originally obtained 
the data (Merson et al. 2016). 
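
The embargo proposed by the consortium is a simple linear rule: 2 years 
plus half a year for each year the trial ran. A minimal sketch of that 
arithmetic follows; the function name is chosen here for illustration and 
does not come from the consortium's statement. 

```python
def embargo_years(trial_length_years: float) -> float:
    """Proposed embargo: 2 years plus half a year per year of trial length."""
    if trial_length_years < 0:
        raise ValueError("trial length cannot be negative")
    return 2.0 + 0.5 * trial_length_years


# A hypothetical 3-year trial would be embargoed for 3.5 years under this rule.
for length in (1, 3, 5):
    print(f"{length}-year RCT: embargo of {embargo_years(length)} years")
```

Under this rule a 5-year trial would have its data withheld from reuse for 
4.5 years after publication. 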


23.3.3 Quality of Information 


In the early days of the growth of the Internet 
and the Web, another concern was the quality 
of information available. A large fraction of 
Web-based health information has always 
been aimed at nonprofessional audiences. 
Many lauded this development as empower- 
ing those most directly affected by health 
care—those who consumed it (Eysenbach 
et al. 1999). Others expressed concern about 
patients misunderstanding or being purposely 
misled by incorrect or inappropriately inter- 
preted information (Jadad 1999). Some clini- 
cians also lamented the growing amount of 
time required to go through stacks of print- 
outs downloaded by patients and brought to 
their offices. The Web was inherently demo- 
cratic, allowing anyone to post information. 
However, this was potentially at odds with the 
operation of a professional field, particularly 
one like health care, where practitioners were 
ethically bound and legally required to adhere 
to the highest standard of care. Thus, a major 
concern with health information on the Web 
is the presence of inaccurate or out-of-date 
information. An early systematic review of 
studies assessing the quality of health infor- 
mation found that 55 of 79 studies came to 
the conclusion that quality of information 
was a problem (Eysenbach et al. 2002). More 
recent studies continue to show the variable 
quality of health information on the Web 
(Kitchens et al. 2014). 

A more recent problem, extending beyond 
health care, has been the proliferation of 
“fake news” and “alternative facts,” often pro- 
mulgated by those understanding how to 
manipulate search engines and social media 
(Vosoughi et al. 2018; Wenzel 2017). A related 
area is “adversarial” IR, where the goal is to 
not retrieve information the user does not 
want to see or should not see (Castillo and 
Davison 2011). One major research organization 
has lamented the larger societal impacts 
of such “truth decay,” citing “erosion of 
civil discourse, political paralysis, alienation 
and disengagement of individuals from politi- 
cal and civic institutions” (Kavanagh and 
Rich 2018). 

The impact of poor-quality health information 
is unclear. People were harmed by 
incorrect and misleading health information 
long before the emergence of digital informa- 
tion. One well-known self-help expert argued 
that patients and consumers actually are savvy 
enough to understand the limits of quality of 
information on the Web. This view held that 
patients and consumers should be trusted to 
discern quality using their own abilities to 
consult different sources of information and 
to communicate with health care practitioners 
and with others who share their condition(s) 
(Ferguson 2002). Indeed, the ideal situation 
may be a partnership among patients and 
their health care practitioners, as it has been 
shown that patients desire that their practitio- 
ners be the primary source of recommenda- 
tions for online information (Tang et al. 1997). 

This concern about quality of information 
led a number of individuals and organizations 
to develop guidelines for assessing the quality 
of health information. One of the earliest and 
most widely quoted set of criteria was pub- 
lished in JAMA (Silberg et al. 1997). These 
criteria stated that Web pages should contain 
the name, affiliation, and credentials of the 
author; references to the claims made; explicit 
listing of any perceived or real conflict of 
interest; and date of most recent update. 
Another early set of criteria was the Health on 
the Net (HON)⁸ codes, a set of voluntary 
codes of conduct for health-related Web sites. 
Sites that adhere to the HON codes can display 
the HON logo. Another approach to 
ensuring Web site quality is accreditation by a 
third party. URAC (formerly the Utilization 
Review Accreditation Commission) has a process 
for such accreditation.⁹ The URAC standards 
cover six general issues: health content 
editorial process, disclosure of financial rela- 
tionships, linking to other Web sites, privacy 
and security, consumer complaint mecha- 
nisms, and internal processes required to 
maintain quality over time. 
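
The four JAMA criteria lend themselves to a simple checklist. The sketch 
below is illustrative only: it assumes that a page's metadata has already 
been extracted into a dictionary, and the short key names are labels 
chosen for this example rather than part of any published tool. 

```python
# The four criteria described in the text (Silberg et al. 1997).
# The short key names are labels assumed for this sketch.
SILBERG_CRITERIA = {
    "authorship": "name, affiliation, and credentials of the author",
    "attribution": "references to the claims made",
    "disclosure": "explicit listing of any perceived or real conflict of interest",
    "currency": "date of most recent update",
}


def missing_criteria(page_metadata: dict) -> list[str]:
    """Return the criteria for which the page supplies no information."""
    return [name for name in SILBERG_CRITERIA if not page_metadata.get(name)]


# Hypothetical page that names its author and update date only.
page = {"authorship": "J. Doe, MD, Example University", "currency": "2019-03-01"}
for name in missing_criteria(page):
    print(f"Missing {name}: {SILBERG_CRITERIA[name]}")
```

A checklist of this kind only detects whether the information is present; 
it says nothing about whether the content itself is accurate. 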


8 https://www.hon.ch/en/ 
9 https://www.urac.org/programs/health-web-site-accreditation 




23.3.4 Evidence-Based Medicine 


The growing quantity of clinical information 
available in IR systems and digital libraries 
requires new approaches to select that which 
is best to use for clinical decisions. The philosophy 
guiding this approach is evidence-based 
medicine (EBM), which can be viewed as a 
set of tools to inform clinical decision-making. 
It allows clinical experience (“art”) to be inte- 
grated with best clinical science (Guyatt et al. 
2014, 2015). Also, EBM makes the medical 
literature more clinically applicable and rele- 
vant. In addition, it requires the user to be fac- 
ile with computers and IR systems. The 
process of EBM involves three general steps: 
• Phrasing a clinical question that is pertinent 
  and answerable. 
• Identifying evidence (studies in articles) 
  that address the question. 
• Critically appraising the evidence to determine 
  whether it applies to the patient. 


The phrasing of the clinical question is an 
often-overlooked portion of the EBM pro- 
cess. There are two general types of clinical 
question: background questions and fore- 
ground questions (Guyatt et al. 2014, 2015). 
Background questions ask for general knowl- 
edge about a disorder, whereas foreground 
questions ask for knowledge about managing 
patients with a disorder. Background ques- 
tions are generally best answered with text- 
books and classical review articles, whereas 
foreground questions are answered using 
EBM techniques. There are four major fore- 
ground question categories: 

• Therapy (or intervention): benefit of 
  treatment or prevention. 
• Diagnosis: test diagnosing disease. 
• Harm: detrimental health effects of a disease, 
  environmental exposure (natural or 
  man-made), or medical intervention. 
• Prognosis: outcome of disease course. 


Identifying evidence involves selecting the 
best evidence for a given type of question. 
EBM proponents advocate, for example, that 
randomized controlled trials or a systematic 
review (with or without meta-analysis) that 
combines multiple trials provide the best evidence 
for or against particular health care 
interventions. Likewise, diagnostic test accuracy 
is best assessed with comparison to a 
known gold standard in an appropriate spectrum 
of patients to whom the test will be 
applied (see Chap. 3). Questions of harm 
can be answered by randomized controlled 
trials when it is ethical to do so; otherwise 
they are best answered with observational 
case-control or cohort studies. There are 
checklists of attributes for these different 
types of studies that allow their critical 
appraisal and applicability to a given patient 
in the EBM resources described above. 
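
The pairing of question type with preferred evidence described above can 
be written as a small lookup table. This is a sketch for illustration 
only; it encodes just the designs named in this section, so prognosis 
questions, for which no design is stated here, are deliberately left out. 

```python
# Preferred evidence by foreground question category, as stated in the text.
# Prognosis is omitted because the text above names no design for it.
PREFERRED_EVIDENCE = {
    "therapy": [
        "randomized controlled trial",
        "systematic review (with or without meta-analysis) of multiple trials",
    ],
    "diagnosis": [
        "comparison to a known gold standard in an appropriate spectrum of patients",
    ],
    "harm": [
        "randomized controlled trial (when ethical)",
        "observational case-control or cohort study",
    ],
}


def preferred_evidence(category: str) -> list[str]:
    """Look up the study designs the text recommends for a question category."""
    key = category.strip().lower()
    if key not in PREFERRED_EVIDENCE:
        raise KeyError(f"no study design stated in the text for: {category}")
    return PREFERRED_EVIDENCE[key]


print(preferred_evidence("Harm"))
```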

The original approach to EBM has evolved 
over time, with less emphasis on critically 
appraising original evidence and more on syn- 
thesized evidence being made readily available 
to clinicians, usually through electronic 
sources, including clinical decision support 
systems (see Chap. 26) (DiCenso et al. 
2009; Hersh 1999). There have also been a 
number of criticisms of EBM, with argu- 
ments that it may ignore clinical expertise 
(Haynes et al. 2002), patient values (Guyatt 
et al. 2004), and other ways of “knowing” 
(Sim 2016). Others express concerns that its 
methods have been “distorted” (Greenhalgh 
et al. 2014), “hijacked” by commercial forces 
and other self-interest (Ioannidis 2016a, 
2017), and subverted by the political process 
(Patashnik et al. 2017). A final criticism is the 
proliferation of systematic reviews, not all of 
which may be motivated by objective science 
(Ioannidis 2016b). 


23.4 Content of Knowledge-Based 
Information Resources 


The previous sections of this chapter have 
described some of the issues and concerns 
surrounding the production and use of 
knowledge-based information in biomedicine. 
It is useful to classify the information to gain 
a better understanding of its structure and 
function. In this section, we classify content 
into bibliographic, full-text, annotated, and 
aggregated categories, although some content 
does not neatly fit within them. 


23.4.1 Bibliographic Content 


The first category consists of bibliographic 
content. It includes what was for decades the 
mainstay of IR systems: literature reference 
databases. Also called bibliographic data- 
bases, this content consists of citations or 
pointers to the medical literature (i.e., journal 
articles). The best-known and most widely 
used biomedical bibliographic database is 
MEDLINE, which contains bibliographic ref- 
erences to all of the biomedical articles, edito- 
rials, and letters to the editors in approximately 
5000 scientific journals. The journals are cho- 
sen for inclusion by an advisory committee of 
subject experts convened by NIH. At present, 
over 800,000 references are added to 
MEDLINE yearly. It contained over 24 million 
references by the end of 2017.¹⁰ 

The current MEDLINE record contains 
over 60 fields.¹¹ A clinician may be interested 
in just a handful of these fields, such as the 
title, abstract, and indexing terms. But other 
fields contain specific information that may be 
of great importance to other audiences. The 
Supplementary Information (SI) field con- 
tains links to records in many different data 
banks, from clinical trials registries to genomics 
and other “omics” databases.¹² Even the 
clinician may, however, derive benefit from 
some of the other fields. For example, the 
Publication Type (PT) field can help in the 
application of EBM, such as when one is 
searching for a practice guideline or a ran- 
domized controlled trial. 
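
To make the field structure concrete, the sketch below parses a record in 
the tagged text layout that PubMed offers for MEDLINE export, in which each 
line starts with a field tag padded to four characters, followed by a 
hyphen and a space, and continuation lines are indented. The record shown 
is invented for illustration (the PMID and title are not real), and only 
three of the many fields appear. 

```python
from collections import defaultdict

# Invented record in MEDLINE tagged layout; PMID and title are not real.
SAMPLE_RECORD = (
    "PMID- 00000000\n"
    "TI  - A hypothetical trial of an example intervention for an\n"
    "      example condition.\n"
    "PT  - Journal Article\n"
    "PT  - Randomized Controlled Trial\n"
)


def parse_medline(text: str) -> dict[str, list[str]]:
    """Group the values of a tagged record by field tag (fields may repeat)."""
    fields = defaultdict(list)
    tag = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[:4].strip() and line[4:6] == "- ":
            # New field: tag in columns 1-4, value after the "- " separator.
            tag = line[:4].strip()
            fields[tag].append(line[6:].strip())
        elif tag is not None:
            # Indented continuation of the previous field's value.
            fields[tag][-1] += " " + line.strip()
    return dict(fields)


record = parse_medline(SAMPLE_RECORD)
is_rct = "Randomized Controlled Trial" in record.get("PT", [])
print(record["PMID"][0], record["TI"][0], "RCT" if is_rct else "not an RCT")
```

Checking the PT values in this way mirrors what a searcher does when 
limiting a search to a particular publication type. 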

MEDLINE records are also assigned 
other identifiers. The PubMed ID (PMID) is a 
unique identifier for records in the database. 
Another identifier in MEDLINE is the 
PubMed Central ID (PMCID), assigned to 
records whose articles have been deposited 
into PMC. MEDLINE also contains an 
Author Identifier (AUID) field, which allows 
three possible identifiers, the most common of 


10 https://www.nlm.nih.gov/bsd/index_stats_comp.html 
11 https://www.nlm.nih.gov/bsd/mms/medlineelements.html 
12 https://www.nlm.nih.gov/bsd/medline_databank_source.html 

which is the ORCID identifier,¹³ a unique 
identifier for scientific authors (e.g., 
0000-0002-4114-5148 for this author). A growing 
number of journals and other publications 
make use of the ORCID. 
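
The final character of an ORCID identifier is a check character, which 
ORCID documents as being computed with the ISO 7064 MOD 11-2 algorithm 
over the preceding 15 digits. The following sketch, with function names 
chosen here for illustration, validates the identifier quoted above. 

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, digits, and check character of a hyphenated ORCID."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15].upper()


print(is_valid_orcid("0000-0002-4114-5148"))  # identifier quoted in the text
```

A check of this kind catches most typing errors but does not confirm that 
the identifier has actually been issued to an author. 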

MEDLINE can be accessed without 
charge via the PubMed system,¹⁴ produced by 
the National Center for Biotechnology 
Information (NCBI)¹⁵ of the NLM. Some 
other information vendors, such as Ovid 
Technologies,¹⁶ license the content of 
MEDLINE and other databases and provide 
value-added services that can be accessed for a 
fee by individuals and institutions. 

The NLM used to offer a number of other, 
more focused bibliographic databases, but 
these have all been folded into MEDLINE. A 
number of other publishers offer biomedical 
bibliographic databases. The major non-NLM 
database for the nursing field is the Cumulative 
Index to Nursing and Allied Health Literature 
(CINAHL),¹⁷ which covers nursing and allied 
health literature, including physical therapy, 
occupational therapy, laboratory technology, 
health education, physician assistants, and 
medical records. Another well-known bibliographic 
database is EMBASE,¹⁸ which is produced 
by the commercial publisher Elsevier 
(Amsterdam, Netherlands). EMBASE contains 
over 28 million records and covers a 
superset of the journals in MEDLINE. The 
additional journals are often important for those 
carrying out systematic reviews and meta-analyses, 
which need access to all the studies published 
across the world. 

A second, more modern type of biblio- 
graphic content is the Web catalog. There are 
increasing numbers of such catalogs, which 
consist of Web pages containing mainly links 
to other Web pages and sites. It should be 
noted that there is a blurry distinction between 
Web catalogs and aggregations (the fourth 


13 https://orcid.org/ 
14 https://pubmed.gov 
15 https://www.ncbi.nlm.nih.gov/ 
16 http://ovid.com/site/index.jsp 
17 https://health.ebsco.com/products/the-cinahl-database 
18 https://www.elsevier.com/solutions/embase-bio- 

category; see Sect. 23.4.4, below). In general, 
the former contains only links to other 
pages and sites, while the latter includes actual 
content that is highly integrated with other 
resources. Some well-known Web catalogs 
include: 

• HON Select¹⁹: a European catalog of 
  quality-filtered, clinician-oriented Web 
  content from the HON foundation. 
• Translating Research into Practice 
  (TRIP)²⁰: a database of content deemed 
  to meet high standards of EBM. 


A specialized registry specific to healthcare 
was the National Guideline Clearinghouse 
(NGC). Produced by the Agency for 
Healthcare Research and Quality (AHRQ), it 
contained exhaustive information about clinical 
practice guidelines. In 2018, AHRQ shut 
down the NGC (Munn and Qaseem 2018). 
The original contractor developing the NGC 
was the nonprofit research firm ECRI, which 
developed a new site, ECRI Guidelines Trust.²¹ 

A final kind of bibliographic-like content 
consists of RSS feeds (originally RDF Site 
Summary, often dubbed “Really Simple 
Syndication”), which are short summaries of 
Web content, typically news, journal articles, 
blog postings, and other content. Users set up 
an RSS aggregator, which can be through a 
Web browser, email client, or standalone software, 
configured for the RSS feed desired, 
with an option to add a filter for specific con- 
tent. There are two versions of RSS (1.0 and 
2.0) but both provide: 
• Title: name of item 
• Link: URL to content 
• Description: a brief description of the 
  content 


23.4.2 Full-text Content 


The second type of content is full-text content. 
A large component of this content originally 


19 https://www.hon.ch/HONselect 
20 https://www.tripdatabase.com 
21 https://guidelines.ecri.org/ 

consisted of the online versions of books 
and periodicals. As already noted, just about 
all traditionally paper-based medical content, 
from journals to textbooks, is now available 
electronically. The electronic versions usually 
have identical content to paper versions but 
may be enhanced by measures ranging from 
the provision of supplemental data in a jour- 
nal article to linkages and multimedia content 
in a textbook. The final component of this 
category is the Web site. Admittedly, the diver- 
sity of information on Web sites is enormous, 
and sites may include every other type of con- 
tent described in this chapter. However, in the 
context of this category, “Web site” refers to 
the vast number of static and dynamic Web 
pages at a discrete Web location. 

One of the fields in MEDLINE is the uni- 
form resource locator (URL) for the publish- 
er’s full text of the article, allowing linkage 
directly from the bibliographic database to the 
full text. This link is active when the PubMed 
record is displayed, but users may be met by a 
password screen if the article is not available 
for free. Many sites offer both subscriber 
access and a pay-per-view facility. Many academic 
organizations now maintain large 
numbers of subscriptions to journals available 
to faculty, staff, and students. Other publish- 
ers, such as Ovid, provide access within their 
own password-protected interfaces to articles 
from journals that they have licensed for use 
in their systems. 

The most common secondary literature 
source is textbooks, almost all of which are 
available in electronic form. A common 
approach with textbooks is bundling them, 
sometimes with linkages across the bundled 
texts. An early bundler of textbooks was Stat!Ref²² 
that, like many, began as a CD-ROM 
product and then moved to the Web. Most 
other large publishers have now similarly 
aggregated their libraries of textbooks and 
other content. Another collection of textbooks 
is the NCBI Bookshelf,²³ which contains 
many volumes on biomedical research 
topics. Initially published by NCBI but now a 


22 http://statref.com/ 
23 https://www.ncbi.nlm.nih.gov/books 

standalone reference is Online Mendelian 
Inheritance in Man (OMIM),²⁴ which is continually 
updated with new information about 
the genomic causes of human disease. 

Electronic textbooks offer additional fea- 
tures beyond text from the print version. 
While many print textbooks do feature high- 
quality images, electronic versions offer the 
ability to have more pictures and illustrations. 
They also have the ability to use sound and 
video, although few do at this time. As with 
full-text journals, electronic textbooks can 
link to other resources, including journal ref- 
erences and the full articles. Many Web-based 
textbook sites also provide access to continu- 
ing education self-assessment questions and 
medical news. Finally, electronic textbooks let 
authors and publishers provide more frequent 
updates of the information than is allowed by 
the usual cycle of print editions, where new 
versions come out only every few years. 

As noted above, Web sites are another 
form of full-text information. Probably the 
most effective provider of Web-based health 
information is the U.S. government. Not only 
do they produce bibliographic databases, but 
the NLM, AHRQ, the National Cancer 
Institute (NCI), Centers for Disease Control 
and Prevention (CDC), and others have also 
been innovative in providing comprehensive 
full-text information for health care providers 
and consumers. One example is the popular 
CDC Travel site.²⁵ 
Some of these will be described later as aggre- 
gations, since they provide many different 
types of resources. 

A large number of commercial biomedical 
and health Web sites have emerged in recent 
years. On the consumer side, they include 
more than just collections of text; they also 
include interaction with experts, online stores, 
and catalogs of links to other sites. Some 
well-known examples include Mayo Clinic²⁶ and 
WebMD.²⁷ There are also Web sites, either 
from medical professional societies or compa- 
nies, which provide information geared toward 
health care providers, typically overviews of 
diseases, their diagnosis, and treatment; medi- 
cal news and other resources for providers are 
often offered as well. 

Other sources of on-line health-related 
content include encyclopedias, the so-called 
body of knowledge (BOK; the complete set of 
concepts, terms and activities that make up a 
professional domain), and Weblogs or blogs. 
A well-known online encyclopedia with a
great deal of health-related information is
Wikipedia, which features a distributed
authorship process, whose content has been
found to be reliable (Giles 2005; Nicholson 2006),
and which frequently shows up near the top in
health-related Web searches (Laurent and
Vickers 2009). A growing number of organizations
have a body of knowledge, such as the
American Health Information Management
Association (AHIMA). Blogs tend to carry a
stream of consciousness, but high-quality
information is often posted within them.


23.4.3 Annotated Content 


The third category consists of annotated content.
These resources are usually not stored as
freestanding Web pages but instead are often
housed in database management systems.
This content can be further subcategorized
into discrete information types:

= Image databases—collections of images 
from radiology, pathology, and other areas 

= Genomics databases—information from 
gene sequencing, protein characterization, 
and other genomic research 

= Citation databases—bibliographic link- 
ages of scientific literature 

= EBM databases—highly structured collec- 
tions of clinical evidence 

= Other databases—miscellaneous 
other collections


A great number of biomedical image databases
are available on the Web. Some examples
from the NLM include:

= Visible Human Project — collection of
three-dimensional representations of normal
male and female bodies, consisting of
cross-sectional slices of cadavers, with sections
of 1 mm thickness in the male and
0.3 mm thickness in the female (Spitzer
et al. 1996). Also available from each cadaver
are transverse computerized tomography
and magnetic resonance images.

= Images from the History of Medicine —
online access to images from the historical
collections of the NLM.

= Open-i — collection of images from
PubMed Central papers (Demner-Fushman
et al. 2012).


Many genomics databases are available on the 
Web. The first issue each year of the journal 
Nucleic Acids Research (NAR) catalogs and 
describes these databases, and is now available 
by open access means (Rigden and Fernandez 
2020). NAR also maintains an ongoing data- 
base of such databases, the Molecular Biology 
Database Collection. Among the most 
important of these databases are those avail- 
able from NCBI (Anonymous 2018a). All 
their databases are linked among themselves, 
along with PubMed and OMIM, and are 
searchable via the NCBI Search system.
More details on the specific content of genomics
databases are provided in Chap. 28.

Citation databases provide linkages to
articles that cite others across the scientific lit- 
erature. The earliest citation databases were 
the Science Citation Index (SCI) and Social 
Science Citation Index (SSCI), which are now 
part of the larger Web of Science (Clarivate 
Analytics, Philadelphia, PA). Two well-known
bibliographic databases for biomedical and
health topics that also have citation links
are SCOPUS and Google Scholar. A
final citation database of note is CiteSeer,
which focuses on computer and information
science, including biomedical informatics.

EBM databases are devoted to providing
annotated evidence-based information. Some
examples (all available through subscription
fees) include:

= Cochrane Database of Systematic
Reviews—one of the original collections
of systematic reviews
= BMJ Best Practice
= JAMA Evidence
= Up-to-Date—content centered around
clinical questions
= Essential Evidence Plus


There is a growing market for a related type of 
evidence-based content in the form of clinical 
decision support order sets, rules, and health/ 
disease management templates. Publishers 
include EHR vendors whose systems employ 
this content as well as other vendors such as
Zynx and Provation.

There is a variety of other annotated
content. The ClinicalTrials.gov database
began as a database of clinical trials sponsored
by NIH. After concerns arose about clinical
trials having their protocols altered after the
start of trials, ClinicalTrials.gov expanded its
scope to be a registry of all clinical trials
(DeAngelis et al. 2005; Zarin et al. 2017) and
to contain actual results of trials (Zarin et al.
2011). Another important database for
researchers is NIH RePORTER, which is a
database of all grant awards funded by
NIH. An additional annotated resource is
DataMed, which aims to catalog and provide
linkage for use of data sets from biomedical
research (Ohno-Machado et al. 2017).


23.4.4 Aggregated Content 


The final category consists of aggregations of 
content from the first three categories. The 
distinction between this category and some of 
the highly-linked types of content described 
above is admittedly blurry, but aggregations 
typically have a wide variety of different types 
of information serving the diverse needs of 
users. Aggregated content has been developed 
for all types of users from consumers to clini- 
cians to scientists. 

Probably the largest aggregated consumer 
information resource is MedlinePlus from
the NLM. MedlinePlus includes all of the 
types of content previously described, aggre- 
gated for easy access to a given topic. 
MedlinePlus contains health topics, drug 
information, medical dictionaries, directories, 
and other resources. Each topic contains links 
to health information from the NIH and other 
sources deemed credible by its selectors. There 
are also links to current health news (updated 
daily), a medical encyclopedia, drug refer- 
ences, and directories, along with a preformed 
PubMed search related to the topic. 

Another well-known group of aggrega- 
tions of content for genomics researchers is 
the model organism databases. These data- 
bases bring together bibliographic databases, 
full text, and databases of sequences, struc- 
ture, and function for organisms whose 
genomic data have been highly characterized. 
One of the oldest and most developed model 
organism databases is the Mouse Genome 
Informatics resource. More details are
provided in Chap. 28.


23.5 Indexing 


As noted at the beginning of the chapter,
indexing is the process of assigning metadata
to content to facilitate its retrieval. Most modern
commercial content is indexed in two
ways:

1. Manual indexing— where human indexers, 
usually using a controlled terminology, 
assign indexing terms and attributes to 
documents, often following a specific pro- 
tocol. 

2. Automated indexing—where computers 
make the indexing assignments, usually 
limited to breaking out each word in the 
document (or part of the document) as an 
indexing term. 


Manual indexing is done most commonly 
with bibliographic databases and annotated 
content. In this age of proliferating electronic 
content, such as online textbooks, practice 
guidelines, and multimedia collections, man- 
ual indexing has become either too expensive 
or outright unfeasible for the quantity and 
diversity of material now available. Thus there 
are increasing numbers of databases that are 
indexed only by automated means. Before 
covering these types of indexing in detail, let 
us first discuss controlled terminologies. 


23.5.1 Controlled Terminologies 


A controlled terminology contains a set of 
terms that can be applied to a task, such as 
indexing. When the terminology defines the 
terms, it is usually called a vocabulary. When 
it contains variants or synonyms of terms, it is 
also called a thesaurus. Before discussing 
actual terminologies, it is useful to define 
some terms. A concept is an idea or object that 
exists in the world, such as the condition 
under which human blood pressure is ele- 
vated. A term is the actual string of one or 
more words that represent a concept, such as 
“Hypertension” or “High Blood Pressure”. 
One of these string forms is the preferred or 
canonical form, such as “Hypertension” in the 
present example. When two or more terms can
represent a concept, the different terms are
called synonyms.

A controlled terminology usually contains
a list of terms that are the canonical representations
of the concepts. If it is a thesaurus, it
contains relationships between terms, which
typically fall into three categories:

= Hierarchical—terms that are broader or 
narrower. The hierarchical organization 
not only provides an overview of the struc- 
ture of a thesaurus but also can be used to 
enhance searching (e.g., MeSH tree explo- 
sions that add terms from an entire por- 
tion of the hierarchy to augment a search). 

= Synonym—terms that are synonyms, 
allowing the indexer or searcher to express 
a concept in different words. 

= Related—terms that are not synonymous 
or hierarchical but are somehow otherwise 
related. These usually remind the searcher 
of different but related terms that may 
enhance a search. 
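
The three relationship types can be pictured as fields on a single thesaurus entry. The following Python sketch is only an illustration: the class name, field names, and sample entry are assumptions made for this example and are not taken from any real terminology.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    """One concept in a toy thesaurus, keyed by its canonical term."""
    canonical: str
    synonyms: set = field(default_factory=set)   # other terms for the same concept
    broader: set = field(default_factory=set)    # hierarchical parents
    narrower: set = field(default_factory=set)   # hierarchical children
    related: set = field(default_factory=set)    # neither synonym nor hierarchical


# Illustrative entry only; not copied from any real terminology.
THESAURUS = {
    "Hypertension": ThesaurusEntry(
        canonical="Hypertension",
        synonyms={"High Blood Pressure"},
        broader={"Vascular Diseases"},
        narrower={"Hypertension, Pulmonary"},
        related={"Antihypertensive Agents"},
    ),
}


def canonical_form(term: str):
    """Map a term or one of its synonyms to the canonical term (case-insensitive)."""
    wanted = term.casefold()
    for entry in THESAURUS.values():
        names = {entry.canonical} | entry.synonyms
        if wanted in {name.casefold() for name in names}:
            return entry.canonical
    return None
```

With this structure, a searcher who types "high blood pressure" can be routed to the canonical term, and the broader, narrower, and related sets are available for suggesting additional search terms.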


The MeSH terminology is used to manually 
index most of the databases produced by the 
NLM (Coletti and Bleich 2001). The latest 
version contains over 28,000 subject headings 
(the word MeSH uses for the canonical repre- 
sentation of its concepts). It also contains 
over 90,000 synonyms to those terms, which 
in MeSH jargon are called entry terms. MeSH 
also contains Supplementary Concept
Records, representing 230,000 additional
chemicals, drugs, genes, organisms, etc. that
indexers encounter in the indexing process and
map to MeSH headings.

MeSH contains the three types of relationships
described previously:
= Hierarchical—MeSH is organized hierar- 
chically into 16 trees, such as Diseases, 
Organisms, and Chemicals and Drugs 
= Synonym—MeSH contains a vast number 
of entry terms, which are synonyms of the 
headings 
= Related—terms that may be useful for 
searchers to add to their searches when 
appropriate are suggested for many head- 
ings 
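
Because each position in a MeSH tree is identified by a dotted code, an explosion can be approximated by prefix matching on those codes. The sketch below assumes a small hand-made table; the tree numbers shown are illustrative and may not match the current MeSH release, and the function name is hypothetical.

```python
# Tree numbers below are illustrative and may not match the current MeSH release.
MESH_TREE = {
    "C14": "Cardiovascular Diseases",
    "C14.280": "Heart Diseases",
    "C14.907": "Vascular Diseases",
    "C14.907.489": "Hypertension",
    "C14.907.489.330": "Hypertension, Malignant",
    "C14.907.489.631": "Hypertension, Renal",
}


def explode(tree_number: str) -> list:
    """Return the heading at tree_number plus every heading beneath it."""
    prefix = tree_number + "."
    return sorted(
        heading
        for code, heading in MESH_TREE.items()
        if code == tree_number or code.startswith(prefix)
    )
```

Exploding the code for "Hypertension" adds its narrower headings to the search, whereas exploding a sibling branch such as "Heart Diseases" would not.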


The MeSH terminology files, their associated
data, and their supporting documentation are
available on the NLM’s MeSH Web site.
There is also a browser that facilitates exploration
of the terminology as well as a tool
for mapping text to MeSH terms, called
MeSH on Demand. Figure 23.2 shows a
slice through the MeSH hierarchy for
“Hypertension” and related cardiovascular
diseases in the C. Diseases tree.

Fig. 23.2 A slice through the Medical Subject Headings (MeSH) hierarchy for “Hypertension” and related terms, showing the [...] terms in the hierarchy, while the codes give the [...] permission). Visible node labels: C. Diseases; C14.240 Cardiovascular Abnormalities.

There are features of MeSH designed to 
assist indexers in making documents more 
retrievable. One of these is subheadings, which 
are qualifiers of subject headings that narrow 
the focus of a term. In Hypertension, for 
example, the focus of an article may be on the 
diagnosis, epidemiology, or treatment of the 
condition. Another feature of MeSH that 
helps retrieval is check tags. These are MeSH 
terms that represent certain facets of medical 
studies, such as age, gender, human or nonhu- 
man, and type of grant support. Related to 
check tags are the geographical locations in 
one particular part of the MeSH hierarchy 
(called the “Z tree”, because their term codes 
start with “Z”). Indexers must also include 
these, like check tags, since the location of a 
study (e.g., Oregon) must be indicated. 
Another feature gaining increasing impor- 
tance for EBM and other purposes is the pub- 
lication type, which describes the type of 
publication or the type of study. A searcher 
who wants a review of a topic may choose the
publication type Review or Review Literature.
Or, to find studies that provide the best evi- 
dence for a therapy, the publication type 
Meta-Analysis, Randomized Controlled Trial, 
or Controlled Clinical Trial would be used. 

MeSH is not the only thesaurus used for 
indexing biomedical documents. A number of 
other thesauri are used to index non-NLM 
databases. CINAHL, for example, uses the 
CINAHL Subject Headings, which are based 
on MeSH but have additional domain-specific 
terms added. EMBASE has a terminology 
called EMTREE, which has many features
similar to those of MeSH. 

One problem with controlled terminologies,
not limited to IR systems, is their proliferation.
As already described in Chap. 8,
there is great need for linkage across these different
terminologies. This was the primary
motivation for the Unified Medical Language
System (UMLS) Project, which was undertaken
in the 1980s to address this problem
(Humphreys et al. 1998). There are three components
of the UMLS Knowledge Sources:
the Metathesaurus, the Semantic Network, 
and the Specialist Lexicon. The Metathesaurus 
component of the UMLS links parts or all of 
over 100 terminologies (Bodenreider 2004).

Fig. 23.3 Concepts, terms, and strings for the Metathesaurus concept atrial fibrillation. Each string may occur in more than one vocabulary, in which case each would be an atom. (Courtesy of National Library of Medicine, with permission)
Figure labels: Atrial Fibrillation; Auricular Fibrillation; AF-Atrial Fibrillation; Fibrillation Atrial; Auricular Fibrillations.

In the Metathesaurus, all terms that are 
conceptually the same are linked together as a 
concept. Each concept may have one or more 
terms, each of which represents an expression 
of the concept from a source terminology that 
is not just a simple lexical variant (i.e., differs 
only in word ending or order). Each term may 
consist of one or more strings that represent 
all the lexical variants that are represented for 
that term in the source terminologies. One of 
each term’s strings is designated as the pre- 
ferred form, and the preferred string of the 
preferred term is known as the canonical form 
of the concept. There are rules of precedence 
for determining the canonical form, the main 
one being that the MeSH heading is used if 
one of the source terminologies for the con- 
cept is MeSH. 

Each Metathesaurus concept has a single
concept unique identifier (CUI). Each term
has one term unique identifier (LUI), all of
which are linked to the one (or more) CUIs
with which they are associated. Likewise, each
string has one string unique identifier (SUI),
which is linked to the LUIs in which
it occurs. In addition, each string has an
atomic unique identifier (AUI) that represents
information from each instance of the string
in each vocabulary. Figure 23.3 depicts the
English-language concepts, terms, and strings
for the Metathesaurus concept atrial fibrillation.
(Each string may occur in more than one
vocabulary, in which case each would be an
atom.) The canonical form of the concept and
one of its terms is atrial fibrillation. Within
both terms are several strings that vary in
word order and case.
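
The nesting of concept, term, string, and atom identifiers can be made concrete with a small record type. This is a sketch under stated assumptions: every identifier value and source label below is a made-up placeholder, and the class and function names are hypothetical rather than part of any UMLS distribution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    aui: str      # atom identifier: one string in one source vocabulary
    sui: str      # string identifier
    lui: str      # term identifier (groups lexical variants)
    cui: str      # concept identifier
    source: str   # source vocabulary label
    string: str


# All identifiers and source labels are made-up placeholders.
ATOMS = [
    Atom("A001", "S001", "L001", "C001", "VOCAB_A", "Atrial Fibrillation"),
    Atom("A002", "S001", "L001", "C001", "VOCAB_B", "Atrial Fibrillation"),
    Atom("A003", "S002", "L001", "C001", "VOCAB_A", "Atrial Fibrillations"),
    Atom("A004", "S003", "L002", "C001", "VOCAB_B", "Auricular Fibrillation"),
    Atom("A005", "S004", "L002", "C001", "VOCAB_A", "Auricular Fibrillations"),
]


def strings_by_term(cui: str) -> dict:
    """Group the distinct strings of a concept under their term identifier."""
    grouped = {}
    for atom in ATOMS:
        if atom.cui == cui:
            grouped.setdefault(atom.lui, set()).add(atom.string)
    return grouped
```

In this toy data, the same string appears in two vocabularies and therefore yields two atoms that share one string identifier, while both terms roll up to a single concept.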

The Metathesaurus contains a wealth of 
additional information. In addition to the 
synonym relationships between concepts, 
terms, and strings described earlier, there are 
also non-synonym relationships between con- 
cepts. There are a great many attributes for the 
concepts, terms, strings, and atoms, such as 
definitions, lexical types, and occurrence in 
various data sources. Also provided with the 
Metathesaurus is a word index that connects 
each word to all the strings it occurs in, along 
with its concept, term, string, and atomic 
identifiers. 


23.5.2 Manual Indexing 


Manual indexing is most commonly done for
bibliographic and annotated content,
although it is sometimes done for other types of
content as well. Manual indexing is usually
done by means of a controlled terminology of
terms and attributes. Most databases utilizing
human indexing have a detailed protocol
for assignment of indexing terms from the
thesaurus. The MEDLINE database is no
exception. The principles of MEDLINE
indexing were laid out in the two-volume
MEDLARS Indexing Manual (Charen 1976,
1983). Subsequent modifications have
occurred with changes to MEDLINE, other
databases, and MeSH over the years. The
major concepts of the article, usually from
two to five headings, are designated as main
headings and marked in the MEDLINE
record by an asterisk. The indexer is also
required to assign appropriate subheadings.
Finally, the indexer must also assign check 
tags, geographical locations, and publication 
types. Although MEDLINE indexing is still 
manual, indexers are aided by a variety of 
electronic tools for selecting and assigning 
MeSH terms. 

Few full-text resources are manually 
indexed. One type of indexing that commonly 
takes place with full-text resources, especially 
in the print world, is that performed for the 
index at the back of the book. However, this 
information is rarely used in IR systems; 
instead, most online textbooks rely on automated
indexing (see Sect. 23.5.3, below).

Manual indexing of Web content would 
likewise be challenging. With billions of pages 
of content, manual indexing of more than a 
fraction of it is not feasible. On the other 
hand, the lack of a coherent index makes 
searching much more difficult, especially 
when specific resource types are being sought. 
A simple form of manual indexing of the Web 
takes place in the development of the Web 
catalogs and aggregations as described earlier. 
These catalogs contain not only explicit index- 
ing about subjects and other attributes, but 
also implicit indexing about the quality of a 
given resource by the decision of whether to 
include it in the catalog. 

Two major approaches to manual indexing 
have emerged on the Web that are often com- 
plementary. The first approach, that of apply- 
ing metadata to Web pages and sites, is 
exemplified by the Dublin Core Metadata 
Initiative (DCMI) (Weibel and Koch 2000). 
The goal of the DCMI has been to develop a 
set of standard data elements that creators of 
Web resources can use to apply metadata to 
their content. The specification has defined 15 
elements, as shown in Table 23.1. The
DCMI was recently approved as a standard
by the National Information Standards
Organization (NISO) with the designation
Z39.85. It is also a standard with the
International Organization for Standardization
(ISO), ISO Standard 15836:2009.
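
One common way to attach Dublin Core elements to a Web page is as meta tags in the page header. The Python sketch below renders such tags from a dictionary; the function name and the sample record are hypothetical, and the "DC." name prefix follows the element names used in this chapter.

```python
from html import escape


def dublin_core_meta(record: dict) -> str:
    """Render a dict of Dublin Core elements as HTML <meta> tags."""
    lines = []
    for element, value in record.items():
        name = escape(f"DC.{element}", quote=True)
        content = escape(str(value), quote=True)
        lines.append(f'<meta name="{name}" content="{content}">')
    return "\n".join(lines)


# Hypothetical record for an imaginary practice guideline page.
example = {
    "title": "Management of Hypertension in Adults",
    "creator": "Example Guideline Committee",
    "subject": "Hypertension",
    "date": "2020-01-15",
    "type": "Practice Guideline",
    "language": "en",
}

html_head = dublin_core_meta(example)
```

Escaping the values keeps the generated markup well formed even when a title contains quotation marks or ampersands.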

There have been some medical adaptations 
of the DCMI. The most developed of these is 
the Catalogue et Index des Sites Médicaux 
Francophones (CISMeF) (Darmoni et al.
2000). A catalog of French-language health 
resources on the Web, CISMeF has used 
DCMI to catalog Web pages, including infor- 
mation resources (e.g., practice guidelines, 
consensus development conferences), organi- 
zations (e.g., hospitals, medical schools, phar- 
maceutical companies), and databases. The 
Subject field uses the French translation of 
MeSH but also includes the English transla- 
tion. For Type, a list of common Web 
resources has been enumerated. 

While Dublin Core Metadata was origi- 
nally envisioned to be included in Hypertext 
Markup Language (HTML) Web pages, it 
became apparent that many non-HTML 
resources exist on the Web and that there are 
reasons to store metadata external to Web 
pages. For example, authors of Web pages 
might not be the best people to index pages or 
other entities might wish to add value by their 
own indexing of content. An emerging stan- 
dard for cataloging metadata is the Resource 
Description Framework (RDF) (Akerkar 
2009). 

A second approach to manually indexing 
content on the Web has been to create directo- 
ries of content. The first major effort to create 
these was for use in the Yahoo! search engine,
which created a subject hierarchy and assigned
Web sites to elements within it. When concern
began to emerge that the Yahoo directory was
proprietary and not necessarily representative
of the Web community at large, an alternative
movement sprang up: the Open Directory
Project (dmoz.org). Due to increasing growth
of the Web, these projects were eventually
disbanded.

Table 23.1 Elements of Dublin Core Metadata

Element          Definition
DC.title         The name given to the resource
DC.creator       The person or organization primarily responsible for creating the intellectual content of the resource
DC.subject       The topic of the resource
DC.description   A textual description of the content of the resource
DC.publisher     The entity responsible for making the resource available in its present form
DC.date          A date associated with the creation or availability of the resource
DC.contributor   A person or organization not specified in a creator element who has made a significant intellectual contribution to the resource but whose contribution is secondary to any person or organization specified in a creator element
DC.type          The category of the resource
DC.format        The data format of the resource, used to identify the software and possibly hardware that might be needed to display or operate the resource
DC.identifier    A string or number used to uniquely identify the resource
DC.source        Information about a second resource from which the present resource is derived
DC.language      The language of the intellectual content of the resource
DC.relation      An identifier of a second resource and its relationship to the present resource
DC.coverage      The spatial or temporal characteristics of the intellectual content of the resource
DC.rights        A rights management statement, an identifier that links to a rights management statement, or an identifier that links to a service providing information about rights management for the resource

Manual indexing has a number of limitations,
the most significant of which is
inconsistency. Funk and Reid (1983) evaluated
indexing inconsistency in
MEDLINE by identifying 760 articles that
had been indexed twice by the NLM. The
most consistent indexing occurred with check
tags and central concept headings, which were
indexed with a consistency of only 61-75%.
The least consistent indexing occurred with
subheadings, especially those assigned to
non-central-concept headings, which had a
consistency of less than 35%. A repeat of this study
in more recent times found comparable results
(Marcetich et al. 2004). Manual indexing also
takes time. While it may be feasible with the
large resources the NLM has to index
MEDLINE, it is probably impossible with the
growing amount of content on Web sites and
in other full-text resources. Indeed, the NLM
has recognized the challenge of continuing to
index the growing body of biomedical
literature and is investigating automated and
semiautomated means of doing so (Mork
et al. 2017).
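
To give a feel for what a consistency percentage means, the sketch below computes agreement between two indexers as the number of terms both assigned divided by the number of distinct terms either assigned. This overlap ratio is an assumption chosen for illustration; the text above does not state which consistency measure the cited studies used, and the term sets are hypothetical.

```python
def indexing_consistency(terms_a, terms_b) -> float:
    """Shared terms divided by all distinct terms assigned by either indexer."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 1.0  # two empty term sets agree trivially
    return len(a & b) / len(a | b)


# Hypothetical double indexing of one article.
first_pass = {"Hypertension", "Humans", "Adult", "Antihypertensive Agents"}
second_pass = {"Hypertension", "Humans", "Adult", "Blood Pressure"}
score = indexing_consistency(first_pass, second_pass)
```

Here the two indexers share three of five distinct terms, so a single disagreement on each side is already enough to pull the score well below 100%.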


23.5.3 Automated Indexing 


In automated indexing, the indexing is done 
by a computer. Although the mechanical run- 
ning of the automated indexing process lacks 
cognitive input, considerable intellectual 
effort may have gone into development of the 
system for doing it, so this form of indexing 
still qualifies as an intellectual process. In this 
section, we will focus on the automated index- 
ing used in operational IR systems, namely 
the indexing of documents by the words they 
contain. 

Some might not think of extracting all the
words in a document as “indexing,” but from
the standpoint of an IR system, words are
descriptors of documents, just like human- 
assigned indexing terms. Most retrieval sys- 
tems actually use a hybrid of human and word 
indexing, in that the human-assigned indexing 
terms become part of the document, which 
can then be searched by using the whole con- 
trolled term or individual words within it. 
With the development of full-text resources in 
the 1980s and 1990s, systems that allowed 
only word indexing began to emerge. This 
trend increased with the advent of the Web. 

Word indexing is typically done by defin- 
ing all consecutive alphanumeric sequences 
between white space (which consists of spaces, 
punctuation, carriage returns, and other non- 
alphanumeric characters) as words. Systems 
must take particular care to apply the same 
process to documents and the user’s query, 
especially with characters such as hyphens 
and apostrophes. Many systems go beyond 
simple identification of words and attempt to 
assign weights to words that represent their 
importance in the document (Salton 1991). 
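
A minimal sketch of word indexing as just described, assuming lowercase folding and treating every non-alphanumeric character as a word boundary (the function names and sample documents are illustrative):

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[a-z0-9]+")


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric runs."""
    return WORD_RE.findall(text.lower())


def build_inverted_index(docs: dict) -> dict:
    """Map each word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return dict(index)


docs = {
    1: "Beta-blockers in high blood pressure.",
    2: "Crohn's disease and blood loss",
}
index = build_inverted_index(docs)
```

Note how the hyphen and apostrophe split "Beta-blockers" and "Crohn's" into separate tokens. Running the query through the same tokenize function is what keeps the two sides consistent.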

Many systems using word indexing employ 
processes to remove common words or con- 
flate words to common forms. The former 
consists of filtering to remove stop words, 
which are common words that always occur 
with high frequency and are usually of little 
value in searching. The stop word list, also 
called a negative dictionary, varies in size from 
the seven words of the original MEDLARS 
stop list (and, an, by, from, of, the, with) to the 
list of 250-500 words more typically used. 
Examples of the latter are the 250-word list of 
van Rijsbergen (1979), the 471-word list of 
Fox (1992), and the PubMed stop list.
Conflation of words to common forms is done 
via stemming, the purpose of which is to 
ensure words with plurals and common suf- 
fixes (e.g., -ed, -ing, -er, -al) are always indexed 
by their stem form (Frakes 1992). For exam- 
ple, the words cough, coughs, and coughing 
are all indexed via their stem cough. Both stop
word removal and stemming reduce the size of
indexing files and lead to more efficient query
processing.
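
The two steps can be sketched as follows. The stop list is the seven-word MEDLARS list quoted above; the stemmer is a deliberately naive suffix stripper written for this example and is not the algorithm of any production system.

```python
MEDLARS_STOP_WORDS = {"and", "an", "by", "from", "of", "the", "with"}

# Longer suffixes are listed first so "ing" is tried before "s".
SUFFIXES = ("ing", "ed", "er", "al", "s")


def naive_stem(word: str) -> str:
    """Strip one common suffix, keeping at least three letters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(words: list) -> list:
    """Drop stop words, then conflate the remaining words to their stems."""
    return [naive_stem(w) for w in words if w not in MEDLARS_STOP_WORDS]
```

A stemmer this simple also over-stems: it reduces cancer to canc because the word happens to end in -er, which is one reason practical stemmers use more elaborate rules.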
A commonly used approach for term
weighting is TF*IDF weighting, which com- 
bines the inverse document frequency (IDF) 
and term frequency (TF). The IDF is the loga- 
rithm of the ratio of the total number of doc- 
uments to the number of documents in which 
the term occurs. It is assigned once for each 
term in the database, and it correlates inversely 
with the frequency of the term in the entire 
database. The usual formula used is: 


\[
\mathrm{IDF}(\mathit{term}) = \log\left(\frac{\text{number of documents in database}}{\text{number of documents with term}}\right) + 1 \tag{23.1}
\]


The TF is a measure of the frequency with 
which a term occurs in a given document and 
is assigned to each term in each document, 
with the usual formula: 


\[
\mathrm{TF}(\mathit{term}, \mathit{document}) = \text{frequency of term in document} \tag{23.2}
\]


In TF*IDF weighting, the two terms are com- 
bined to form the indexing weight, WEIGHT. 


\[
\mathrm{WEIGHT}(\mathit{term}, \mathit{document}) = \mathrm{TF}(\mathit{term}, \mathit{document}) * \mathrm{IDF}(\mathit{term}) \tag{23.3}
\]
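
Equations 23.1 through 23.3 translate directly into code. The sketch below assumes the natural logarithm (the text does not specify a base) and a toy collection of documents that have already been split into words; the function names are illustrative.

```python
import math
from collections import Counter


def idf(term: str, docs: list) -> float:
    """Eq. 23.1: log(total documents / documents containing term) + 1."""
    containing = sum(1 for words in docs if term in words)
    if containing == 0:
        raise ValueError(f"term {term!r} does not occur in the collection")
    return math.log(len(docs) / containing) + 1


def tf(term: str, words: list) -> int:
    """Eq. 23.2: raw count of the term in one document."""
    return Counter(words)[term]


def weight(term: str, words: list, docs: list) -> float:
    """Eq. 23.3: TF * IDF."""
    return tf(term, words) * idf(term, docs)


# Toy collection of already-tokenized documents.
docs = [
    ["hypertension", "treatment", "hypertension"],
    ["hypertension", "diagnosis"],
    ["asthma", "treatment"],
    ["asthma", "diagnosis"],
]
```

In this collection "hypertension" occurs in two of four documents, so its IDF is log(2) + 1; in the first document it occurs twice, so its weight there is twice that value. A term that occurred in every document would have an IDF of exactly 1.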


Another automated indexing approach gener- 
ating increased interest is the use of link-based 
methods, fueled by the success of the Google 
search engine. This approach gives weight to
pages based on how often they are cited by 
other pages. The PageRank (PR) algorithm is 
mathematically complex, but can be viewed as 
giving more weight to a Web page based on 
the number of other pages that link to it (Brin 
and Page 1998). Thus, the home page of the 
NLM or a major medical journal is likely to 
have a very high PR, whereas a more obscure 
page will have a lower PR. 
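
The intuition can be illustrated with a simplified power-iteration sketch. This is not the production algorithm of any search engine: the damping factor of 0.85, the fixed iteration count, the even redistribution of rank from pages with no outgoing links, and the toy link graph are all assumptions made for this example.

```python
def pagerank(links: dict, damping: float = 0.85, iterations: int = 50) -> dict:
    """Power iteration over a dict mapping each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page) or pages  # page with no links: spread evenly
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page Web in which three pages link to "nlm".
toy_web = {
    "nlm": ["journal"],
    "journal": ["nlm"],
    "blog": ["nlm", "journal"],
    "obscure": ["nlm"],
}
ranks = pagerank(toy_web)
```

In this toy graph the page with the most incoming links ends up with the highest score, and the pages that nothing links to end up with the lowest.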

General-purpose search engines such as 
Google and Microsoft Bing use word-based 
approaches and variants of the PageRank 
algorithm for indexing. They amass the con- 
tent in their search systems by “crawling” the 
Web, collecting and indexing every object they
find on the Web. This includes not only
HTML pages, but other files as well, including
Microsoft Word, Portable Document Format
(PDF), and images.

Word indexing has a number of limitations,
including:

= Synonymy—different words may have the 
same meaning, such as high and elevated. 
This problem may extend to the level of 
phrases with no words in common, such as 
the synonyms hypertension and high blood 
pressure. 

= Polysemy—the same word may have dif- 
ferent meanings or senses. For example, 
the word lead can refer to an element or to 
a part of an electrocardiogram machine. 

= Content—words in a document may not 
reflect its focus. For example, an article
describing hypertension may mention in
passing other concepts, such as
congestive heart failure (CHF), that are not
the focus of the article.

= Context— words take on meaning based 
on other words around them. For example, 
the relatively common words high, blood, 
and pressure, take on added meaning when 
occurring together in the phrase high 
blood pressure. 

= Morphology— words can have suffixes 
that do not change the underlying mean- 
ing, such as indicators of plurals, various 
participles, adjectival forms of nouns, and 
nominalized forms of adjectives. 

= Granularity—queries and documents may 
describe concepts at different levels of a 
hierarchy. For example, a user might query 
for antibiotics in the treatment of a spe- 
cific infection, but the documents might 
describe specific antibiotics themselves, 
such as penicillin.


Chapter 9 on Natural Language Processing
(NLP) describes automated methods for 
addressing these limitations. 


23.6 Retrieval


There are two broad approaches to retrieval.
Exact-match searching allows the user precise
control over the items retrieved. Partial-match
searching, on the other hand, recognizes the
inexact nature of both indexing and retrieval, 
and instead attempts to return to the user con- 
tent ranked by how close it comes to the user’s 
query. After general explanations of these 
approaches, we will describe actual systems 
that access the different types of biomedical 
content. 


23.6.1 Exact-Match Retrieval 


In exact-match searching, the IR system gives 
the user all documents that exactly match the 
criteria specified in the search statement(s). 
Since the Boolean operators AND, OR, and 
NOT are usually required to create a manage- 
able set of documents, this type of searching is 
often called Boolean searching. Furthermore, 
since the user typically builds sets of docu- 
ments that are manipulated with the Boolean 
operators, this approach is also called set- 
based searching. Most of the early operational 
IR systems in the 1950s through 1970s used 
the exact-match approach, even though Salton 
was developing the partial-match approach in 
research systems during that time (Salton and 
McGill 1983). Currently, exact-match search- 
ing tends to be associated with retrieval from 
bibliographic and annotated databases, while 
the partial-match approach tends to be used 
with full-text searching. 

Typically the first step in exact-match 
retrieval is to select terms to build sets. Other 
attributes, such as the author name, publica- 
tion type, or gene identifier (in the secondary 
source identifier field of MEDLINE), may be 
selected to build sets as well. Once the search 
term(s) and attribute(s) have been selected, 
they are combined with the Boolean opera- 
tors. The Boolean AND operator is typically 
used to narrow a retrieval set to contain only 
documents with two or more concepts. The 
Boolean OR operator is usually used when 
there is more than one way to express a con- 
cept. The Boolean NOT operator is often 
employed as a subtraction operator that is 
applied to a pair of sets, with the result being 
the documents found in the first set but not in 
the second set. Some systems more accurately 
call this the ANDNOT operator. 
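The set-building steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four toy documents and the `build_index` helper are hypothetical, and real systems such as PubMed search indexed fields rather than raw words.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased word to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


# Hypothetical toy collection, for illustration only.
docs = {
    1: "ace inhibitors reduce mortality in heart failure",
    2: "heart failure and beta blockers",
    3: "ace inhibitors in hypertension",
    4: "chf treatment with ace inhibitors",
}
index = build_index(docs)

# Step 1: select terms to build sets.
ace = index["ace"]
heart = index["heart"]
chf = index["chf"]
hypertension = index["hypertension"]

# OR: more than one way to express a concept.
heart_failure = heart | chf
# AND: narrow the set to documents containing both concepts.
both = ace & heart_failure
# NOT (ANDNOT): documents in the first set but not in the second.
ace_not_hypertension = ace - hypertension
```

Python's set operators map directly onto the three Boolean operators, which is why exact-match searching is also described as set-based.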
Some systems allow terms in searches to
be expanded by using the wild-card character, 
which adds all words to the search that begin 
with the letters up until the wild-card charac- 
ter. This approach is also called truncation. 
Unfortunately, there is no standard approach 
to using wild-card characters, so syntax for 
them varies from system to system. PubMed, 
for example, allows a single asterisk at the end 
of a word to signify a wild-card character. 
Thus the query word can* will lead to the 
words cancer and Candida, among others, 
being added to the search. 
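Truncation can be illustrated with a short sketch. The vocabulary below is hypothetical, and the function handles only a single trailing asterisk, the form the text describes for PubMed; it is not a description of how any particular system implements the feature.

```python
def expand_truncation(term, vocabulary):
    """Return the vocabulary words matched by a term with a trailing '*'."""
    if term.endswith("*"):
        stem = term[:-1].lower()
        return sorted(w for w in vocabulary if w.lower().startswith(stem))
    # Without a wild-card character, only the exact word can match.
    return [term] if term in vocabulary else []


# Hypothetical index vocabulary, for illustration only.
vocabulary = {"cancer", "candida", "cardiac", "canal", "scan"}
expanded = expand_truncation("can*", vocabulary)
```

The expanded words would then be combined with OR, since each is an alternative way of matching the truncated term.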


23.6.2 Partial-Match Retrieval 


Although partial-match searching was con- 
ceptualized very early, it did not see wide- 
spread use in IR systems until the advent of 
Web search engines in the 1990s. This is most 
likely because exact-match searching tends to 
be preferred by “power users” whereas partial- 
match searching is preferred by novice search- 
ers. Whereas exact-match searching requires 
an understanding of Boolean operators and 
(often) the underlying structure of databases 
(e.g., the many fields in MEDLINE), partial- 
match searching allows a user to simply enter 
a few terms and start retrieving documents. 
The development of partial-match searching
is usually attributed to Salton, who pioneered
the approach in the 1960s (Salton and
McGill 1983). Although partial-match
searching does not exclude the use of non- 
term attributes of documents, and for that 
matter does not even exclude the use of 
Boolean operators (e.g., Salton et al. 1983), 
the most common use of this type of search- 
ing is with a query of a small number of 
words, also known as a natural language 
query. Because Salton’s approach was based 
on vector mathematics, it is also referred to as 
the vector-space model of IR. In the partial- 
match approach, documents are typically 
ranked by their closeness of fit to the query. 
That is, documents containing more query 
terms will likely be ranked higher, since those 
with more query terms will in general be more 
likely to be relevant to the user. As a result 
this process is called relevance ranking. The 
entire approach has also been called lexical- 
statistical retrieval. 

The most common approach to document 
ranking in partial-match searching is to give 
each a score based on the sum of the weights 
of terms common to the document and query. 
Terms in documents typically derive their 
weight from the TF*IDF calculation described 
above. Terms in queries are typically given a 
weight of one if the term is present and zero if 
it is absent. The following formula can then be 
used to calculate the document weight across 
all query terms: 


$$\text{Document weight} = \sum_{\text{all query terms}} \left(\text{weight of term in query} \times \text{weight of term in document}\right) \qquad (23.4)$$

This may be thought of as a giant OR of all
query terms, with sorting of the matching
documents by weight. The usual approach is
for the system to then perform the same stop
word removal and stemming of the query that
was done in the indexing process. (The equivalent
stemming operations must be performed
on documents and queries so that complementary
word stems will match.)

A number of other ranking algorithms
have been developed over the years. The
BM25 approach has been found to be effective
for many diverse test collections
(Robertson and Walker 1994). Another
approach showing efficacy in research has
been the use of language modeling techniques 
(Zhai and Lafferty 2004). More recently, the 
learning-to-rank approach has found value 
for machine learning approaches in the rele- 
vance ranking process (Li 2011). The large 
search engines such as Google make use of 
other information in their results, such as aim- 
ing to provide answers to queries that are 
likely to be questions (e.g., defining words or
listing airplane flights) and using other features
of Web pages, such as their geographic
location (e.g., a user querying for restaurants).

Fig. 23.4 Search results from PubMed, showing query, results, and ability to set limits (on the left side of the
window). (Courtesy of National Library of Medicine, with permission)
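The sum-of-weights ranking in Eq. 23.4 can be sketched as follows. Several things here are assumptions made for illustration: the documents are hypothetical, the TF*IDF weight is taken to be the term's frequency in the document multiplied by log(N / document frequency) + 1, and stop word removal and stemming are omitted for brevity.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (no stemming or stop word removal)."""
    return text.lower().split()


def rank(query, docs):
    """Score each document by Eq. 23.4 and return (doc_id, score) pairs, best first."""
    n_docs = len(docs)
    term_freqs = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tf in term_freqs.values():
        doc_freq.update(tf.keys())

    query_terms = set(tokenize(query))  # query weight is 1 if present, 0 if absent
    scores = {}
    for doc_id, tf in term_freqs.items():
        score = 0.0
        for term in query_terms:
            if term in tf:
                idf = math.log(n_docs / doc_freq[term]) + 1  # assumed IDF form
                score += 1 * tf[term] * idf
        if score > 0:
            scores[doc_id] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy collection, for illustration only.
docs = {
    1: "heart failure treatment with ace inhibitors",
    2: "ace inhibitors and ace inhibitor dosing",
    3: "kidney stones treatment",
}
results = rank("ace inhibitors heart failure", docs)
ranked_ids = [doc_id for doc_id, _ in results]
```

Document 1 matches all four query terms and ranks first, document 2 matches two and ranks second, and document 3 matches none and is not returned. This is the "giant OR" followed by sorting that the text describes.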


23.6.3 Retrieval Systems 


This section describes searching systems used 
to retrieve content from the four categories 
previously described in » Sect. 23.4. 

As noted above, PubMed is the system at 
NLM that searches MEDLINE and other 
bibliographic databases. Although presenting 
the user with a simple text box, PubMed does 
a great deal of processing of the user’s input 
to identify MeSH terms, author names, com- 
mon phrases, and journal names (described in 
the on-line help system of PubMed). In this
automatic term mapping, the system attempts
to map user input, in succession, to MeSH
terms, journal names, common phrases, and
authors. Remaining text that PubMed cannot
map is searched as text words (i.e., words that
occur in any of the MEDLINE fields). A
results screen for a search combining the disease
congestive heart failure (CHF) and the
angiotensin-converting enzyme (ACE) inhibitor
class of drugs is shown in Fig. 23.4.
PubMed allows the use of wild-card char- 
acters. It also allows phrase searching whereby 
two or more words can be enclosed in quota- 
tion marks to indicate they must occur adja- 
cent to each other. If the specified phrase is in 
PubMed’s phrase index, then it will be 
searched as a phrase. Otherwise the individual 
words will be searched. PubMed allows specification
of other indexing attributes via
“Limits.” These include publication types,
subsets, age ranges, and publication date
ranges. These are accessed from the left-hand
side of the results screen, with the most commonly
used ones shown and the others accessible
by additional mouse clicks.

Fig. 23.5 PubMed Advanced Search Builder, showing the use of sets and application of Boolean operators as
well as limits. (Courtesy of National Library of Medicine, with permission)

As in most bibliographic systems, users 
can also search PubMed by building search 
sets and then combining them with Boolean 
operators to tailor the search. This is done with
the PubMed Advanced Search Builder.
Consider a user searching for studies assessing
the reduction of mortality in patients with
CHF through the use of ACE inhibitors. A
simple approach to such a search might be to
combine the terms ACE inhibitors and CHF
with an AND. The easiest way to do this is to
enter the search string congestive heart failure
and ACE inhibitors. Figure 23.5 shows the
PubMed Advanced Search Builder screen
such a searcher might develop. This searcher
has limited the output (using some of the limits
shown in Fig. 23.4) with various publication
types known to contain the best
evidence for this question. Also note that the 
search does not require the “and,” as PubMed 
determines the Boolean operator should be 
placed there automatically. 

Most MEDLINE systems sort output
in reverse chronological order,
based on the notion that the most recent articles
have the most timely and complete
information. PubMed has also featured
relevance-ranked output and recently
improved its algorithms (Fiorini et al. 2018).

Fig. 23.6 Search screen for NLM NCBI Search, showing the variety of different databases that can be searched on
the NLM site. (Courtesy of National Library of Medicine, with permission)


PubMed has an additional approach to 
finding the best evidence, which is through the 
use of its Clinical Queries function, where 
the subject terms are limited by search state- 
ments designed to retrieve the best evidence 
based on principles of EBM. There are two 
different approaches. The first uses strategies 
for retrieving the best evidence for the four 
major types of clinical questions. These strat- 
egies arise from research assessing the ability 
of MEDLINE search statements to identify 
the best studies for therapy, diagnosis, harm, 
and prognosis (Haynes et al. 1994). The sec- 
ond approach to retrieving the best evidence 
aims to retrieve evidence-based resources that 
are syntheses and synopses, in particular 
meta-analyses, systematic reviews, and prac- 
tice guidelines. The strategy derives in part 
from research by Boynton et al. (Boynton 
et al. 1998). When the clinical queries inter- 
face is used, the search statement is processed 


by the usual automatic term mapping and the 
resulting output is limited (via AND) with the 
appropriate statement. 
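The limiting step can be pictured as simple string composition. In the sketch below, the filter text is a deliberately short, hypothetical stand-in; the actual Clinical Queries strategies are longer and are not reproduced here, and the `apply_filter` helper is not part of any PubMed interface.

```python
def apply_filter(user_query, filter_statement):
    """Limit a subject search by ANDing it with an evidence filter statement."""
    return f"({user_query}) AND ({filter_statement})"


# Hypothetical, simplified stand-in for a therapy filter.
therapy_filter = "randomized controlled trial[pt]"
combined = apply_filter("congestive heart failure ace inhibitors", therapy_filter)
```

The parentheses keep the subject terms and the filter as separate sets, so the AND applies to the whole subject search rather than to its last term.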

A growing number of search engines allow 
searching over many resources. The general 
search engines Google, Microsoft Bing, and 
others allow retrieval of any types of docu- 
ments they have indexed via their Web crawl- 
ing activities. Other search engines allow 
searching over aggregations of various 
sources, such as NCBI Search,61 which allows
searching over all NLM NCBI databases and
other resources in one simple interface, as
shown in Fig. 23.6.


23.7 Evaluation 
There has been a great deal of research over
the years devoted to evaluation of IR systems.
As with many areas of research, there is controversy
as to which approaches to evaluation
best provide results that can assess searching 
in the systems they are using. Many frame- 
works have been developed to put the results 
in context. One of those frameworks orga- 
nized evaluation around six questions that 
someone advocating the use of IR systems 
might ask (Hersh and Hickam 1998): 

1. Was the system used?
2. For what was the system used?
3. How well did they use the system?
4. Were the users satisfied?
5. What factors were associated with successful or unsuccessful use of the system?
6. Did the system have an impact?


While early evaluation studies asked questions 
about whether IR systems would be used if 
made available, in modern times their use is 
ubiquitous. A study by Google and Manhattan 
Research found that essentially all physicians 
reported searching on digital devices daily, 
with most searching resulting in action, such 
as changing treatment decisions or sharing 
with a colleague or patient (Anonymous 2012). 
Similarly, a recent study of internal medicine 
residents at three sites found that nearly all 
responding to the survey searched daily, with 
the most common resource searched being 
UpToDate (Duran-Nelson et al. 2013). The 
next most frequent source of information was 
consultation with attending faculty, followed 
by the Google search engine, the Epocrates 
drug reference, and various other “pocket” 
references. Another study of family medicine 
resident and attending physicians found uni- 
versal use of smartphones and tablet devices 
daily in their practices (Yaman et al. 2016).

Patient and consumer searching of the
Web for health information also continues to
be reported as high. In the most recent update
of her ongoing survey of health-related 
searching, Fox found that 72% of US adult 
Internet users (59% of all US adults) have 
looked for health information in the last year 
(Fox and Duggan 2013). Earlier research 
found that the most common types of 
searches done by these users were for a specific
disease or medical condition and for a certain 
medical treatment or procedure (Fox 2011). 
Three focus groups convened by Mayo Clinic 
researchers asked consumers about their 
online searching use and needs, finding that 
subjects reported searching, filtering, and 
comparing information retrieved, with the 
process stopping due to saturation and fatigue 
(Fiksdal et al. 2014). 

Most evaluation research has focused on 
the third question from the above list, i.e., 
how well did search systems or their users 
perform? The rest of this section on evalua- 
tion will focus on studies of that question, 
grouping approaches and studies into those 
that are system-oriented, i.e., the focus of the 
evaluation is on the IR system, and those 
that are user-oriented, i.e., the focus is on the 
user. 


23.7.1 System-Oriented Evaluation 


There are many ways to evaluate the perfor- 
mance of IR systems, the most widely used 
of which are the relevance-based measures 
of recall and precision. These measures quan- 
tify the number of relevant documents 
retrieved by the user from the database and 
in his or her search. Recall is the proportion 
of relevant documents retrieved from the 
database: 


$$\text{Recall} = \frac{\text{number of retrieved and relevant documents}}{\text{number of relevant documents in database}} \qquad (23.5)$$


In other words, recall answers the question, 
for a given search, what fraction of all the rel- 
evant documents have been obtained from the 
database? 

One problem with Eq. (23.5) is that the 
denominator implies that the total number
of relevant documents for a query is known.
For all but the smallest of databases, how- 
ever, it is unlikely, perhaps even impossible, 
for one to succeed in identifying all relevant 
documents in a database. Thus most studies 
use the measure of relative recall, where the
denominator is redefined to be the total number
of unique, relevant documents identified
by one or more searches on the query topic.

Precision is the proportion of documents
retrieved in the search that are relevant:

$$\text{Precision} = \frac{\text{number of retrieved and relevant documents}}{\text{number of documents retrieved}} \qquad (23.6)$$


This measure answers the question, for a 
search, what fraction of the retrieved docu- 
ments is relevant? 
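Equations 23.5 and 23.6, along with relative recall, can be computed directly from sets of document identifiers. The following is a minimal sketch; the identifiers and the two searches are hypothetical.

```python
def recall(retrieved, relevant):
    """Eq. 23.5: fraction of the relevant documents that were retrieved."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Eq. 23.6: fraction of the retrieved documents that are relevant."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


# Hypothetical judgments: eight relevant documents exist in the database.
relevant = {1, 2, 3, 4, 5, 6, 7, 8}
search_a = {1, 2, 3, 4, 20, 21, 22, 23, 24, 25}
search_b = {3, 4, 5, 30}

recall_a = recall(search_a, relevant)        # 4 of 8 relevant documents
precision_a = precision(search_a, relevant)  # 4 of 10 retrieved documents

# Relative recall: the denominator is limited to the unique relevant
# documents found by one or more searches on the topic.
pooled_relevant = (search_a | search_b) & relevant
relative_recall_a = recall(search_a, pooled_relevant)  # 4 of 5 pooled documents
```

Search A has a recall of 0.5 against the full set of relevant documents but a relative recall of 0.8, because the two searches together found only five of the eight. Relative recall therefore tends to overstate true recall.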

One problem that arises when one is com- 
paring systems that use ranking versus those 
that do not is that nonranking systems, typi- 
cally using Boolean searching, tend to retrieve 
a fixed set of documents and as a result have 
fixed points of recall and precision. Systems 
with relevance ranking, on the other hand, 
have different values of recall and precision 
depending on the size of the retrieval set the 
system (or the user) has chosen to show. The 
problem has been addressed by the develop- 
ment of aggregate measures that combine 
recall and precision and that account for 
ranking. One of the most common measures 
used is mean average precision (MAP), which
measures precision at each point that a relevant
document is retrieved, averages those values
for each topic, and then provides
a mean of the average precision values across topics
(Buckley and Voorhees 2005).
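A short sketch may make the MAP calculation concrete. It assumes the common convention of dividing by the total number of relevant documents for the topic, so that relevant documents never retrieved contribute zero; the rankings and judgments are hypothetical.

```python
def average_precision(ranking, relevant):
    """Average of the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    precisions = []
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precisions.append(hits / rank)
    # Assumed convention: divide by all relevant documents, retrieved or not.
    return sum(precisions) / len(relevant)


def mean_average_precision(runs):
    """Mean of the per-topic average precision values."""
    return sum(average_precision(ranking, relevant) for ranking, relevant in runs) / len(runs)


# Hypothetical ranked output and relevance judgments for two topics.
topic_1 = (["d1", "d2", "d3", "d4"], {"d1", "d3"})  # precision 1/1 and 2/3
topic_2 = (["d5", "d6", "d7"], {"d6"})              # precision 1/2
map_score = mean_average_precision([topic_1, topic_2])
```

Because precision is sampled only where relevant documents occur, a system that places them near the top of the ranking scores higher than one that retrieves the same documents lower down.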

Another challenge for IR evaluation 
occurs with large collections, where every pos- 
sible item retrieved cannot be judged for rele- 
vance. If there is concern that there are large 
numbers of unjudged documents, the B-Pref 
measure can be used, which only makes use of
judged documents in its calculations
(Buckley and Voorhees 2004). An additional
measure increasingly used is normalized discounted
cumulative gain (NDCG), which
allows differential value of retrieved docu- 
ments, e.g., a value of 2 for highly relevant 
and 1 for partially relevant documents 
(Jarvelin and Kekalainen 2002). 
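The graded-relevance idea behind NDCG can be sketched as follows. The discount used here, dividing the gain at rank i by log2(i + 1), is one widely used formulation and is an assumption of this example; other discount functions appear in the literature. The gain values follow the 2/1/0 scheme mentioned above.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: gains lower in the ranking count for less."""
    return sum(gain / math.log2(rank + 1) for rank, gain in enumerate(gains, start=1))


def ndcg(ranked_gains):
    """DCG of the ranking divided by the DCG of the ideal (best possible) ordering."""
    ideal_dcg = dcg(sorted(ranked_gains, reverse=True))
    return dcg(ranked_gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Gains in ranked order: 2 = highly relevant, 1 = partially relevant, 0 = not relevant.
ideal_order = ndcg([2, 1, 0])
swapped_order = ndcg([1, 2, 0])
```

A ranking that places the highly relevant document first receives the maximum score of 1, while swapping it with the partially relevant document lowers the score, a distinction that binary recall and precision cannot capture.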

A good deal of evaluation in IR is done 
via challenge evaluations, in which a common 
IR task is defined and a test collection of documents,
topics, and relevance judgments is
developed. The relevance judgments define
which documents are relevant for each topic
in the task, allowing different researchers to 
compare their systems with others on the 
same task and improve them. The longest 
running and best-known challenge evaluation 
in IR is the Text REtrieval Conference 
(TREC),62 which is organized by the 
U.S. National Institute of Standards and
Technology (NIST).63 Started in 1992, TREC 
has provided a testbed for evaluation and a 
forum for presentation of results. TREC is 
organized as an annual event at which the 
tasks are specified and queries and documents 
are provided to participants. Participating 
groups submit “runs” of their systems to 
NIST, which calculates the appropriate 
performance measure(s). TREC is organized 
into “tracks” geared to specific interests. A 
book summarizing the first decade of TREC 
provides more information on this important 
IR initiative that is still ongoing (Voorhees
and Harman 2005). 

While TREC has been mostly focused on 
retrieval of general information sources (e.g., 
newswire, government documents, Web pages, 
etc.), there have been a number of tracks over 
the years devoted to biomedical IR. These 
tracks tended to reflect areas of biomedicine 
that were of emerging importance. The first
TREC track specific to the biomedical domain
was the Genomics Track, due to the emergence
at the time of the sequencing of the human
genome and the rise of the area of bioinformatics.
A variety of literature retrieval tasks
were developed, focused on journal article
abstracts (from MEDLINE records) or full 
text (Hersh and Bhupatiraju 2003; Hersh
et al. 2004, 2005, 2006, 2007; Hersh and
Voorhees 2009; Roberts et al. 2009).

62 » https://trec.nist.gov/
63 » https://www.nist.gov/

A second track from the biomedical 
domain aimed to leverage the growing interest 
in processing medical records text around the 
onset of the HITECH Act. The Medical 
Records Track used a collection of de- 
identified patient records for a task aiming to 
retrieve patients who might be candidates for 
clinical studies (Voorhees 2013; Voorhees and 
Hersh 2012; Voorhees and Tong 2011). 

The next biomedical domain track was the 
Clinical Decision Support (CDS) Track, 
which aimed to retrieve full-text journal arti- 
cles from a snapshot of PubMed Central to 
identify knowledge relevant to diagnosis, test- 
ing, or treatment (Roberts et al. 2015, 2016a, 
b; Simpson et al. 2014). The CDS Track was 
refined into the Precision Medicine Track, 
which aimed to retrieve information relevant
to the precision medicine paradigm (Roberts 
et al. 2017). 

Another annual challenge evaluation, 
based in Europe, has been the Conference and 
Labs of the Evaluation Forum (CLEF, origi- 
nally known as the Cross-Language 
Evaluation Forum). Since 2013, one focus of 
CLEF has been eHealth, with tasks focused 
not only on IR but also on information extraction
and information management.64 The
track has had a patient-centered retrieval task 
since 2013, a cross-language retrieval task 
since 2014, and a systematic review task start- 
ing in 2017. An additional challenge evalua- 
tion emanating from CLEF and focused on 
image retrieval, which has included a medical 
image retrieval component, has been 
ImageCLEF.65 Some recent overviews of the
state of the art of image retrieval have been 
published (Li et al. 2018; Müller and Unay 
2017). 

Some system-oriented studies have focused 
on specific use cases for IR systems. One area 
gaining a good deal of attention has been 
reducing the workload of performing systematic
reviews, which require high recall to
retrieve all possibly relevant studies. This
comes at a cost of low precision, so a question
is how to reduce the work of systematic 
reviews by increasing precision while mini- 
mally impacting recall. Early work in this area 
was carried out by Cohen et al. (2009, 2015), 
demonstrating value for machine learning 
approaches. A systematic review task was 
added to CLEF eHealth in 2017 (Kanoulas 
et al. 2017, 2018, 2019). 

A number of researchers have criticized or 
noted the limitations of relevance-based mea- 
sures. While no one denies that users want sys- 
tems to retrieve relevant articles, it is not clear 
that the quantity of relevant documents 
retrieved is the complete measure of how well 
a system performs (Harter 1992; Swanson 
1988). Hersh (1994) noted that clinical users 
are unlikely to be concerned about these mea- 
sures when they simply seek an answer to a 
clinical question and are able to do so no mat- 
ter how many other relevant documents they 
miss (lowering recall) or how many nonrele- 
vant ones they retrieve (lowering precision). 
This has led to more focus on user-oriented 
evaluation. 


23.7.2 User-Oriented Evaluation 


What alternatives to relevance-based mea- 
sures can be used for determining perfor- 
mance of individual searches? Some 
alternatives have focused on users being able 
to perform various information tasks with IR 
systems, such as finding answers to questions 
(Egan et al. 1989; Hersh and Hickam 1995; 
Hersh et al. 1996; Mynatt et al. 1992; 
Wildemuth et al. 1995). For several years, 
TREC featured an Interactive Track that had 
participants carry out user experiments with 
the same documents and queries (Hersh 
2001). A number of user-oriented evaluations 
have been performed over the years looking at 
users of biomedical information. 

When end-user retrieval systems first 
appeared, a number of studies appeared aim- 
ing to measure search performance by clini- 
cians. One of the original studies compared 
the capabilities of librarian and clinician 
searchers (Haynes et al. 1990). In this study, 
78 searches were randomly chosen for replication
by both a clinician experienced in searching
and a medical librarian. The results
showed that the experienced clinicians and 
librarians achieved comparable recall in the 
range of 50%, although the librarians had 
better precision. The novice clinician search- 
ers had lower recall and precision than either 
of the other groups. This study also assessed 
user satisfaction of the novice searchers, who 
despite their recall and precision results said 
that they were satisfied with their search out- 
comes. A follow-up study noted that different 
searchers tended to use different strategies on 
a given topic. The different approaches repli- 
cated a finding known from other searching 
studies in the past, namely, the lack of overlap 
across searchers of overall retrieved citations 
as well as relevant ones. Thus, even though the 
novice searchers had lower recall, they did 
obtain a great many relevant citations not 
retrieved by the two expert searchers. 
Furthermore, fewer than 4% of all the rele- 
vant citations were retrieved by all three 
searchers. 

Recognizing the limitations of recall and 
precision for evaluating clinical users of IR 
systems, subsequent studies assessed the abil- 
ity of systems to help students and clinicians 
answer clinical questions. The rationale for 
these studies is that the usual goal of using an 
IR system is to find an answer to a question. 
While the user must obviously find relevant 
documents to answer that question, the quan- 
tity of such documents is less important than 
whether the question is successfully answered. 
In fact, recall and precision can be placed 
among the many factors that may be 
associated with ability to complete the task 
successfully. 

The first study using this task-oriented 
approach compared Boolean versus natural 
language searching in an online medical text- 
book (Hersh and Hickam 1995). There was 
no difference in ability to answer questions 
with one interface or the other. Most answers 
were found on the first search to the textbook. 
For the questions that were incorrectly 
answered, the document with the correct 
answer was actually retrieved by the user two- 
thirds of the time and viewed more than half 
the time.

Another study compared Boolean and
natural language searching of MEDLINE
with two commercial products, CD Plus (now
Ovid) and Knowledge Finder, representing
Boolean and natural language searching,
respectively (Hersh et al. 1996). Sixteen medical
students were recruited and randomized
to one of the two systems and given three yes/ 
no clinical questions to answer. The students 
were able to use each system successfully, 
answering 37.5% correctly before searching 
and 85.4% correctly after searching. There 
were no significant differences between the 
systems in time taken, relevant articles 
retrieved, or user satisfaction. This study 
demonstrated that both types of systems 
could be used equally well with minimal 
training. 

A more comprehensive study looked at 
MEDLINE searching by medical and nurse 
practitioner (NP) students to answer clinical 
questions. A total of 66 medical and NP stu- 
dents searched five questions each (Hersh 
et al. 2002). This study used a multiple-choice 
format for answering questions that also 
included a judgment about the evidence for 
the answer. Subjects were asked to choose 
from one of three answers:
- Yes, with adequate evidence.
- Insufficient evidence to answer question.
- No, with adequate evidence.


Both groups achieved a presearching correct- 
ness on questions about equal to chance 
(32.3% for medical students and 31.7% for NP 
students). However, medical students 
improved their correctness with searching (to 
51.6%), whereas NP students hardly did at all 
(to 34.7%). 

This study also attempted to measure what 
factors might influence searching. A multi- 
tude of factors, such as age, gender, computer 
experience, and time taken to search, were not 
associated with successful answering of ques- 
tions. Successful answering was, however, 
associated with answering the question cor- 
rectly before searching, spatial visualization 
ability (measured by a validated instrument), 
searching experience, and EBM question type 
(prognosis questions easiest, harm questions 
most difficult). An analysis of recall and precision
for each question searched demonstrated
a complete lack of association with
ability to answer these questions. 

Two studies extended this approach in different
ways. Westbrook et al. assessed use of an
online evidence system and found that physicians
answered 37% of questions correctly
before use of the system and 50% afterwards, 
while nurse specialists answered 18% of ques- 
tions correctly and also 50% afterwards 
(Westbrook et al. 2005). Those who had cor- 
rect answers before searching had higher con- 
fidence in their answers, but those not initially 
knowing the answer had no difference in con- 
fidence whether their answer turned out to be 
right or wrong. McKibbon and Fridsma per- 
formed a comparable study of allowing physi- 
cians to seek answers to questions with 
resources they normally use (McKibbon and 
Fridsma 2006) employing the same questions 
as Hersh et al. (2002). This study found no dif- 
ference in answer correctness before or after 
using the search system. 

Pluye and Grad (2004) performed
a qualitative study assessing impact of
IR systems on physician practice. The study
identified four themes mentioned by physicians:
- Recall (of forgotten knowledge)
- Learning (new knowledge)
- Confirmation (of existing knowledge)
- Frustration (that system use was not successful)

The researchers also noted two additional
themes:
- Reassurance (that the system is available)
- Practice improvement (of the patient-physician relationship)


More recent studies have focused on searchers 
using well-known modern IR systems. Kim 
et al. looked at the ability of internal medicine 
interns to answer questions starting from 
Google versus an evidence-based summary 
resource developed by a local medical library 
(Kim et al. 2014). Ten questions were given to 
each subject, with each participant random- 
ized to start in either Google or the summary 
resource for half of questions. Answers were 
found for 82% of the questions administered, 
with no difference between groups in correct
answers (58-62% correct) or time taken (136-139
seconds). While those starting in the summary
resource found answers in
resources that were part of the summary system
93% of the time, those starting with
Google found answers in commercial medical 
portals (25.7%), hospital Web sites (12.6%), 
Wikipedia (12.0%), US government Web sites 
(9.4%), PubMed (9.4%), evidence-based sum- 
mary resources (9.4%), and others (18%). 
Another study looked at medical students’ 
short-term knowledge when randomized to 
answer questions in Wikipedia, UpToDate, 
and a digital textbook, finding the best short- 
term knowledge acquisition with Wikipedia 
(Scaffidi et al. 2017). 

Koopman et al. assessed the factors that make
queries, and those who formulate them, effective
(Koopman et al. 2017). They found that query
formulation had more impact on retrieval
effectiveness than the particular retrieval systems
used. The most effective queries were
short, ad hoc keyword queries, and the most
effective queriers were those who inferred novel
keywords most likely to appear in relevant documents.

Other users of IR systems have been stud- 
ied beyond clinicians. One study used the 
TREC Genomics Track 2004 collection to 
assess the value of MeSH terms for different 
types of searchers (Liu and Wacholder 2017). 
The researchers recruited four types of 
searchers:
- Search Novice (SN): undergraduates with no formal search training or advanced knowledge in biomedicine
- Domain Expert (DE): biomedical graduate students
- Search Expert (SE): library and information science graduate students
- Medical Librarian (ML)


The searchers used a digital library system to 
search on 20 topics from the original test col- 
lection. Searchers assigned to search with 
MeSH were provided access to a MeSH 
browser. As with other studies, recall (0.15-0.23)
and precision (0.29-0.40) were relatively
close across different groups. MeSH terms
had little impact upon recall in the four
groups, but they were found to substantially
increase precision in search novices (SN and
DE) and decrease it in search experts (SE and 
ML) (recall and precision with MeSH; with- 
out MeSH). User characteristics that 
improved precision were number of under- 
graduate and graduate biology courses for SN 
and DE respectively. User characteristics 
associated with improved recall included hav- 
ing had online search courses and MeSH use 
experience. Other factors having no associa- 
tion with search results included gender, 
native language, age, or experience or fre- 
quency with database searching. 

Another group of searchers that has been
studied is consumers. A study from Mayo
Clinic analyzed search queries submitted 
through general search engines but leading 
users into a consumer health information por- 
tal from computers and mobile devices 
(Jadhav et al. 2014). The most common types 
of searches were on symptoms (32-39%), 
causes of disease (19-20%), and treatments 
and drugs (14-16%). Health queries tended to 
be longer and more specific than general (non- 
health) queries. Health queries were somewhat 
more likely to come from mobile devices. Most 
searches used key words, although some were 
also phrased as questions (wh- or yes/no). 

An additional study aimed to assess differences in searching
between medical experts and lay people (Palotti et al. 2016).
This study found that medical experts were more persistent in
their interaction with the search engine. The authors also noted
that the main focus of users, both laypeople and professionals,
was on disease rather than symptoms.

Fig. 23.7 Funnel of knowledge discovery, showing how an
information need starts with a search (information retrieval)
leading to a large possibly relevant set of literature that is
winnowed down to a smaller definitely relevant set (usually by
human inspection but with techniques like information extraction
and text mining possibly automating the process in the future).
Ultimately actionable knowledge is obtained that can be applied
by a human or fashioned into, for example, rules for a
computer-based decision support system


23.8 Research Directions 


The above evaluation research shows that 
there is still plenty of room for IR systems to 
improve their abilities. In addition, there will 
be new challenges that arise from growing 
amounts of information, new devices, and 
other new technologies. 

There are also other areas related to IR 
where research is ongoing in the larger quest 
to help all involved in biomedicine and 
health—including patients, clinicians and 
researchers—to better apply knowledge to 
improve health. Figure 23.7 shows this
author’s “funnel” by which the user searches 
all of the scientific literature using IR systems 
to obtain a set of possibly relevant literature. 
In the current state of the art, he/she reviews 
this literature by hand, selecting which articles 
are definitely relevant and may become 
“actionable knowledge” that can be acted 
upon to make better decisions. 

Our ability to carry out the activities in the upper part of the
funnel, i.e., IR, is much better than our ability to carry out
those in the lower part. The areas in the lower part include:
= Information extraction and text mining: usually through the use
  of natural language processing (NLP; see Chap. 8) to extract
  facts and knowledge from text (K. Cohen and Demner-Fushman 2014).
  These techniques are often employed to extract information from
  the EHR, with widely varying accuracy, as shown in a systematic
  review (Stanfill et al. 2010).
= Summarization: providing automated extracts or abstracts
  summarizing the content of longer documents (Fiszman et al. 2004;
  Mani 2001). In recent years, these methods have been applied to
  text and other data in the EHR (Pivovarov and Elhadad 2015).
= Question-answering: going beyond retrieval of documents to
  providing actual answers to questions, as exemplified by IBM’s
  Watson system (Ferrucci et al. 2010). Watson has been evaluated
  for answering questions on medical board exams (Ferrucci et al.
  2012) as well as making cancer treatment recommendations
  (Somashekhar et al. 2018).


23.9 Digital Libraries 


Discussion of IR “systems” thus far has 
focused on the provision of retrieval mecha- 
nisms to access online content. Even with the 
expansive coverage of some IR systems, such 
as Web search engines, they are often part of a 
larger collection of services or activities. An 
alternative perspective, especially when com- 
munities and/or proprietary collections are 
involved, is the digital library. Digital librar- 
ies share many characteristics with “brick and 
mortar” libraries, but also take on some addi- 
tional challenges. Borgman (1999) noted that 
libraries of both types elicited different defini- 
tions of what they actually are, with research- 
ers tending to view libraries as content 
collected for specific communities and librari- 
ans alternatively viewing them as institutions 
or services. Lindberg and Humphreys (2005) 
laid out a vision in 2005 for libraries 10 years 
hence, noting that while collections would be 
virtual and accessed in many diverse ways, 
other elements of science would stay intact, 
including journals and the peer review process. 

This section provides an overview of key 
issues of digital libraries, with an orientation 
toward biomedical libraries. 


23.9.1 Functions and Definitions of Libraries


The central function of libraries is to main- 
tain collections of published literature. They 
may also store unpublished literature in archives, such as
letters, notes, and other documents. The general focus on
published literature has implications. One of these is that, for
the most part, quality control can be taken for
granted. Until recently, most published litera- 
ture came from commercial publishers and 
specialty societies that had processes such as 
peer review, which, although imperfect, 
allowed the library to devote minimal 
resources to assessing their quality. While 
libraries can still cede the judgment of quality 
to these information providers in the Internet 
era, they cannot ignore the myriad of infor- 
mation published only on the Internet, for 
which the quality cannot be presumed. 

Other functions of libraries besides main- 
taining collections include cataloging and 
classification of items in those collections, 
being a place (even virtual) where individuals 
could go to get assistance with information 
seeking, and providing space for work or 
study, particularly in universities. 

The paper-based nature of traditional 
libraries carried a number of assumptions 
that are challenged in the digital era. For 
example, items were produced in multiple 
copies, freeing the individual library from 
excessive worry that an item could not be 
replaced. In addition, items were fairly static, 
simplifying their cataloging. With digital 
libraries, this status quo is challenged. There 
is a great deal of concern about archiving of 
content and managing its change when fewer 
“copies” of it exist on the file servers of pub- 
lishers and other organizations. A related 
problem for digital libraries is that they do not 
own the “artifact” of the paper journal, book, 
or other item. This is exacerbated by the fact 
that when a subscription to an electronic jour- 
nal is terminated, access to the entire journal 
is lost; that is, the subscriber does not retain 
accumulated back issues, as was taken for 
granted with paper journals. 


23.9.2 Access 


Probably every Web user is familiar with clicking on a Web link
and receiving an error message that a page cannot be found.
Digital libraries and commercial publishing ventures need
mechanisms to ensure that documents have persistent identifiers
so that when the document itself physically moves, it is still
obtainable. The original architecture for the 
Web envisioned by the Internet Engineering 
Task Force was to have every uniform resource 
locator (URL), the address entered into a Web 
browser or used in a Web hyperlink, linked to
a uniform resource name (URN) that would 
be persistent (Sollins and Masinter 1994). The 
combination of a URN and a URL, a uni- 
form resource identifier (URI), would provide 
persistent access to digital objects. However, 
no publicly available resource for resolving 
URNs and URIs was ever implemented on a 
large scale. 

One approach that has seen widespread 
adoption by publishers, especially scientific 
journal publishers, is the digital object identifier (DOI)
(Paskin 2006). The DOI has recently been given the status of a
standard by the National Information Standards Organization
(NISO), with the designation Z39.84. The DOI itself is relatively
simple, consisting of a prefix that is assigned by the
International DOI Foundation (IDF) to the publishing entity and a
suffix that is assigned and maintained by the entity. For example,
DOIs for articles from the Journal of the American Medical
Informatics Association have the prefix 10.1197 and the suffix
jamia.M####, where #### is a number assigned by the journal
editors. Publishers are encouraged to facilitate resolution by
encoding the DOI into their URLs in a standard way, e.g.,
https://doi.org/10.1197/jamia.M0996 for a paper cited earlier in
the chapter (Hersh et al. 2002).
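
The prefix/suffix structure of a DOI and its encoding into a resolvable URL can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: the helper names `split_doi` and `doi_to_url` are hypothetical, and the check that a prefix begins with `10.` is included only as a simple sanity check.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # resolver address used in the example above


def split_doi(doi):
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.strip().partition("/")
    # Assumption: a usable DOI has a prefix starting with "10." and a
    # non-empty suffix chosen by the publishing entity.
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a well-formed DOI: {doi!r}")
    return prefix, suffix


def doi_to_url(doi):
    """Build a resolver URL, percent-encoding unsafe suffix characters."""
    prefix, suffix = split_doi(doi)
    return DOI_RESOLVER + prefix + "/" + quote(suffix, safe="/")


prefix, suffix = split_doi("10.1197/jamia.M0996")
print(prefix, suffix)
print(doi_to_url("10.1197/jamia.M0996"))
```

Because the link points at the resolver instead of the publisher's server, the publisher can move the article and update only the record held by the resolver.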


23.9.3 Interoperability 


As noted throughout this chapter, metadata is 
a key component for accessing content in IR 
systems. It takes on an additional value in the 
digital library, where there is desire to allow 
access to diverse but not necessarily exhaus- 
tive resources. One key concern of digital 
libraries is interoperability (Besser 2002). That 
is, how can resources with heterogeneous 
metadata be accessed? Arms et al. note that three levels of
agreement must be achieved in digital libraries:

1. Technical agreements over formats, protocols, and security
   procedures

2. Content agreement over the data and the semantic
   interpretation of its metadata

3. Organizational agreements over ground rules for access,
   preservation, payment, authentication, and so forth
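
Content agreement, the second level in the list above, is often approached by mapping each source's metadata fields onto a shared set of element names. The sketch below shows one way this might look; the source names, the field names, and the `to_common_record` helper are all hypothetical and do not describe any particular digital library.

```python
# Hypothetical crosswalks: each maps one source's field names to shared names.
CROSSWALKS = {
    "publisher_a": {"ArticleTitle": "title", "AuthorList": "creator", "PubDate": "date"},
    "repository_b": {"dc_title": "title", "dc_creator": "creator", "issued": "date"},
}


def to_common_record(source, record):
    """Translate a source-specific metadata record into the shared schema.

    Fields without an agreed mapping are dropped, so they cannot be
    silently misinterpreted by downstream systems.
    """
    mapping = CROSSWALKS[source]
    return {mapping[name]: value for name, value in record.items() if name in mapping}


a = to_common_record("publisher_a", {"ArticleTitle": "Example article", "PubDate": "2002"})
b = to_common_record("repository_b", {"dc_title": "Example article", "issued": "2002", "local_id": "x1"})
print(a == b)  # both records now use the same field names
```

Once records share field names, a single search interface can query both collections, although agreement on what each field means still has to be reached by the organizations involved.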


23.9.4 Intellectual Property 


Intellectual property issues are a major con- 
cern in digital libraries. Intellectual property is 
difficult to protect in the digital environment 
because although the cost of production is not 
insubstantial, the cost of replication is near 
nothing. Furthermore, in circumstances such 
as academic publishing, the desire for protec- 
tion is situational. For example, individual 
researchers may want the widest dissemina- 
tion of their research papers, but each one 
may want to protect revenues realized from 
synthesis works or educational products that 
are developed. The global reach of the Internet 
has required that intellectual property issues 
be considered on a global scale. The World 
Intellectual Property Organization (WIPO)
is an agency of the United Nations devoted to 
developing worldwide policies, although 
understandably, there is considerable diversity 
about what such policies should be. 


23.9.5 Preservation 


Another function of libraries of all types is 
preservation of materials. In paper-based 
libraries, the goal of preservation was the 
survival of the physical object, i.e., the book, 
journal, image, etc. that could become lost, 
stolen, or deteriorated. Preservation issues in 
digital libraries are somewhat different. 
Digital libraries still do need to be concerned 
with physical survival of the information. 
Lesk compared the longevity of digital materials (Lesk 2005). He
noted that the longevity for magnetic materials was the least,
with the expected lifetime of magnetic tape being 5 to 10 years.
Optical storage has somewhat better longevity, with an expected
lifetime of 30 to 100 years depending on the specific type.
Ironically, paper has a life expectancy well beyond all these
digital media. Rothenberg noted that the Rosetta Stone, which
provided help in interpreting ancient Egyptian hieroglyphics, has
survived over 20 centuries (Rothenberg 1999). He reiterated Lesk’s
description of the reduced lifetime of digital media in comparison
with traditional media and noted another problem familiar to most
long-time users of computers, namely, that data can become
obsolete not only owing to the medium but also as a result of data
format. Both authors noted that storage devices as well as
computer applications, such as word processors, have seen their
formats change significantly over the last couple of decades.

66 » http://www.doi.org/

The US Library of Congress has devoted considerable effort to
digital preservation, documenting its efforts on its Web site.⁶⁸
An early digital preservation effort in the US was the National
Digital Information Infrastructure and Preservation Program
(NDIIPP) of the Library of Congress, which has now become the
National Digital Stewardship Alliance (NDSA)⁶⁹ and is housed by
the Digital Library Federation (DLF) at the Council on Library and
Information Resources (CLIR). Other digital preservation efforts
include Portico,⁷⁰ a collaboration of publishers, libraries, and
government agencies to preserve electronic scholarly content, and
LOCKSS (Lots of Copies Keep Stuff Safe),⁷¹ which provides
libraries with digital preservation tools and support.


68 » http://www.digitalpreservation.gov/ 
69 » https://ndsa.org 

70 » https://www.portico.org/ 

71 » https://www.lockss.org/ 


23.10 Future Directions for IR Systems and Digital Libraries


There is no doubt that considerable progress has been made in IR
and digital libraries. Seeking online information is now done
routinely not only by clinicians and researchers, but also by
patients and consumers. There are still considerable challenges
to make this activity more fruitful to users. They include:

= How do we lower the effort it takes for cli- 
nicians to get to the information they need 
rapidly in the busy clinical setting? 

= How can researchers extract new knowl- 
edge from the vast quantity that is avail- 
able to them? 

= How can consumers and patients find 
high-quality information that is appropri- 
ate to their understanding of health and 
disease? 

= Can the value added by the publishing 
process be protected and remunerated 
while making information more available? 

= How can the indexing process become 
more accurate and efficient? 

= Can retrieval interfaces be made simpler 
without giving up flexibility and power? 

= Can we develop standards for digital 
libraries that will facilitate interoperability 
but maintain ease of use, protection of 
intellectual property, and long-term pres- 
ervation of the archive of science? 


Suggested Readings

Baeza-Yates, R., & Ribeiro-Neto, B. (2011). 
Modern information retrieval: The concepts 
and technology behind search (2nd ed.). 
Reading: Addison-Wesley. A book surveying 
most of the automated approaches to infor- 
mation retrieval. 

Croft, W., Metzler, D., & Strohman, T. (2009). 
Search engines: Information retrieval in prac- 
tice. Boston: Addison-Wesley. A book survey- 
ing most of the automated approaches to 
search engines. 

Hersh, W. (2020). Information retrieval: A biomedical and health
perspective (4th ed.). New York: Springer. A textbook on
information retrieval systems in the health and biomedical domain
that covers state-of-the-art as well as research systems.

Miles, W. (1982). A history of the national library 
of medicine: The nation’s treasury of medical 
knowledge. Bethesda: U.S. Department of 
Health and Human Services. A comprehen- 
sive history of the National Library of 
Medicine and its forerunners, covering the 
story of Dr. John Shaw Billings and his found- 
ing of Index Medicus to the modern imple- 
mentation of MEDLINE. 

National Academies of Sciences, Engineering, 
and Medicine; Policy and Global Affairs; 
Board on Research Data and Information; 
Committee on Toward an Open Science 
Enterprise. (2018). Open science by design — 
realizing a vision for 21st century research. 
Washington, DC: National Academies Press. 
A vision for open science from the National 
Academies of Medicine, Science, and 
Engineering. 


Questions for Discussion

1. With the advent of full-text searching, 
should the National Library of 
Medicine abandon human indexing of 
citations in MEDLINE? Why or why 
not? 

2. Explain why you think open-access pub- 
lishing will succeed or not. 

3. How would you aggregate the clinical 
evidence-based resources described in 
the chapter into the best digital library 
for clinicians? 

4. Devise a curriculum for teaching clini- 
cians and patients the most important 
points about searching for health-related 
information. 

5. Find a consumer-oriented Web page and
determine the quality of the information 
on it. 

6. What are the limitations of recall and 
precision as evaluation measures and 
what alternatives would improve upon 
them? 

7. Select a concept that appears in two or 
more clinical terminologies and demon- 
strate how it would be combined into a 
record in the UMLS Metathesaurus. 
8. Describe how you might devise a
system that achieves a happy medium 
between protection of intellectual 
property and barrier-free access to the 
archive of science. 
Learning Objectives

After reading this chapter, you should know 

the answers to these questions: 

= What are the key motivations for clini- 
cal decision support? 

= What are typical design considerations 
when building a decision-support sys- 
tem? 

= What are some ways in which develop- 
ers of decision-support systems encode 
and represent clinical knowledge? 

= What are some current standards in the 
HIT industry that facilitate the con- 
struction of decision-support applica- 
tions? 

= What are the current main areas of 
research and development in clinical 
decision-support systems? 


In this chapter, we discuss information tech- 
nology aimed at furnishing clinical decision 
support (CDS) — the process that “provides 
clinicians, staff, patients, or other individu- 
als with knowledge and person-specific infor- 
mation, intelligently filtered or presented at 
appropriate times, to enhance health and 
health care” (Osheroff et al. 2007). CDS 
systems (often referred to as CDSSs) com- 
municate information that takes into consid- 
eration the particular clinical context, offering 
situation-specific information and recommen- 
dations. Generally, we think of such systems 
as reasoning about the clinical situation and 
presenting their conclusions as recommenda- 
tions to the user. Information retrieval systems 
that find relevant information from reposi- 
tories of documents with high relevance to a 
specific patient and clinical context can also 
serve in this role, but do not themselves make 
patient-specific recommendations for care. 
CDS systems do not directly perform clinical 
decision making; they provide relevant knowl- 
edge and analyses that enable the ultimate 
decision makers—clinicians, patients, and 
health-care organizations—to develop more 
informed judgments; hence the importance 
of the word “support” in the term “CDS”. 
(Closed-loop systems such as in implantable 
cardioverter defibrillators and other devices 
such as insulin pumps are exceptions to this
model; decision support for these systems is
not covered in this chapter.) Ideally, CDSSs
can be described in terms of the five rights 
that they aim to accomplish: “provide the 
right information, to the right person, in the 
right format, through the right channel, at 
the right point in workflow, to improve health 
and health-care decisions and outcomes” 
(Osheroff et al. 2012). 

Systems that provide CDS may be con- 
sidered in terms of three basic categories: 
(1) those that use information about the cur- 
rent clinical context to retrieve highly rel- 
evant online documents, as with so-called 
“Infobuttons” (introduced in Chap. 14);
(2) those that are intelligent systems that pro- 
vide patient-specific, situation-specific alerts, 
advice, reminders, physician order sets, or 
other recommendations for direct action; or 
(3) those that organize and present informa- 
tion in a way that facilitates problem solv- 
ing and decision making, as in dashboards, 
graphical displays, documentation templates, 
structured reports, and order sets. 
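
The second category, systems that issue patient-specific alerts, can be pictured as rules evaluated against a patient's data. The following is a deliberately simplified sketch: the `Patient` structure, the field names, and the potassium threshold are hypothetical values chosen for illustration and are not clinical guidance or a description of any deployed system.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    patient_id: str
    labs: dict = field(default_factory=dict)           # e.g. {"potassium_mmol_per_l": 5.8}
    active_orders: list = field(default_factory=list)  # e.g. ["potassium chloride"]


def potassium_alert(patient, threshold=5.0):
    """Return an alert message if the rule fires, otherwise None.

    Illustrative rule: a potassium result above `threshold` while a
    potassium supplement is still ordered. The threshold is an assumption.
    """
    level = patient.labs.get("potassium_mmol_per_l")
    if level is None:
        return None  # no data, so the rule cannot be evaluated
    on_supplement = "potassium chloride" in patient.active_orders
    if level > threshold and on_supplement:
        return (f"Patient {patient.patient_id}: potassium {level} mmol/L exceeds "
                f"{threshold} while a potassium supplement is ordered; please review.")
    return None


example = Patient("p1", labs={"potassium_mmol_per_l": 5.8},
                  active_orders=["potassium chloride"])
print(potassium_alert(example))
```

The function returns a recommendation for the clinician to review and changes no orders itself, which is consistent with the "support" role emphasized above.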

Systems that create order sets offer an 
ideal example of CDS, not only because they 
can facilitate decision making by providing an 
actionable set of recommendations such as a 
combination of orders, but also because they 
provide a mnemonic function by gathering 
together items that should be associated in a 
particular setting. Order sets also can enhance 
workflow by providing a means to select a 
group of relevant activities quickly. Not all 
CDS systems have the ability to optimize 
workflow, as we shall discuss. In fact, some 
CDS systems are subject to concern if they 
are poorly implemented, as they actually can 
impede workflow or usability. 

This chapter provides a review of computer- 
based decision aids, emphasizing their role and 
adoption within the current health-care milieu 
of the United States while keeping an eye on 
global trends. It offers some thoughts on the 
nature of the decision-making process, then it 
provides a description of current implemen- 
tation strategies and challenges, and it closes 
with a discussion of critical research questions 
that must be addressed to ensure optimal
effectiveness of CDS in clinical practice.


24.1 The Nature of Clinical Decision 
Making 


If you ask lay people what the phrase “com- 
puters in medicine” means, they often describe 
a computer program that helps physicians to 
make diagnoses. Although computers play 
numerous important clinical roles, people 
have recognized, from the earliest days of 
computing, that computers might support 
health-care workers by helping these people 
to sift through the vast collections of possible 
diseases, findings, and treatments. 

We can view nearly all the contents of this book as addressing
clinical data and clinical decision making. In Chap. 2, we
discussed the central role of accurate, complete, and relevant
data in supporting the decisions that confront clinicians and
other health-care workers. In Chap. 3, we described the nature of
good decisions and the need for clinicians to understand the
proper use of information if they are to be effective and
efficient decision-makers. In Chap. 4, we introduced the cognitive
issues that underlie clinical decision-making and that influence
the design of systems for decision support.
Subsequent chapters have mentioned many 
real or potential uses of computers to assist 
with such decision-making. Medical practice 
is medical decision-making, so most applica- 
tions of computers in health care are intended 
to have a direct or indirect effect on the qual- 
ity of health-care decisions. In this chapter, we 
bring together these themes by concentrating 
on methods and systems that have been devel- 
oped specifically to assist health workers in 
making decisions. 

By now, you are familiar with the range of 
clinical decisions. The classic problem of diag- 
nosis (analyzing available data to determine the 
pathophysiologic explanation for a patient’s 
symptoms) is only one of these. Equally challenging, as
emphasized in Chaps. 3 and 4, is the diagnostic process: deciding
which questions to ask, tests to order, or procedures to perform,
and assessing the value of the results that can be obtained in
relation to associated risks or financial costs. Thus, diagnosis
involves deciding not only what is true about 
a patient, but also what data are needed to 
determine what is true. Even when the diag- 
nosis is known, there often are challenging 
management decisions that test the physician’s 
knowledge and experience: Should I treat the 
patient or allow the process to resolve on its 
own? If treatment is indicated, what should it 
be? How should I use the patient’s response to 
therapy to guide me in determining whether 
an alternate approach should be tried or, in 
some cases, to question whether my initial 
diagnosis was incorrect after all? (In that 
sense, the response to treatment is also a type 
of diagnostic test.) Also, when a clinician and 
a patient are faced with alternative treatments, 
and they seek help to choose among them, the 
estimation of the chance for cure or the risk 
of death or of complications is an important 
decision-making activity. Lastly, many disease 
processes evolve over time, and evaluation and 
change in treatment must evolve with them, 
resulting in the need for guidelines for man- 
agement that take into account the temporal 
aspects of prior states and activities in order 
to decide what to do next. Decision making 
also may involve integrating data from mul- 
tiple providers, treatments, and responses to 
them, as well as increased use of personal sen- 
sors and apps to provide additional data, so 
that aspects of care coordination over time 
play an important role. This is beyond our 
scope, but the topic is addressed in part in 
Chaps. 21, 22 and 23.

Biomedicine is also replete with decision 
tasks that do not involve specific patients or 
their diseases. Consider, for example, the bio- 
medical scientist who is using laboratory data 
to help with the design of her next experi- 
ment or the hospital administrator who uses 
management data to guide decisions about 
resource allocation in his hospital. In addition, 
new financial models for health-care payment 
or reimbursement based on value (roughly, 
patient outcomes and cost of care), rather 
than on fee-for-service calculations based on 
individual patient care activities, require that 
decisions be made on the basis of aggregate 
data on outcomes for groups of patients to
determine expected norms and to identify 
outliers. Although we focus on systems to 
assist with clinical decisions in this chapter, 
we emphasize that the concepts discussed 
generalize to many other problem areas as 
well. In Chap. 29, for example, we examine
the need for formal decision techniques
and tools in creating health policies. As we 
develop databases that can identify patients 
with specific diseases, with risks of compli- 
cations, or in need of specific interventions 
such as screening tests or immunizations (see 
Chap. 18), population management can be
used to provide a form of decision support for 
groups of patients. Some clinical decision sup- 
port is also aimed directly at patients, in terms 
of alerts, reminders, or aids to interpretation 
of information, especially given increasing use 
of smartphone apps and connected sensors 
and devices; techniques for assessing prog- 
nosis and risk of alternative strategies should 
involve shared decision making between pro- 
viders and patients, which is also an impor- 
tant area of activity. 

In this chapter, we focus on decision aids 
for the provider in particular—the clinician 
seeing the patient at the point of care. The 
requirements for excellent decision-making 
fall into three principal categories: (1) accu- 
rate data, (2) pertinent knowledge, and (3) 
appropriate problem-solving, or clinical rea- 
soning, skills. 

The data about a patient must be adequate (both accurate and
sufficiently comprehensive to include everything relevant for
making an informed decision), but they must not be excessive (see
Chap. 4). Indeed, a major challenge occurs when decision-makers
are bombarded with so much information that they cannot process
and synthesize the information intelligently and rapidly (see, for
example, Chap. 21). Thus, it is important to know when additional
data will confuse rather than clarify and when it is imperative
to use tools (computational, visual, or otherwise) that permit
data to be summarized for easier cognitive management (see
Chap. 4). Operating rooms and intensive-care units are classic
settings in which this problem arises; patients are monitored
extensively, numerous data are collected, and decisions often have
to be made on an urgent basis.

Access to data from all relevant sources 
is required but difficult to achieve in prac- 
tice. Patients may be seen in different venues, 
including a primary care office (perhaps via 
a telecare visit), a specialist office, an emer- 
gency room, a laboratory or imaging facility, 
a hospital, an extended care facility, or they 
may be monitoring themselves at home. Each 
venue may be using different and incompat- 
ible EHRs or data repositories, terminolo- 
gies, and data representation models. Access 
to these data and interoperability across data 
repositories may be limited. Typically, the data available are
only those within a health system and its EHR (which may include
affiliated office practices, clinics, emergency rooms, and
hospitals) or those obtained through an available health
information exchange (HIE) (see Chap. 15).

Equally important is the quality of the 
available data to input to a CDSS. In Chap.
2, we discussed imprecision in terminology, 
illegibility and inaccessibility of records, and 
other opportunities for misinterpretation of 
data. Similarly, measurement instruments or 
recorded data may simply be erroneous; use 
of faulty data can have serious adverse effects 
on patient-care decisions. Thus, clinical data 
often need to be validated. 

Even good data are useless if we do not 
have the knowledge necessary to apply them 
properly. Decision-makers must have broad 
knowledge of medicine, in-depth familiarity 
with their area of expertise, and access to per- 
tinent additional information resources. Their 
knowledge must be accurate, with areas of 
controversy well understood and questions of 
personal choice well distinguished from those 
where a more prescriptive approach is appro- 
priate. Their knowledge must also be current; 
in the rapidly changing world of medicine, 
facts decay just as certainly as dead tissue 
does. 

Good data and an extensive factual knowl- 
edge base still do not guarantee a good deci- 
sion; good problem-solving skills are equally 
important. Decision-makers must know how 
to set appropriate goals for a task, how to rea- 
son about each goal, and how to make explicit 
the trade-offs between costs and benefits of
diagnostic procedures or therapeutic maneu- 
vers. Skilled clinicians draw extensively on 
personal experience, and new physicians soon 
realize that good clinical judgment is based as 
much on the ability to reason effectively
and appropriately about what to do as it is 
on formal knowledge of the field or access 
to high-quality patient data. Thus, clinicians 
must develop a strategic approach to the selec- 
tion and interpretation of diagnostic tests, 
understand ideas of sensitivity and specificity, 
and be able to assess the urgency of a situa- 
tion. Similar issues relating to test or treat- 
ment selection, in terms of costs, risks, and 
benefits, must be understood. Awareness of
biases and of the ways that they can creep into
problem-solving is also crucial (see Chap.
3). Also, as noted above, patient preferences
and concerns must be adequately addressed 
as part of the decision-making process. Thus, 
good communication and interaction with 
the patient is essential. This brief review of 
issues central to clinical decision-making 
serves as a fitting introduction to the topic of 
computer-assisted decision-making: Precisely 
the same topics are pertinent when we develop 
a computational tool for CDS. The program 
must have access to good data, it must have 
extensive background knowledge encoded for 
the clinical domain in question, and it must 
embody an intelligent approach to problem- 
solving that is sensitive to requirements for 
proper analysis, appropriate cost-benefit 
trade-offs, and efficiency. 
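
The ideas of sensitivity and specificity mentioned above can be made concrete with a small calculation from a two-by-two table of test results. This is a minimal sketch with invented counts; the function name `sensitivity_specificity` and the numbers are hypothetical and do not describe any real diagnostic test.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute (sensitivity, specificity) from two-by-two table counts.

    tp: diseased patients with a positive test (true positives)
    fn: diseased patients with a negative test (false negatives)
    tn: healthy patients with a negative test (true negatives)
    fp: healthy patients with a positive test (false positives)
    """
    sensitivity = tp / (tp + fn)  # fraction of diseased patients detected
    specificity = tn / (tn + fp)  # fraction of healthy patients correctly cleared
    return sensitivity, specificity


# Invented counts: 100 patients with the disease, 200 without.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=160, fp=40)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

With these invented counts the test detects 90 of 100 diseased patients (sensitivity 0.90) and correctly clears 160 of 200 healthy patients (specificity 0.80).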


24.2 Motivation for 
Computer-Based CDS 


Since the 1960s, workers in biomedical infor- 
matics have been interested in CDS systems 
because of a desire both to improve health 
care and to understand better the process of 
medical decision-making. Building a com- 
puter system that attempts to process clini- 
cal data to offer situation-specific advice can 
provide insight into the nature of medical 
problem solving and can enable the creation
of formal models of clinical reasoning. At
the same time, construction of such systems 
offers obvious societal benefits if the com- 
puter programs can aid practitioners in their 
care of patients and can lead to better clinical 
outcomes. Although the more academic con- 
siderations have provided strong motivation 
for work in the area of computer-based deci- 
sion aids over several decades, the recognition 
of the importance of CDSSs as practical tools 
has increased markedly in recent years as a 
result of the inexorable growth in health-care 
complexity and cost, as well as the introduc- 
tion of new health-care legislation, regulatory 
initiatives, and payment incentives aimed at 
addressing these trends—which have all made 
the development and broad adoption of CDS 
technology a priority. 

The twenty-first century has seen changes
in health-care practices that make the devel- 
opment of CDS technology particularly 
necessary. Computer-based CDS has taken 
on increasing urgency for four reasons: (1) 
increasing challenges related to knowledge 
and information management in clinical prac- 
tice, thus increasing physician information 
needs, (2) the ubiquity of electronic medical 
records and the desire to enhance health care 
through the communication and integration 
of the relevant data, (3) the goal of deliver- 
ing increasingly personalized health-care ser- 
vices—tailored to the patient’s preferences for 
care and to his or her individual genome, and 
(4) the growing evidence that CDS can not 
only improve the quality of care delivered but 
also reduce costs of care. We consider these 
four reasons in the sections that follow. 

We note another factor that is shaping 
new directions for CDS which we will explore 
further at the end of this chapter—the trend 
toward use of personal devices and apps, as 
well as more portable, distributed laboratory 
or analytic procedures, to provide a growing 
range of sources of data to incorporate into 
decision making. These devices and apps also 
provide an ability to track, monitor, and inter- 
vene early whenever needed, both interacting 
with the user/patient directly, and also for those 
conditions warranting it, with the provider.


24.2.1 Physician Information
Needs and Clinical Data Management


Modern health care is characterized by an 
ever-expanding knowledge base of clinical 
medicine, and by a growing clinical data base 
describing every patient characteristic from 
phenotype to genotype (Kohn et al. 2002). 
Despite the growing amounts of data and 
knowledge with which physicians need to 
work, health-care practitioners have seen the 
average time for a clinical encounter steadily 
curtailed, particularly in the United States, 
where the pressures of the prevalent fee-for- 
service reimbursement system and a concomi- 
tant rise in the amount of paperwork required 
for administrative management and billing 
continue to squeeze practitioners (Baron 
2010). Studies of information needs among 
physicians in clinical practice have long 
revealed that unanswered clinical questions 
are common in ambulatory clinical encoun- 
ters, with as many as one or two unanswered 
clinical questions about diagnosis, therapy, 
or administrative issues arising in every visit 
(Covell et al. 1985). Prior to the broad adop- 
tion of EHRs, in as many as 81% of clini- 
cal encounters in ambulatory care, clinicians 
were found to be missing critical informa- 
tion, with an average of four missing items 
per case (Tang et al. 1994, 1996). Currently, 
even with an EHR, providers may face major 
challenges in accessing relevant information, 
acquiring a complete picture of the patient’s 
clinical state and history, and knowing what 
further testing or therapeutic actions are best 
to take. Prior to broad EHR adoption, stud- 
ies suggested that as many as 18% of medical 
errors might be due to inadequate availability 
of patient information (Leape 1994). Today, 
conversely, the overabundance of information 
in the EHR can lead to ‘information chaos’ 
and may make it difficult for the clinician to 
find relevant information for clinical decision 
making (Melnick et al. 2019; Beasley et al. 
2011). The demands for increased information
management, in the setting of an ever-expanding
clinical knowledge base, and now
as more data sources are available and use
of EHRs is almost ubiquitous, are primary
drivers for the adoption of CDS systems. (See
Chap. 23 for a deeper discussion of physician
information needs.)


24.2.2 EHR Adoption 
and Integration of CDS 


The motivations for adoption of EHRs and 
CDS are affected by the way in which health 
care is financed and paid for, by the structure 
and organization of the health-care system of 
a nation or a region, and by political forces that 
can create constraints, regulations, and incen- 
tives. In this section, we use the United States 
as an example to demonstrate these influences. 

Health-care safety and quality concerns, 
coupled with a seemingly inexorable rise in 
health-care costs, have led in the United States 
to a variety of cost-containment and quality- 
improvement strategies in recent years. 
Health-care delivery in the United States is 
in the midst of a profound transformation, 
in part due to Federal public policy efforts 
to encourage the adoption and use of health 
information technology (HIT). The American 
Recovery and Reinvestment Act (ARRA) of 
2009, and the HITECH regulations within 
it, created incentives for the widespread 
adoption of health information technologies 
(Blumenthal 2009; see Chap. 29). These
public policy efforts, while ultimately suffer- 
ing some reversals for political reasons, are 
often viewed as a long-term adjunct to current 
health-care-payment reform efforts in the 
U.S., and a prelude to additional health-care- 
delivery redesign, payment reform, and cost 
containment. As recently as 2012, only 34.8% 
of physicians in ambulatory practice in the 
US used a basic or comprehensive electronic 
medical record (Decker et al. 2012), and only 
26.6% of U.S. hospitals used health informa- 
tion technologies in inpatient care-delivery set- 
tings (DesRoches et al. 2012), although these 
numbers have had a rapid upward trajectory.
The ARRA and HITECH policies, and the
resulting technology adoption, changed the 
practice of medicine and clinical care delivery 
in both beneficial and untoward ways (Sittig 
and Singh 2011). 

To achieve meaningful and effective use of 
HIT, the software must be viewed as one com- 
ponent of a complex sociotechnical system, 
in which all elements must work effectively 
(Institute of Medicine 2011a). The movement
toward value-based reimbursement, a more
recent U.S. trend that is gaining in acceptance,
encourages emphasis on wellness, prevention,
and early intervention in disease processes,
and requires a new level of connectivity,
continuity, and coordination of care across
multiple venues. A prominent example is the
advent of
Accountable Care Organizations (ACOs; 
McClellan 2015), which require emphasis on 
not only integration of data from all sources 
(clinical and financial) but also aggregate, 
population-based data on outcomes, costs, 
and identification of outliers. These alterna- 
tive payment models are distributing upside 
and downside financial risk based upon qual- 
ity outcomes, patient experience, and costs, 
and thus call for CDSSs to support both pay- 
ers and providers. 

One of the principal motivations for EHR 
adoption is to provide an infrastructure with 
which to improve the quality, safety, and 
efficacy of health-care delivery. In the past 
decade, the U.S. government placed consider- 
able emphasis on the adoption of quality mea- 
sures and quality-reporting requirements as 
part of meaningful use of HIT (Clancy et al. 
2009; Institute of Medicine 2011a). Quality
measures, despite their ability to provide feed- 
back that stimulates improved performance 
by the clinician, are only part of the process 
needed to make the desired improvements. 
Prospective, proactive clinical decision sup- 
port must also be in place. The U.S. govern- 
ment’s rules for meaningful use of HIT, which 
became progressively more demanding over a 
4-6 year period, required only minimal CDS
compliance in Phases I and II, but Phase III
of the meaningful use regulations in 2016 was
intended to increase the mandate for CDS in
EHR systems substantially by requiring APIs
(application programming interfaces) for
access to EHR data (Blumenthal and Tavenner
2010; Adler-Milstein et al. 2017). More recently, the
Medicare Access and CHIP Reauthorization
Act of 2015 (MACRA),¹ the Twenty-first
Century Cures Act of 2016 in the United
States,² and other initiatives have promoted
a wide array of additional enhancements and 
improvements to the use of health information
technology in clinical practice. Notably,
the Twenty-first Century Cures Act removed
software devoted to performing the functions
of an EHR, serving as an administrative tool,
or providing clinical decision support from
consideration by the U.S. Food and Drug
Administration (FDA) as a medical device.
Significant pushback against the burden placed
on EHR vendors and health-care organizations
to comply with Meaningful Use Phase III
resulted in relaxation of these constraints.
Meaningful Use regulations have been
superseded in the United States by the Cures
Act, promoting more sweeping goals for
interoperability and access to quality measures
(Sinsky and Privitera 2018; see Chap. 29).
More recently,
political considerations regarding health-care 
financing in the United States have made the 
speed of adoption of such measures some- 
what uncertain. 

From a more worldwide perspective, 
interoperability of data and of CDS knowledge
models is essential for wide adoption,
as well as for improving the evidence base 
from which knowledge is derived. 


1 MACRA (2015). The Medicare Access and CHIP
Reauthorization Act of 2015 (MACRA). Retrieval
February 19, 2020: https://www.cms.gov/medicare/quality-initiatives-patient-assessment-instruments/value-based-programs/macra-mips-and-apms/macra-mips-and-apms.html

2 Twenty-First Century Cures Act. H.R. 34, 114th
Congress (2016). Retrieval February 19, 2020:
https://www.gpo.gov/fdsys/pkg/BILLS-114hr34enr/pdf/BILLS-114hr34enr.pdf


24.2.3 Precision Medicine


The fundamental model for the practice of
medicine has undergone dramatic change in
the past century or so. The objectives of clinical
care have shifted radically from the archaic
goal of correcting putative imbalances of 
bodily humors to the scientific understand- 
ing of pathophysiology, of mechanisms for 
eliminating pathogens, and of remedying 
biological aberrancies. The resulting view 
of medicine as the application of biologi- 
cal principles was at the core of the report 
produced by Abraham Flexner (1910) that 
upended medical education in the early
twentieth century and that led to the
reductionist biomedical model that prevailed for the
rest of that century. More recently, however, 
George Engel’s biopsychosocial model (Engel 
1977) brought to the fore of clinical care the 
need to address psychological and social fac- 
tors in clinical treatment plans in addition to 
underlying biomedical problems. By the end 
of the twentieth century, it became increas- 
ingly accepted that CDS requires not only the 
rapid communication of appropriate scientific 
medical knowledge, but also the adaptation 
of that knowledge to reflect the psychologi- 
cal and social situation that would temper the 
application of the knowledge. Added to this 
complexity is the aging of the population, 
owing in part to advances in health and health 
care. The result has been a much higher bur- 
den of chronic diseases, multiple diseases, and 
multiple testing and treatment options, with 
both their positive and negative consequences 
that must be balanced—all contributing to the 
increasing intricacy of care. Indeed, the view 
of care is so complex for some patients that 
a major role of CDS is to integrate models 
of patient state and context (provider, set- 
ting, specialization, and activity) to provide 
selective visualization, analysis, and decision- 
making support for optimal care management 
(Greenes et al. 2018). 

As a further extension of these trends, the 
genomic era in which we now live has fur- 
ther increased the need for clinical practice 
to reflect precision medicine and the need to 
tailor care to individual factors in ways that 
never before were imaginable (Ginsburg and 
Willard 2009). Precision medicine is charac- 
terized by decision making that may take into 
account patient personal history, family his- 
tory, social and environmental factors, along 
with genomic data and patient preferences
regarding their own care (see Chap. 26).
In this approach, clinical decision making is 
explicitly patient-centered in new ways, bring- 
ing the best evidence at the genetic level to 
bear on many clinical scenarios, while incor- 
porating patient preferences for acquiring and 
applying genetic information (Fargher et al. 
2007; Marcial et al. 2018). The increasing use 
of genomics in medicine (Chan and Ginsburg 
2011) is generating data that outstrip the 
information and knowledge-processing capa- 
bilities of practitioners, and many clinicians 
feel threatened by the impending tsunami 
of additional knowledge that they will need 
to master (Baars et al. 2005). As precision 
medicine becomes the norm, primary-care 
and specialist practitioners alike will need to 
manage their patients by interpreting genomic 
tests along with myriad other data at the point 
of care. It is hard to imagine how clinicians 
will manage to perform such activities with- 
out substantial computer-based assistance. 
Informatics is well suited to support a person- 
alized approach to clinical genomics (Ullman- 
Cullere and Mathew 2011). 

As mentioned earlier, another related 
change is growing recognition of the impor- 
tance of promoting optimal health and 
wellness, not just by treating disease but by 
encouraging healthy lifestyles, fostering com- 
pliance with health and health-care regimens, 
and carrying out periodic health-risk assess- 
ments. Key to prescriptive medicine are tools 
to support prospective medicine (Langheier 
and Snyderman 2004)—assisting the acqui- 
sition of a detailed family history, social 
history, and environmental history, using per- 
sonal apps and sensors, providing health-risk 
assessments, and managing genomic informa- 
tion (Hoffman and Williams 2011; Overby 
et al. 2010). 


24.2.4 Savings Potential 
with Health IT and CDS 


CDS has been shown to influence physician 
behavior (Colombet et al. 2004; Lindgren 
2008; Schedlbauer et al. 2009), diagnostic 
test ordering and other care processes (Bates 
and Gawande 2003; Blumenthal and Glaser
2007), and the costs of care (Haynes et al.
2010), and it may have a modest impact on 
clinical outcomes (Bright et al. 2012). While 
there is enormous promise for HIT and CDS, 
their implementation is not without potential 
peril: HIT poorly designed or implemented, 
or misused, can generate unintended conse- 
quences (Ash et al. 2007; Harrison et al. 2007; 
Bloomrosen et al. 2011), and introduce new 
types of medical errors (Institute of Medicine
2011a).

Only a handful of studies have examined 
the return on investment (ROI) for HIT, and 
even fewer have investigated ROI for decision- 
support specifically. The value of CDS in 
terms of ROI is difficult to measure. Isolated 
studies of various hand-crafted systems in 
academic centers have shown value (Wang 
et al. 2003; Kaushal et al. 2006), but adoption 
elsewhere has often been problematic. Broad 
adoption has not occurred, for many reasons 
discussed later in this chapter, including the 
proprietary nature of systems for CDS and 
for representation of knowledge, the lack of 
interoperability of data and knowledge, the 
mismatch of CDS to workflow, and usability 
concerns. 

Systematic reviews of the scientific lit- 
erature, such as the one performed by Bright 
et al. (2012), have not been able to demon- 
strate an effect of CDS on patient outcomes 
except in the short term. This finding is not 
surprising, however, because the time point at 
which CDS occurs is often long before a final 
outcome, and many intervening factors may 
have a greater effect. In the case of CDS for 
many chronic diseases whose complications 
ensue over years or decades, it simply may 
be impractical to continue longitudinal stud- 
ies of sufficiently long duration to be able to 
measure meaningful differences in outcome. 
Nevertheless, economic simulation studies of 
the potential effect of CDS on chronic dis- 
eases have demonstrated benefit in the long 
term (McClellan 2015; Bu et al. 2007; Adler- 
Milstein et al. 2007). 

Historically, the adoption of CDS technol- 
ogy has been motivated by a virtuous desire to 
enhance the performance of clinicians when 
dealing with complex situations. The recent
advent of legal, regulatory, and financial
drivers, as well as the increasing importance of
personalizing medical decision making on
the basis of genomic data, now makes CDS an
essential element of modern clinical practice. 


24.3 Methods of CDS 


As we have already noted, CDS systems 
(1) may use information about the current 
clinical context to retrieve pertinent online 
documents; or (2) they may provide patient- 
specific, situation-specific alerts, reminders, 
physician order sets, or other recommen- 
dations for direct action; or (3) they may 
organize information in ways that facilitate 
decision making and action. Category (2) 
largely consists of the various computer- 
based approaches (“classic” CDS based on 
the application of intelligent systems) that 
have been the substrate for work in informat- 
ics since the advent of applied work in proba- 
bilistic reasoning and artificial intelligence in 
the 1960s and 1970s. Such systems provide 
custom-tailored assessments or advice based 
on sets of patient-specific data. They may fol- 
low simple logics (such as algorithms), they 
may be based on decision theory and cost- 
benefit analysis, or they may use probabilistic 
approaches or derive their conclusions on the 
basis of machine learning from large amounts 
of data. Some diagnostic assistants (such as 
DXplain; Barnett et al. 1987) suggest differ- 
ential diagnoses or indicate additional infor- 
mation that would help to narrow the range 
of etiologic possibilities. Other systems sug- 
gest a single best explanation for a patient’s 
symptomatology. Other systems interpret 
and summarize the patient’s record over time 
in a manner sensitive to the clinical context 
(Shahar and Musen 1996). Still other systems 
provide therapy advice rather than diagnostic 
assistance (Musen et al. 1996). 

CDS systems can achieve their results
using a wide variety of computational methods.
These approaches include Bayesian
probabilistic reasoning (see Chap. 3), the
use of machine learning to make predictions
based on large amounts of data (often
from the EHR), inference using
IF-THEN rules, the identification of
relevant templates for the clinician to fill in,
such as knowledge-based groupings of physician
orders (order sets), or some combination
of these approaches (James et al. 2013).
CDS systems may acquire the data on which 
they base their recommendations interactively 
from users or directly from a health informa- 
tion system (or some combination of these 
approaches). We now discuss the issues that 
drive CDS system design, and we highlight 
how these issues are manifest in current clini- 
cal decision aids. 
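
As a concrete illustration of the Bayesian approach mentioned above, the following minimal Python sketch updates a pretest probability of disease in light of a single test result. The function name and the numeric values (a 10% pretest probability, 90% sensitivity, 80% specificity) are assumptions chosen purely for illustration; they are not drawn from any particular CDS system.

```python
def post_test_probability(prior: float, sensitivity: float,
                          specificity: float, positive: bool = True) -> float:
    """Apply Bayes' theorem to one test result (illustrative sketch only)."""
    for value in (prior, sensitivity, specificity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    if positive:
        p_result_given_disease = sensitivity           # P(T+ | D)
        p_result_given_healthy = 1.0 - specificity     # P(T+ | not D)
    else:
        p_result_given_disease = 1.0 - sensitivity     # P(T- | D)
        p_result_given_healthy = specificity           # P(T- | not D)
    numerator = p_result_given_disease * prior
    denominator = numerator + p_result_given_healthy * (1.0 - prior)
    if denominator == 0.0:
        raise ValueError("this result cannot occur under the stated assumptions")
    return numerator / denominator


# Hypothetical values, not taken from any real test or population.
after_positive = post_test_probability(0.10, 0.90, 0.80, positive=True)
after_negative = post_test_probability(0.10, 0.90, 0.80, positive=False)
print(f"after a positive result: {after_positive:.3f}")
print(f"after a negative result: {after_negative:.3f}")
```

Under these assumed values, a positive result raises the probability of disease from 0.10 to about 0.33, whereas a negative result lowers it to about 0.014. A production system would also need to handle sequences of findings and their conditional dependencies, which this sketch ignores.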


24.3.1 Acquisition and Validation
of Patient Data


As mentioned in the introduction to this 
chapter, a prerequisite to any decision-making 
process is having available all the data that are 
needed to perform the required actions. As 
emphasized in Chap. 2, few problems are
more challenging than the development of 
effective techniques for capturing patient data 
accurately, completely, and efficiently. You 
have read in this book about a wide variety of 
techniques for data acquisition, ranging from 
keyboard entry, to speech input, to methods 
that separate the clinician from the computer 
(such as scannable forms, real-time data mon- 
itoring, and intermediaries who transcribe 
written or dictated data for use by computers). 

The problems of data acquisition go 
beyond entry or extraction from the EHR, 
or from other repositories, of the data them- 
selves, however. A primary obstacle is that 
we lack standardized ways of expressing 
most clinical situations in a form that com- 
puters can interpret. As discussed in detail 
in Chap. 7, there are several controlled
medical terminologies that health-care work- 
ers use to specify precise diagnostic evalua- 
tions (e.g., the International Classification of 
Diseases and SNOMED CT), clinical proce- 
dures (e.g., Current Procedural Terminology 
and LOINC codes), drugs (e.g., RxNorm),
and so on. Still, there is no controlled terminology
that can capture all the nuances of a
patient’s history of present illness or findings 
on physical examination. There is no cod- 
ing system that can reflect all the details of 
physicians’ or nurses’ progress notes. Given 
that much of the information in the medical 
record that we would like to use to drive deci- 
sion support is not available in a structured, 
machine-understandable form, there are clear 
limitations on the data that can be used to 
assist clinician decision-making. The prose of 
progress notes, consultation notes, procedure 
or operation reports, discharge summaries, 
and other documents contains an enormous 
amount of information that never makes it 
to the coded part of the EHR. Nevertheless, 
even when computer-based patient records 
store substantial information only as free- 
text entries, the data that are also available 
in coded form (typically, diagnosis codes and 
prescription data) can be used to significant 
advantage (van der Lei et al. 1991). 
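
The contrast between coded and free-text data can be made concrete with a small sketch. The Python fragment below is a hypothetical reminder check, not the method of any system cited here: it inspects only the coded portion of a simplified record. The record structure and function name are assumptions, and the two codes (an ICD-10-CM prefix for type 2 diabetes and a LOINC code for hemoglobin A1c) are shown for illustration and should be verified against current terminology releases.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class PatientRecord:
    """Simplified, hypothetical record: coded fields plus one free-text note."""
    diagnosis_codes: Set[str] = field(default_factory=set)  # e.g., ICD-10-CM
    lab_codes: Set[str] = field(default_factory=set)        # e.g., LOINC
    progress_note: str = ""                                  # never parsed below


DIABETES_PREFIX = "E11"   # assumed ICD-10-CM category for type 2 diabetes
HBA1C_LOINC = "4548-4"    # assumed LOINC code for hemoglobin A1c


def needs_hba1c_reminder(record: PatientRecord) -> bool:
    """Return True when a coded diabetes diagnosis has no coded HbA1c result."""
    has_diabetes = any(code.startswith(DIABETES_PREFIX)
                       for code in record.diagnosis_codes)
    has_hba1c = HBA1C_LOINC in record.lab_codes
    return has_diabetes and not has_hba1c


record = PatientRecord(
    diagnosis_codes={"E11.9"},
    progress_note="HbA1c 7.2% measured at outside clinic last month.",
)
print(needs_hba1c_reminder(record))
```

In this example the reminder fires even though the note mentions a recent result, because that fact exists only as prose. The sketch thus shows both the usefulness of coded data and the limitation that the surrounding text describes.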

The desire to access information from the 
EHR that may be available only in text has 
been a topic of great interest to the CDS 
community. Some information systems pro- 
vide options for structured data entry, ask- 
ing clinicians to use fill-in-the-blanks forms 
or templates on the computer screen to enter 
patient-related information that otherwise 
would be entered as part of a prose note. In 
general, providers have resisted such human- 
computer interfaces, often finding it restrictive 
and cumbersome to make selections from pre- 
defined menus when they would much rather 
express themselves more freely in prose. In 
fact, structured templates or methods for col- 
lecting information about a particular prob- 
lem or finding are themselves often regarded 
as a form of CDS, in that they provide an 
organized framework and a mnemonic func- 
tion. Fortunately, work in natural language 
processing has made major advances in recent 
years, making it increasingly possible to mine 
the textual notes of EHRs to identify infor- 
mation that might bear on the CDS process 
(see Chap. 8).


24.3.2 Decision-Support
Methodologies


When designing a CDS system, it is helpful 
to consider several aspects (Fig. 24.1). We
consider these aspects as components inter- 
acting with one another, because each requires 
thought and effort to accomplish, and can be 
facilitated by developing standard, interop- 
erable approaches for implementing them. 
Further, they can be independently enhanced 
over time, and are able to be shared and 
reused if engineered in a component-based 
manner. Much CDS can be accomplished by
direct implementation and embedding into a
clinical system, but the result is a set of one-off
implementations that cannot be readily
maintained, updated, or shared. This limitation
is of particular concern given that an
organization may have hundreds, if not
thousands, of CDSS artifacts in operation.


There are five aspects of CDS that, at least 
in principle, can be viewed as independent of 
each other, and that work together to produce 
CDS capability (Greenes 2014). These aspects 
include: (1) the method of computation or 
inferencing or, more generally, execution of 
the CDS function (e.g., targeted information 
retrieval, hard-coded algorithms, Bayesian 
estimation, neural-network classification, 
rule-logic evaluation); (2) the knowledge 
needed to carry out the function (e.g., prior 
and conditional probabilities, rule assertions, 
mathematical formulae, clinical guidelines); 
(3) the information model that governs how 
data are provided to the CDSS (i.e., the data
needed and the method of encoding, such as 
specific FHIR data from the clinical setting, 
laboratory-test results encoded in LOINC, 
environmental data, or medical facts in specific
coding schemes); (4) the type of recommendation
to be provided (e.g., prediction,
action, classification, relevant citations); and
(5) how the process interacts with the application
environment, including how it is invoked
(e.g., a process launched by an event monitor,
by the user explicitly, or by being embedded in
workflow) and how the data and recommendations
are communicated (e.g., provided via
a FHIR API or delivered as a popup alert). In
this section, we focus primarily on the methods
of computation required for CDS (Aspect
1 of Fig. 24.1), and to a lesser extent on
the knowledge needed during computation
(Aspect 2). We consider the other elements in
subsequent sections.

Fig. 24.1 A conceptual model of CDS components.
There are five aspects of CDS that work together to
produce CDS capability. The aspects are: (1) the method of
computation or inferencing; (2) the knowledge needed
to carry out the computation; (3) the information model
for the data that drive decision making; (4) the type of
recommendation to be provided; and (5) how the
process interacts with the application environment,
including how it is invoked and how the data and
recommendations are communicated. (Adapted from
Greenes 2014)
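
To suggest how the five aspects might be kept separate in software, the following Python sketch represents each one as a distinct field of a single component. This is an illustrative arrangement under our own assumptions, not an implementation described by Greenes (2014); the class, the field names, and the potassium threshold of 5.5 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Tuple


@dataclass(frozen=True)
class CDSComponent:
    """One CDS artifact with the five aspects held as separate parts."""
    execute: Callable[[Mapping[str, float], Mapping[str, float]], str]  # Aspect 1: method
    knowledge: Mapping[str, float]        # Aspect 2: knowledge used by the method
    required_inputs: Tuple[str, ...]      # Aspect 3: information model (inputs needed)
    result_type: str                      # Aspect 4: type of recommendation
    trigger: str                          # Aspect 5: how the process is invoked

    def run(self, patient_data: Mapping[str, float]) -> str:
        missing = [name for name in self.required_inputs
                   if name not in patient_data]
        if missing:
            raise KeyError(f"missing inputs: {missing}")
        return self.execute(self.knowledge, patient_data)


def threshold_rule(knowledge: Mapping[str, float],
                   data: Mapping[str, float]) -> str:
    """A deliberately simple method: compare one value with one limit."""
    if data["serum_potassium"] > knowledge["potassium_upper_limit"]:
        return "alert"
    return "no action"


# Hypothetical artifact; the limit is illustrative, not clinical guidance.
potassium_check = CDSComponent(
    execute=threshold_rule,
    knowledge={"potassium_upper_limit": 5.5},
    required_inputs=("serum_potassium",),
    result_type="action",
    trigger="new laboratory result",
)
print(potassium_check.run({"serum_potassium": 6.1}))
```

Because the knowledge is held apart from the method, the limit could be revised, or the same method reused with different knowledge, without touching the code that performs the comparison. That separation is the practical point of the component-based view.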

Aspect 1 was central to much of the early 
work on CDS development, beginning in the 
1960s and largely led by academic centers, 
which focused on developing and exploring 
different computational models of CDS in 
the absence of any health IT infrastructure in 
which to embed the systems. In recent years, 
much of CDS has been built by vendors of 
proprietary systems, giving rise to a range of 
different approaches, often embedded in those 
systems, and with methods that are less able 
to be inspected, formalized, and shared. We
hope that some of the trends in health-care
delivery that we cited in Sect. 24.2 will give rise
to increased interoperability and sharing in
vendor-developed systems. Even if current
commercial products may not manifest the
different components of Fig. 24.1 distinctly,
we adopt this conceptual focus in our
discussion of various strategies for CDS to 
clarify the underlying principles. 


24.3.2.1 Context-Specific
Information Retrieval

Early developers of CDS systems argued that
decision support entails more than identify- 
ing relevant information that can help a clini- 
cian to solve a problem; it was believed that 
a CDS system has to suggest specifically how 
the problem should be solved. Thus, simply 
providing information for the user to read and 
digest was not considered “real” decision support.
Nevertheless, as information-retrieval
systems have improved over the years, with
better performance characteristics, it is
often hard to maintain that such systems do
not offer decision support in a genuine sense.
Indeed, such systems are now ubiquitous in
health care. (Chap. 25 provides a comprehensive
discussion of information-retrieval
methods.)

The simplest, and perhaps most com- 
mon, form of CDS uses contextual informa- 
tion from an EHR to perform information 
retrieval from a database of information 
about online documents. A person view- 
ing data in an EHR may see selectable icons 
(infobuttons) next to the names of drugs, 
laboratory tests, patient problems, or other 
elements of the patient record, or the items 
themselves may be hyperlinked to an informa- 
tion retrieval engine. Clicking on an infobut- 
ton causes the clinical information system to 
perform a query on the database, providing 
the user with one or more immediately acces- 
sible resources that can offer more informa- 
tion about the item in question. Alternatively, 
the system may automatically query one or 
more of those external resources and return 
the results of the queries for display (Cimino 
et al. 2002). Clicking on an infobutton next to 
a drug, for example, might allow the user to 
access information about customary dosing, 
side effects, or alternative medications (see 
Fig. 14.15 in Chap. 14). The query that
retrieves the links to the documents is tailored 
based on whatever is next to the infobutton 
icon on the screen. The query may also take 
into account contextual information, such 
as patient-related data, the activity in which 
the user is engaged, and the role of the user 
in the health-care enterprise (physician, nurse, 
patient, and so on). 
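
The way in which context tailors an infobutton query can be sketched as the construction of a URL whose parameters carry the selected concept together with information about the task, the user, and the patient. The Python fragment below is a minimal sketch under several assumptions: the base URL is hypothetical, the concept code and code-system name are placeholders, and the parameter names are modeled on the style of the HL7 URL-based infobutton guide and should be checked against the standard before any use.

```python
from typing import Optional
from urllib.parse import urlencode


def build_infobutton_url(base_url: str, concept_code: str, code_system: str,
                         task: str, user_role: str,
                         patient_age_years: Optional[int] = None,
                         patient_gender: Optional[str] = None) -> str:
    """Assemble a context-aware query URL (parameter names are assumptions)."""
    params = {
        "mainSearchCriteria.v.c": concept_code,   # item next to the infobutton
        "mainSearchCriteria.v.cs": code_system,   # terminology of that item
        "taskContext.c.c": task,                  # activity the user is engaged in
        "informationRecipient": user_role,        # role of the user
    }
    # Patient-related context is optional and added only when known.
    if patient_age_years is not None:
        params["age.v.v"] = str(patient_age_years)
        params["age.v.u"] = "a"                   # unit: years
    if patient_gender is not None:
        params["patientPerson.administrativeGenderCode.c"] = patient_gender
    return f"{base_url}?{urlencode(params)}"


# Every value below is a placeholder chosen for illustration.
url = build_infobutton_url(
    "https://kb.example.org/infobutton",
    concept_code="DRUG-123",
    code_system="EXAMPLE-DRUG-CODES",
    task="MEDICATION-REVIEW",
    user_role="PROV",
    patient_age_years=67,
    patient_gender="F",
)
print(url)
```

The same clinical item thus yields different queries for a physician reviewing medications for an older patient and for a patient reading his or her own record, which is the sense in which the retrieval is context-specific.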

An infobutton manager mediates the que- 
ries between the clinical information sys- 
tem and the available information resources. 
The standards development organization 
Health Level Seven has created a standard 
for “context-aware knowledge retrieval,” lead- 
ing to infobutton managers that have been 
adopted by many commercial EHR vendors.³


3 HL7 International (2014). HL7 Version 3 Standard:
Context Aware Knowledge Retrieval Application
(“Infobutton”), Knowledge Request, Release 2.
Retrieval February 19, 2020: https://www.hl7.org/implement/standards/product_brief.cfm?product_id=208

Infobutton managers need to anticipate how
the clinical context might tailor the specific 
query performed by any given infobutton, so 
that the result of the query is highly precise 
and relevant to the situation at hand. Detailing 
specifically how contextual information might 
alter the queries performed by each infobutton 
type can be tedious, and the process requires 
developers to be adept at second-guessing all 
the reasons that might cause a user to click on 
a particular infobutton. Current research con- 
centrates on the development of a Librarian 
Infobutton Tailoring Environment (LITE; Jing 
et al. 2015) that promises to aid the authoring 
of infobutton queries via “wizards” and other 
user-interface conveniences. 

Although infobuttons are unquestion- 
ably important knowledge resources, many 
people would argue that they are not true 
CDS systems. Infobuttons retrieve relevant 
information for a user, but they do not explic- 
itly address particular decisions that the user 
needs to make. The possible reasons that a 
user might click on an infobutton are folded 
into the query specification at the time that 
the infobutton is created; at runtime, of 
course, there is no way for the system to know 
exactly why the user selected the infobut- 
ton. Infobutton managers therefore require 
sophisticated query capabilities, but they do 
not need to reason from a clinical situation to 
a particular recommendation. 

When the goal is to generate a situation- 
specific recommendation regarding diagnosis 
or therapy, developers need to turn to meth- 
ods that can perform some kind of inference. 
The sophistication of the required technique 
is a function of the kind of inference that is 
necessary to render a result for the user. 


24.3.2.2 Organizing or Grouping Information as a CDS Method


As we have noted earlier, a valuable kind of 
CDS is to organize information to provide a 
ready collection of items that need to be con- 
sidered together (e.g., order sets for particular 
clinical indications or settings, documentation 
templates for particular purposes, or structure
and formatting of reports). This capability
serves not only as a convenience to the user, 
but also as a mnemonic function (i.e., offering 
a check list; Gawande 2009). 

When patients are admitted to the hospital 
with a particular condition such as a suspected 
myocardial infarction or pneumonia, when 
hospital staff must prepare them for diagnos- 
tic procedures or surgery, or when they need 
to be transferred to another care team or to 
be discharged home, there often are stereo- 
typical groups of medical orders that physi- 
cians tend to request. For example, patients 
with possible pneumonia often require a chest 
x-ray examination, the recording of vital signs 
at certain intervals, the administration of sup- 
plementary oxygen, cultures of their sputum 
or blood, and administration of antibiotics. 
When patients are admitted to a hospital with 
the diagnosis of pneumonia, the EHR can 
automatically suggest to the treating physi- 
cians that such a set of orders be considered. 
Systems that produce such order sets can 
use the clinical situation to tailor the recom- 
mended orders (e.g., the computer may not 
recommend a chest x-ray examination if the 
patient has just had one; knowledge that the 
patient has been placed on an artificial venti- 
lator may trigger a separate set of associated 
orders for consideration). 
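
The tailoring just described can be sketched in a few lines. The sketch below is an assumption-laden illustration, not clinical guidance: the order names paraphrase the pneumonia example in the text, while the 24-hour window for a "recent" chest x-ray, the placeholder ventilator order, and the `PatientContext` fields are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Stereotypical orders for suspected pneumonia, paraphrased from the text.
PNEUMONIA_ORDER_SET = [
    "chest x-ray",
    "vital signs at scheduled intervals",
    "supplementary oxygen",
    "sputum culture",
    "blood culture",
    "antibiotics",
]
# Placeholder for a separate, ventilator-specific group of orders (hypothetical).
VENTILATOR_ORDERS = ["ventilator settings review"]


@dataclass
class PatientContext:
    hours_since_chest_xray: Optional[float]  # None if no film is on record
    on_ventilator: bool


def suggest_orders(ctx: PatientContext, recent_hours: float = 24.0) -> List[str]:
    """Start from the predefined set, then tailor it to the clinical situation."""
    orders = list(PNEUMONIA_ORDER_SET)
    recent = ctx.hours_since_chest_xray
    if recent is not None and recent < recent_hours:
        orders.remove("chest x-ray")          # patient has just had one
    if ctx.on_ventilator:
        orders.extend(VENTILATOR_ORDERS)      # triggers associated orders
    return orders
```

The result is only a suggestion for the treating physician to consider; nothing in the sketch places an order.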

Clinicians routinely face many other ste- 
reotypical tasks, such as describing the results 
of a diagnostic procedure such as an imag- 
ing study or reporting the sequence of events 
that took place during a surgical operation. 
Automated systems that recognize the clinical 
context can provide a tailored template for the 
clinician to fill in, helping to ensure that infor- 
mation is provided accurately and completely. 

Throughout their professional activi- 
ties, clinicians constantly must keep in mind 
large amounts of information when treating 
patients with even routine medical problems 
and reporting the results of their work. The 
use of simple mnemonics such as check lists 
can be remarkably effective in ensuring that 
care givers remember everything that they 
need to do or to report in a given context 
(Algaze et al. 2016; Alamri et al. 2016; Pageler 
et al. 2014). Transforming such check lists
into groups of orders to consider, groups of 
features to note in a diagnostic evaluation, or 
groups of steps that may have been followed 
when performing a surgical procedure can 
help clinicians to remember important details 
and to improve both the quality of medical 
care and what clinicians may report about it. 
Computationally, systems that offer order 
sets or templates to clinicians typically assem- 
ble predefined information from collections 
of text strings or database entries. The HL7 
organization has issued a standard for creat- 
ing libraries of order sets. Systems that sug- 
gest specific order sets to users by making 
selections from such a library are generally 
embedded within the EHRs that call on their 
services, however. Thus, the specific methods 
that these systems use tend to be proprietary. 
Researchers have experimented with the 
use of machine-learning techniques to create 
order sets on the fly based on empirical data. 
For example, Wang et al. (2018) have built a 
system that infers groups of orders for par- 
ticular situations by examining a data ware- 
house for the individual orders that physicians 
have administered historically. The system 
then suggests that the groups of orders that it 
discerns from the database may be reasonable 
for clinicians to administer as an ensemble. 
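
To give a feel for how groups of orders might be discerned from historical data, the sketch below counts how often pairs of orders appear together in past encounters. This is a deliberately simple stand-in, not the method of Wang et al. (2018); the function name, the `min_support` threshold, and the representation of an encounter as a list of order names are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_order_pairs(encounters, min_support):
    """Return pairs of orders that co-occur in at least `min_support`
    (a fraction between 0 and 1) of the historical encounters."""
    pair_counts = Counter()
    for orders in encounters:
        # Count each unordered pair once per encounter.
        for pair in combinations(sorted(set(orders)), 2):
            pair_counts[pair] += 1
    n = len(encounters)
    return {
        pair: count / n
        for pair, count in pair_counts.items()
        if count / n >= min_support
    }
```

Pairs that clear the threshold are candidates for inclusion in the same order set; a practical system would need to look beyond pairs and to account for the clinical situation in which the orders were written.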


24.3.2.3 Hardcoding Clinical Algorithms

From a computational perspective, there is 
nothing simpler than encoding an algorithm 
directly in a computer program. In health care, 
there is generally nothing simpler than defin- 
ing a decision process in terms of a flowchart. 
Numerous CDS systems thus have taken 
problem-specific flowcharts designed by clini- 
cians and encoded them for use by a computer. 
Although such flowcharts have been useful for 
the purpose of triaging patients in urgent-care 
situations and as a didactic technique used in 
journals and books where an overview for a 
problem’s management has been appropriate, 
computable interactive flowcharts have been 
largely rejected by physicians as too simplistic 
or generic for routine use (Grimm et al. 1975), 
other than for particular kinds of computa-
tions such as drug dose calculations (e.g., for
pediatric patients). In addition, the advantage 
of their implementation on computers has not 
been clear; the use of simple printed copies of 
the algorithms generally has proved adequate 
for clinical care (Komaroff et al. 1974). A 
noteworthy exception that gained enormous 
attention in the early 1970s was a computer 
program deployed in Boston at what was then 
the Beth Israel Hospital (Bleich 1972); it used 
detailed algorithmic logic to provide advice 
regarding the diagnosis and management of 
acid-base and electrolyte disorders. More
recently, such branching-logic and other infer-
ence approaches have been widely
adopted in the administrative information 
systems that third-party payers use to process 
requests to pre-certify payment for expensive 
services such as MRI studies and elective sur- 
gery (the HL7 Da Vinci Project is a notable 
body of work in this area). 

Although representing clinical algorithms 
simply as computer code offers a very direct 
approach to implementing a CDSS, there 
are obvious challenges that occur when it 
becomes necessary to refine or update the pro- 
gram’s behavior. Every modification requires 
reprogramming the system. It may not always 
be obvious how to reprogram the system to 
render the desired behavior, and making the
necessary changes in one part of the pro- 
gram may have unintended consequences 
when other parts of the program execute. 
Thus, although it may seem appealing simply 
to hardcode clinical algorithms, developers 
of decision support systems generally seek 
more flexible mechanisms to represent clinical 
knowledge and to reason about it. We discuss 
this matter further in Sect. 24.4.5, and we
address future research directions in this area 
in the Conclusion. 
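
The following sketch shows what hardcoding looks like in practice. It is a drastically simplified flowchart for classifying an arterial blood gas, written only to illustrate the approach: it is not the logic of the Beth Israel program, it ignores compensation and mixed disorders, and it is not intended for clinical use. The function name is hypothetical, and the cutoffs are the customary textbook reference limits (pH 7.35 to 7.45, PaCO2 35 to 45 mmHg, bicarbonate 22 to 26 mEq/L).

```python
def classify_acid_base(ph, paco2_mmhg, hco3_meq_l):
    """Walk a fixed flowchart; every threshold is written into the code."""
    if ph < 7.35:                      # acidemia branch
        if paco2_mmhg > 45:
            return "respiratory acidosis"
        if hco3_meq_l < 22:
            return "metabolic acidosis"
        return "acidemia, cause undetermined"
    if ph > 7.45:                      # alkalemia branch
        if paco2_mmhg < 35:
            return "respiratory alkalosis"
        if hco3_meq_l > 26:
            return "metabolic alkalosis"
        return "alkalemia, cause undetermined"
    return "pH within reference range"
```

The knowledge here is inseparable from the control flow. Changing a cutoff, adding a check for compensation, or reordering the tests all require editing and retesting the program itself, which is the maintenance burden described above.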


24.3.2.4 Learning from Data 

Considerable flexibility is achieved when the 
CDSS is largely data-driven. The advent of 
extraordinarily fast computers and the abil- 
ity to process enormous amounts of data has 
led to an explosion of interest in the use of 
large datasets to learn patterns in the data
to support clinical decision making (James 
et al. 2013). The success of such data-driven 
methods has led to considerable excitement 
about the use of “big data” in health care. It is 
important to appreciate, however, that work- 
ers in biomedical informatics benefited from 
such approaches long before computers could 
manage the enormous datasets that they pro- 
cess today. 


Probabilistic Systems 


Attempts to drive computer-based decision 
support from relationships inferred directly 
from data began during the earliest days of 
research in biomedical informatics. A seminal 
article in 1959 first introduced Bayes’ theorem
and also value theory (later known as util- 
ity theory) into health-care decision making 
(Ledley and Lusted 1959). In the 1960s, work- 
ers in the field recognized that they could use 
computers to apply Bayes’ rule to determine 
the posterior probability of diseases based on 
observations of patient-specific parameters 
(see Chap. 3). Such calculations were based
on the determination of appropriate probabi- 
listic relationships between findings and dis- 
eases by analyzing available datasets. Large 
numbers of Bayesian diagnosis programs have 
been developed in the intervening years, many 
of which have been shown to be accurate in 
selecting among competing explanations of a 
patient’s disease state. 

Among the most significant of the early 
experiments were those of F. T. de Dombal 
and his associates (1972) in England, who 
focused on the diagnosis of acute abdomi- 
nal pain. De Dombal’s group used a naive 
Bayesian model that assumed that there are 
no conditional dependencies among findings 
(i.e., a model that makes the inappropriate 
but convenient assumption that the presence 
of a finding such as upper abdominal pain 
never affects the likelihood of the presence 
of a finding such as lower abdominal pain). 
Using surgical or pathologic diagnoses as the 
gold standard, de Dombal’s group used sensi- 
tivity, specificity, and disease-prevalence data 
for various signs, symptoms, and test results 
to calculate, using Bayes’ theorem, the proba-
bility of seven possible explanations for acute
abdominal pain (appendicitis, diverticulitis, 
perforated ulcer, cholecystitis, small-bowel 
obstruction, pancreatitis, and nonspecific 
abdominal pain). To keep the Bayesian com- 
putations manageable, the program made the 
“naive” Bayesian assumptions of (1) condi- 
tional independence of the findings for the 
various diagnoses, (2) mutual exclusivity, and 
(3) exhaustiveness of the seven diagnoses (see 
Chap. 3).
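
Under these three assumptions, the posterior probability of each diagnosis takes a simple closed form. The notation below is ours, introduced only for illustration: D_1, ..., D_7 denote the seven diagnoses and f_1, ..., f_n the observed findings.

```latex
P(D_i \mid f_1, \ldots, f_n)
  = \frac{P(D_i)\,\prod_{j=1}^{n} P(f_j \mid D_i)}
         {\sum_{k=1}^{7} P(D_k)\,\prod_{j=1}^{n} P(f_j \mid D_k)}
```

Conditional independence is what allows the likelihood of all the findings to be written as a product of individual terms; mutual exclusivity and exhaustiveness are what allow the denominator to be a sum over exactly the seven diagnoses.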

In one system evaluation (de Dombal 
et al. 1972), physicians filled out data sheets 
summarizing clinical and laboratory findings 
for 304 patients who came to the emergency 
department with abdominal pain of sudden 
onset. The data from these sheets provided 
the attributes that were analyzed using Bayes’ 
rule. Thus, the Bayesian formulation assumed 
that each patient had one of the seven condi- 
tions and it selected the most likely one on the 
basis of the recorded observations. 

In contrast to the clinicians’ diagnoses, 
which were correct in only 65-80% of the 304 
cases (with accuracy depending on the indi- 
vidual clinician’s training and experience), the 
program’s diagnoses were correct in 92% of 
cases. Furthermore, in six of the seven disease 
categories, the computer was more likely to 
assign the patients to the correct disease cat- 
egory than was the senior clinician in charge 
of the case. 

De Dombal’s system began to achieve 
widespread use—from emergency depart- 
ments in other countries to the British subma- 
rine fleet. Surprisingly, however, the system 
never obtained the same degree of diagnostic 
accuracy in other settings that it did where 
it had initially been deployed—even when 
adjustments were made for differences in prior 
probabilities of disease. There are several
possible reasons for this discrepancy, which are
relevant for all Bayesian CDS systems. The 
most likely explanation is that there may be 
considerable variation in the way that clini- 
cians interpret the data that must be entered 
into the computer. For example, physicians 
with different training or from different cul- 
tures may not agree on the criteria for identi- 
fication of certain patient findings on physical 
examination, such as “rebound tenderness.”⁴
Another possible explanation is that there are 
different probabilistic relationships between 
findings and diagnoses in different patient 
populations. 

4 Rebound tenderness is pain that is exacerbated when
the physician presses down on the abdomen and
then suddenly releases, generating a “rebound”
when the abdomen returns to its baseline position.


Although a naive Bayesian model may
have limitations in accurately modeling a
diagnostic problem, a major strength of this
approach is computational efficiency. When
the findings that bear on a hypothesis are
assumed to be conditionally independent,
then the order in which the findings are con-
sidered in the Bayesian analysis does not
matter. The computer starts by considering
a given finding, the prior probability of each
possible diagnosis under consideration (gen-
erally the prevalence of each diagnosis in the
population), and the conditional probabilities
of the finding (or the absence of the find-
ing) given each diagnosis (or the absence of
the diagnosis), that is, the sensitivity and specific-
ity of the finding (see the discussion of these
concepts in Chap. 2). The computer then
applies Bayes’ rule to calculate the posterior
probability of each diagnosis given the value
of the finding. The computer now is poised to
update the probability of each diagnosis given
the value of a second finding. The prior prob-
ability for each diagnosis in this case is not
the prevalence of the diagnosis in the popu-
lation, however. Having applied Bayes’ rule
once, we have more information than we had
at the start. We can treat the posterior prob-
ability of each diagnosis given the first finding
as the prior probability of the diagnosis when
we apply Bayes’ rule a second time. When it is
time to consider a third finding, the posterior
probability for each diagnosis after processing
the second finding serves as the prior prob-
ability for the next application of Bayes’ rule.
The process continues until the value of each
finding has been considered. This sequential
Bayes approach was explored as early as the
1960s for the diagnosis of congenital heart
disease (Gorry and Barnett 1968) and has
been used in many CDS systems since.
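
The sequential procedure can be sketched compactly. The code below is a minimal illustration under the naive Bayesian assumptions (conditionally independent findings; mutually exclusive and exhaustive diagnoses). The function names and the dictionary-based representation are our own, and the numbers in the example are invented rather than taken from any study.

```python
def bayes_update(prior, likelihood):
    """Apply Bayes' rule once.

    prior:      dict mapping diagnosis -> P(diagnosis)
    likelihood: dict mapping diagnosis -> P(observed finding value | diagnosis)
    """
    unnormalized = {d: prior[d] * likelihood[d] for d in prior}
    total = sum(unnormalized.values())
    return {d: p / total for d, p in unnormalized.items()}


def sequential_bayes(prior, likelihoods):
    """Process findings one at a time; each posterior becomes the next prior."""
    posterior = dict(prior)
    for likelihood in likelihoods:
        posterior = bayes_update(posterior, likelihood)
    return posterior


# Invented numbers for a two-diagnosis example.
PRIOR = {"appendicitis": 0.2, "nonspecific pain": 0.8}
FINDING_1 = {"appendicitis": 0.8, "nonspecific pain": 0.2}
FINDING_2 = {"appendicitis": 0.6, "nonspecific pain": 0.3}
```

With these invented numbers, `sequential_bayes(PRIOR, [FINDING_1, FINDING_2])` and `sequential_bayes(PRIOR, [FINDING_2, FINDING_1])` both yield a posterior probability of 2/3 for appendicitis, which illustrates that the order of the findings does not matter when conditional independence is assumed.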

In recent years, the use of naive Bayesian 
models has been seriously challenged by the 
adoption of systems based on Bayesian belief 
networks (see below), which can take advan- 
tage of efficient algorithms that overcome 
the limiting assumptions of naive Bayesian 
approaches—albeit at the cost involved in 
creating a more complex (and more nuanced) 
model of the underlying probabilistic rela- 
tionships. In addition, systems that adopt the 
sequential Bayes approach lack the ability to 
choose the next test to be applied in a man- 
ner that can optimize reasoning. Bayesian 
systems that use more sophisticated reason- 
ing strategies based on decision analysis (see 
Chap. 3) can use utility theory to identify
the test that will provide the most useful infor- 
mation given the current state of reasoning. 
Considerations of cost, discomfort to the 
patient, and the availability of the test can 
influence the utility of the particular choice. 
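
One way to make test selection concrete is sketched below. As a simplification, the sketch substitutes expected reduction in diagnostic uncertainty (Shannon entropy) for a full utility model and subtracts a penalty proportional to the cost of each test. The function names, the single `penalty_per_cost_unit` parameter, and the restriction to one disease that is either present or absent are all assumptions made for illustration.

```python
import math


def entropy(p):
    """Uncertainty, in bits, about a disease whose probability is p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def expected_information_gain(p, sensitivity, specificity):
    """Expected drop in entropy from performing a test, before seeing its result."""
    p_pos = sensitivity * p + (1 - specificity) * (1 - p)
    p_neg = 1 - p_pos
    post_pos = sensitivity * p / p_pos if p_pos > 0 else p
    post_neg = (1 - sensitivity) * p / p_neg if p_neg > 0 else p
    return entropy(p) - (p_pos * entropy(post_pos) + p_neg * entropy(post_neg))


def choose_next_test(p, tests, penalty_per_cost_unit):
    """tests maps a test name to (sensitivity, specificity, cost)."""
    def score(name):
        sensitivity, specificity, cost = tests[name]
        gain = expected_information_gain(p, sensitivity, specificity)
        return gain - penalty_per_cost_unit * cost
    return max(tests, key=score)
```

When the penalty is zero, the most accurate test always wins; as the penalty grows, a cheaper and less accurate test can become the better choice. Discomfort to the patient or limited availability could be folded into the cost term in the same way.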


Machine Learning 


The availability of large biomedical datasets
and of computers and algorithms that can pro-
cess huge amounts of data is revolutioniz-
ing health care and the life sciences. Machine
learning is everywhere in health care, from 
interpreting radiographic images to predict- 
ing utilization of health care services to iden- 
tifying potential adverse drug events. It is not 
surprising that decision-support based on 
machine-learning models is becoming increas- 
ingly common in clinical settings. Such sys- 
tems have been in place for decades, but now 
they are assuming increasing prominence, 
given the new opportunities that the renais- 
sance in machine learning is offering all of 
biomedicine in the era of “big data.” 

There are a host of supervised learning 
techniques that can determine how data are 
associated with hypotheses (James et al. 2013), 
and that consequently can be trained on EHR 
data to infer conclusions based on some set of 
input data. For example, the decision-support 
capabilities of the patient monitoring systems 
discussed in Chap. 21 often apply statisti-
cal methods to the current data stream to
infer corresponding classifications to inform
care providers of the patient’s current state. 
Regression analysis or more sophisticated 
techniques, such as artificial neural networks 
and support vector machines, when applied to 
routinely collected patient data, have enabled 
investigators to develop venerable decision 
aids such as the APACHE system (Knaus 
et al. 1991; Zimmerman et al. 2006), which 
offers prediction models providing prognos- 
tic information regarding patients in the ICU 
(see Chap. 21).

Recent work demonstrates the value of 
applying data-driven techniques to a wide 
range of clinical problems for which clini- 
cians may benefit from decision support, from 
assessing newborns in the ICU (Saria et al. 
2010) to development of models that can sug- 
gest which patients might benefit most from 
palliative care (Avati et al. 2018). Scores of
start-up companies have emerged in recent 
years, each hoping that different large datasets 
and specialized machine-learning techniques 
will lead to new insights about particular clin- 
ical problems in an effort to enhance decision 
making. 

From the beginning, such machine- 
learning approaches have been criticized when 
used as the basis for CDS, primarily because 
of the lack of transparency in how data-driven 
methods reach their conclusions (Shortliffe 
et al. 1979). Because the associations between 
findings and diagnoses are inferred as the 
system is trained on the data and are not 
readily available for inspection, such systems 
cannot offer guidance as to why they might 
reach particular conclusions. This inability to 
explain the basis for their recommendations is 
especially important when the recommenda- 
tions of a system might be overly fitted to the 
peculiarities of a dataset drawn from a patient 
population different from the one to which 
the system is being applied (as may have been 
the case with de Dombal’s system). Because 
the output of a CDSS based on a machine- 
learning algorithm must be accepted at face 
value, there is typically no way to know what 
biases may exist in the data that trained the 
system or what clinically relevant intermedi-
ary states may have led the system to reach
its final conclusion. There is currently intense
interest in being able to develop new technol- 
ogies powered by machine-learning methods 
that in some measure can explain the basis of 
their reasoning and that can allow users to 
assess the likelihood that particular recom- 
mendations are appropriate in their clinical 
setting (Ribeiro et al. 2016). 


24.3.2.5 Declarative Representation of Knowledge

In simple Bayesian systems and in those 
CDSSs based on machine-learning algo- 
rithms, data are provided as input into the sys- 
tem, and the output is a classification of the 
data—often a diagnosis on which to act. Since 
the 1970s, however, workers in biomedical 
informatics have pursued the development of 
decision-support technologies that attempt to 
encode in a more explicit way how the inputs 
to the system relate to the outputs. The goal 
is to encode models—models of reasoning, 
models of pathophysiology, models of proba- 
bilistic relationships, models of the evidence 
in support of alternative treatment options, 
and other relevant models—in a way that 
forms the basis for a system’s computation to 
derive an appropriate recommendation from 
the input data. Such systems are often built 
with the objective that the underlying models 
be examinable and explainable. Often, there is 
a desire to make those models editable, so that 
the models easily can be updated in light of 
new discoveries and new understanding. The 
unifying idea in this approach is that CDSSs 
are built with a computer-based representation 
of the knowledge that drives system behavior. 
There are many ways to represent knowledge 
in computers (Musen 2014), and each strategy 
has different strengths and weaknesses. 


Bayesian Belief Networks 

Much of the early interest in the naive, 
sequential Bayesian approach stemmed from 
a conviction that it simply was impractical 
to construct Bayesian systems in which the 
assumption of conditional independence was 
lifted: There would be too many probabilities 
to assess when building the system, and the 
necessary computation could be intractable. 
Work on the use of belief networks, however,
has demonstrated that it actually is realistic 
to develop more expressive Bayesian systems 
in which conditional dependencies are mod- 
eled explicitly—often by taking advantage 
of approximate algorithms that compute
the posterior probabilities in a compu-
tationally efficient manner in most cases. (Belief net-
works are described in detail in Chap. 3.)
Currently, many modern CDS systems that 
make recommendations based on probabilis- 
tic relationships use belief networks as their 
primary representation of the underlying 
clinical situation, and then “solve” the belief 
network at runtime to calculate the posterior 
probabilities of the conditions represented 
in the graph. The use of belief networks is 
popular because the formalism makes proba- 
bilistic relationships perspicuous, overcomes 
the assumption of conditional independence, 
and enables the attendant probabilities to be 
learned from analysis of appropriate data 
sets (for example, EHR data). The approach 
has been demonstrated in numerous diagnos- 
tic systems, from belief networks that ascer- 
tain the status of newborns from data in the 
neonatal ICU (Saria et al. 2010), to systems
for differential diagnosis (Shwe et al. 1991; 
Middleton et al. 1991), to belief networks that 
offer interpretations of biomedical image data 
(Kahn et al. 1997). 

Because making most decisions in medi- 
cine requires weighing the costs and benefits 
of actions that could be taken in diagnosing 
or managing a patient’s illness, researchers 
also have developed tools that draw on the 
methods of decision analysis. Decision anal- 
ysis adds to Bayesian reasoning the idea of 
explicit decisions and of utilities associated 
with the various outcomes that could occur in 
response to those decisions (see Chap. 3).
One class of programs for decision analysis is
designed for use by the analysts themselves; 
such programs are of little use to the aver- 
age clinician or patient, however (Pauker and 
Kassirer 1981). A second class of programs 
uses decision-analysis concepts within sys- 
tems designed to advise physicians who are 
not trained in these techniques. In such pro-
grams, the underlying decision models gener-
ally have been prespecified, either as decision
trees that enumerate all possible decisions and 
all possible ramifications of those decisions or 
as belief networks in which explicit decision 
and utility nodes are added, called influence 
diagrams (Shachter 1986). 
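
The core computation behind such prespecified models is the comparison of expected utilities. The sketch below treats the smallest possible decision tree: one decision, one chance node (disease present or absent), and one utility for each outcome. The function names and the utility values used in the example are hypothetical.

```python
def expected_utility(p_disease, utility_if_disease, utility_if_healthy):
    """Average the utilities of the two outcomes, weighted by their probabilities."""
    return p_disease * utility_if_disease + (1 - p_disease) * utility_if_healthy


def best_action(p_disease, utilities):
    """utilities maps each action to (utility if disease, utility if no disease)."""
    return max(
        utilities,
        key=lambda action: expected_utility(p_disease, *utilities[action]),
    )


# Hypothetical utilities on a 0-to-1 scale.
UTILITIES = {
    "treat": (0.90, 0.95),      # treatment helps the sick, slightly harms the well
    "withhold": (0.50, 1.00),   # withholding harms the sick, spares the well
}
```

With these invented utilities, withholding treatment is preferred when the probability of disease is 0.05 and treating is preferred when it is 0.30; the two actions have equal expected utility at a probability of about 0.11. An advisory program built on this idea would obtain the probability from a Bayesian model and the utilities from the prespecified decision model.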

We say that belief networks and influence 
diagrams represent knowledge in a declarative 
manner, because a belief network provides 
an inspectable, editable model of the proba- 
bilistic relationships that are relevant to the 
decision problem under consideration. If a 
developer wants to designate a new relation- 
ship between two entities in the network, then 
she needs only to augment the model by add- 
ing a new edge to the graph that encodes the 
given network. The network thus provides a 
transparent mechanism to communicate what 
the system “knows” about the probabilistic 
relationships among the entities in the appli- 
cation domain, and changing the network 
intrinsically changes the behavior of the sys- 
tem when it reasons about those entities. This 
is different from the case of a hardcoded algo- 
rithm or a system based on machine learning, 
where the knowledge is not readily inspectable 
or editable in a direct manner. 
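
A small sketch may help to show what "declarative" means here. The network below is simply a data structure: each node lists its parents and a table giving the probability that the node is true for each combination of parent values. The node names and all probabilities are invented, and the inference routine uses brute-force enumeration, which is practical only for tiny networks and is not one of the efficient algorithms mentioned earlier.

```python
from itertools import product


def joint_probability(network, assignment):
    """Multiply each node's conditional probability given its parents."""
    prob = 1.0
    for node, (parents, cpt) in network.items():
        p_true = cpt[tuple(assignment[parent] for parent in parents)]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob


def posterior(network, query, evidence):
    """P(query is true | evidence), summing over every consistent assignment."""
    nodes = list(network)
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(nodes)):
        assignment = dict(zip(nodes, values))
        if any(assignment[n] != v for n, v in evidence.items()):
            continue
        p = joint_probability(network, assignment)
        denominator += p
        if assignment[query]:
            numerator += p
    return numerator / denominator


# node -> (parents, {parent values: P(node is true)}); all numbers are invented.
NETWORK = {
    "pneumonia": ([], {(): 0.1}),
    "fever": (["pneumonia"], {(True,): 0.8, (False,): 0.1}),
    "cough": (["pneumonia"], {(True,): 0.7, (False,): 0.2}),
}

# Editing the model: add an edge from fever to cough by giving cough a second parent.
EDITED = dict(NETWORK)
EDITED["cough"] = (
    ["pneumonia", "fever"],
    {(True, True): 0.75, (True, False): 0.5, (False, True): 0.5, (False, False): 0.15},
)
```

The inference code is identical for both networks, yet the posterior probability of pneumonia given fever and cough differs between them. The behavior changed because the model changed, not because anyone reprogrammed the reasoning.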


Rule-Based Approaches 


Although belief networks provide a conve- 
nient mechanism to encode knowledge about 
the world in a declarative fashion, they are 
only one of several alternative frameworks 
that may be used to drive CDS based on 
explicit models of the application area. Since 
the 1970s, workers in medical AI have been 
exploring the use of methods that emphasize 
the modeling of rules that describe conclusions 
that can be reached about the decision prob- 
lem and the variables that may predicate those 
conclusions. Often called knowledge-based 
systems, these programs reason about the 
clinical situation by examining a collection of 
rules of the form, “If some set of conditions 
is true, then conclude that something else is 
true” (Fig. 24.2). Although these rules
may be created through machine-learning 
approaches, they often are built by manu- 
ally encoding relationships between clinical 
data and corresponding conclusions that are 
Rule 507

IF:
   1) The infection that requires therapy is meningitis
   2) Organisms were not seen on the stain of the culture
   3) The type of infection is bacterial
   4) The patient does not have a head injury defect, AND
   5) The age of the patient is between 15 years and 55 years

THEN
   The organisms that might be causing the infection are
   Diplococcus-pneumoniae and Neisseria-meningitidis


Fig. 24.2 A rule from a rule-based system. Rules are
conditional statements that indicate what conclusions
can be reached or actions taken if a specified set of con-
ditions is found to be true. This rule, taken from the
CDSS known as MYCIN, is able to conclude probable
bacterial causes of infection if the five conditions in the
premise are all found to be true for a specific patient


offered by experts in the field or by examina- 
tion of evidence reported in the scientific lit- 
erature. When a knowledge-based system is 
encoded using rules, it is referred to as a rule- 
based system (Buchanan and Shortliffe 1984). 

Rule-based systems provide an impor- 
tant mechanism for developers to build CDS 
capabilities into modern information systems. 
From CDS systems that interpret ECG sig- 
nals to those that recommend guideline-based 
therapy, rules provide an extremely conve- 
nient means to encode the necessary knowl- 
edge. Rule-based systems require a formal 
language for encoding the rules, plus an inter- 
preter (sometimes called an inference engine) 
that operates on the rules to generate the nec- 
essary behavior. 

Perhaps the best-known rule-based CDSS 
is one that was never put into clinical use, but 
that has served as a prototype for the many 
rule-based systems that have followed. The 
program, known as MYCIN, combined a 
diagnostic component with an advisor compo- 
nent that suggested appropriate management 
of patients who have infections (Shortliffe 
1976). MYCIN’s developers believed that 
straightforward algorithms or probabilistic 
approaches were inadequate for this clinical 
problem in which the underlying knowledge 
was poorly understood and even the experts 
often disagreed about how best to manage 
specific patients, especially before definitive 
bacterial culture results became available. As 
a result, the researchers were drawn to the
use of interacting rules to represent knowl-
edge about organisms that might be causing
a patient’s infection and the antibiotics that 
might be used to treat it. 

Knowledge of infectious diseases in 
MYCIN was represented as production rules 
(see Fig. 24.2). A production rule is an
IF-THEN conditional statement. The con- 
clusions drawn by one production rule may 
be used to satisfy the premises of other rules 
when a system of rules is used for reasoning 
by an inference engine. MYCIN’s power was 
derived from such rules in a variety of ways: 
• The MYCIN program determined which
  rules to use and how to chain them together
  to make decisions about a specific case.
  The MYCIN reasoning program used an
  approach called backward chaining; when-
  ever a rule was being considered and the
  system did not know whether the condi-
  tion on the left-hand side of the rule (i.e.,
  the premise) was true, MYCIN would look
  backward to see whether the knowledge
  base contained any other rules that, when
  evaluated, could conclude information
  that might inform the evaluation of the
  current rule’s premise. (Nearly all contem-
  porary rule-based systems, on the other
  hand, use an inference method known as
  forward chaining: Whenever a production
  rule “fires” and the conclusion of that rule
  is proven to be true, the system looks for-
  ward for other rules in the rule base that
  use the concluded information in their
  premise, and that therefore also might be
  able to fire now that the new conclusion is
  known to be true.)
• The rules often formed a coherent expla-
  nation of MYCIN’s reasoning; those that
  applied to the current decision were dis-
  played in response to a user’s questions
  (Fig. 24.3). Although rules were stored
  in a machine-readable format, English
  translations could be displayed.
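
Backward chaining can be sketched in a few lines. The version below is a toy under stated assumptions: it omits MYCIN’s certainty factors, it never asks the user for missing data, and its two rules are loose paraphrases of the rules quoted in MYCIN’s explanation dialogue rather than the original encodings. The function name and the representation of a rule as a pair (premises, conclusion) are our own.

```python
# Each rule is (set of premises, conclusion); wording is a loose paraphrase.
RULES = [
    ({"culture from sterile site", "multiple positive cultures"},
     "significant disease"),
    ({"significant disease", "culture site is blood"},
     "treat for bacteremia"),
]


def backward_chain(goal, facts, rules, trace, pending=frozenset()):
    """Try to establish `goal`, working backward from it through the rules.

    Each rule that succeeds is appended to `trace`, which can then serve
    as a simple explanation of the reasoning.
    """
    if goal in facts:
        return True
    if goal in pending:          # already trying to prove this goal; avoid cycles
        return False
    for premises, conclusion in rules:
        if conclusion != goal:
            continue
        # Look backward: can every premise be established in turn?
        if all(backward_chain(p, facts, rules, trace, pending | {goal})
               for p in sorted(premises)):
            trace.append(" AND ".join(sorted(premises)) + " => " + conclusion)
            return True
    return False
```

Asking the program to establish "treat for bacteremia" causes it to seek "significant disease" as a subgoal, which in turn sends it to the first rule. The trace that accumulates along the way is a crude analogue of the rule-by-rule explanations that MYCIN could display.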


By removing, altering, or adding rules, sys-
tem developers could modify the program’s
knowledge structures rapidly, without explic-
itly reprogramming or restructuring other
parts of the knowledge base. The rule base
thus offered the kind of inspectability and
editability that we discussed when considering
belief-network representations of knowledge.


In this series, how many of the cultures were positive for E.coli (ORGANISM-1)?
   (typical system prompt)

**WHY
   (request from user to determine motivation for question)

[i.e. WHY is it important to determine the number of cultures which were
positive for the organism?]
   (system’s rephrasing of the WHY question, and answer)

[1.0] This will help to determine whether there is significant disease
      associated with this occurrence of the organism.
      It has already been established that
         [1.1] the culture was taken from a sterile source
         [1.2] the number of cultures in the series is greater than 2
      Therefore, if
         [1.3] the number of cultures which were positive for ORGANISM-1
               is greater than 2
      Then there is suggestive evidence (.5) that there is significant disease
      associated with this occurrence of the e.coli (ORGANISM-1).
      [RULE049]

**WHY
   (repeated WHY to determine relevance of previous explanation)

[i.e. WHY is it important to determine if there is significant disease
associated with this occurrence of the organism?]

[2.0] This will help to determine a diagnosis for which the infection should
      be treated.
      It has already been established that
         [2.1] the infection is not one of: primary bacteremia, meningitis
         [2.2] for some current culture of the patient it is true that the site
               of the culture is blood
      Therefore, if
         [2.3] there is significant disease associated with this occurrence of
               the organism
      Then
         It is definite (1.0) that the diagnosis for which the infection should
         be treated is secondary-bacteremia
      [RULE103]


Fig. 24.3 Two examples of MYCIN’s explanation
capabilities. User input is shown in boldface capital let-
ters and follows the double asterisks. The system
expands each [“WHY”] question (enclosed in square
brackets) to ensure that the user is aware of its interpre-
tation of the query


The developers evaluated MYCIN’s per-
formance on therapy selection for patients
with blood-borne bacterial infections (Yu
et al. 1979a), and for those with meningitis
(Yu et al. 1979b). In the latter study, MYCIN
gave advice that compared favorably with that
offered by experts in infectious diseases,
results that ushered in enormous excitement
about the potential of rule-based systems to
offer high-level clinical advice in real-world
situations.


The developers of MYCIN had to con- 
struct their own syntax for encoding rules and 
had to program their own inference engine 
to evaluate the rules. However, now there 
are many open-source and proprietary rule 
engines that provide custom-tailored editors 
for writing rules and inference engines that 
can execute the rules at runtime. For exam- 
ple, JESS is a popular Java-based rule engine 
that can be licensed from Sandia National
Laboratories and that currently is free for
academic use. Drools is an open-source rule 
engine developed by the JBoss community 
that also has had substantial adoption. 
Developers use JESS, Drools, and proprietary
rule engines to create CDS systems that
contain multiple rules that, as with MYCIN, 
can chain together to generate conclusions 
based on a sequence of inference steps. 
Decision support sometimes requires multiple 
rules to execute at runtime, together generat- 
ing a final recommendation that derives from 
the consequences of the rules chaining off one 
another. 
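
The chaining behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MYCIN's implementation and not the API of JESS or Drools: the `Rule` class, the `holds` helper, the fact names, and the way certainty is propagated (the rule's certainty scaled by its weakest premise) are all hypothetical simplifications. MYCIN itself reasoned backward from goals; forward chaining is used here only for brevity. The two rules paraphrase RULE049 and RULE103 from Fig. 24.3.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# A fact maps a name to (value, certainty between 0 and 1).
Facts = Dict[str, Tuple[object, float]]
Premise = Callable[[Facts], Optional[float]]


def holds(name: str, test: Callable[[object], bool]) -> Premise:
    """Build a premise that returns the fact's certainty if the test passes."""
    def premise(facts: Facts) -> Optional[float]:
        if name in facts and test(facts[name][0]):
            return facts[name][1]
        return None
    return premise


@dataclass
class Rule:
    name: str
    premises: List[Premise]
    conclusion: Tuple[str, object]   # (fact name, concluded value)
    certainty: float                 # strength attached to the rule itself


def forward_chain(rules: List[Rule], facts: Facts) -> List[str]:
    """Fire rules until nothing new can be concluded; return the firing order."""
    fired: List[str] = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            key, value = rule.conclusion
            if key in facts:
                continue  # already concluded or supplied as input
            certs = [p(facts) for p in rule.premises]
            if all(c is not None for c in certs):
                # Assumption: the weakest premise scales the rule's certainty.
                facts[key] = (value, rule.certainty * min(certs))
                fired.append(rule.name)
                changed = True
    return fired


# RULE103 is listed first on purpose: it can fire only after RULE049
# has concluded that significant disease is present.
RULES = [
    Rule("RULE103",
         [holds("infection", lambda v: v not in ("primary bacteremia", "meningitis")),
          holds("culture_site", lambda v: v == "blood"),
          holds("significant_disease", lambda v: v is True)],
         ("diagnosis", "secondary-bacteremia"), 1.0),
    Rule("RULE049",
         [holds("sterile_source", lambda v: v is True),
          holds("cultures_in_series", lambda v: v > 2),
          holds("positive_cultures", lambda v: v > 2)],
         ("significant_disease", True), 0.5),
]
```

Calling `forward_chain(RULES, facts)` with facts for a sterile source, three cultures in the series, three positive cultures, and a blood culture site fires RULE049 and then RULE103; the conclusion of the first rule becomes a premise of the second.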

In most installed information systems, 
however, rule-based decision support is much 
simpler and also more limited. Most deployed 
CDS systems have rules that generally do not 
chain together, but that are triggered indi- 
vidually, each time that either there is a rel- 
evant change to the data in a patient database 
that should generate an alert, there is a time- 
related event that should trigger a reminder, 
or an action is performed (e.g., by a user, when 
the action is established in the workflow as a 
triggering event). Each rule examines the state 
of the database and generates a corresponding
action, alert, or reminder that is usually
sent to a particular clinician or to members 
of the health-care team. Such rules are of the 
general form: Event — Condition — Action 
(ON Event, IF Condition, THEN Perform 
Action) and they are commonly referred to as 
ECA rules. 
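
The ECA pattern just described can be sketched as follows. This is a minimal illustration, not the design of any particular CDS product: the `EcaRule` class, the event name `lab_result_stored`, the field `serum_potassium_mmol_l`, and the threshold of 3.0 are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Patient = Dict[str, Any]


@dataclass
class EcaRule:
    event: str                              # ON: name of the triggering event
    condition: Callable[[Patient], bool]    # IF: test against the patient record
    action: Callable[[Patient], str]        # THEN: produce an alert or reminder


def dispatch(event: str, patient: Patient, rules: List[EcaRule]) -> List[str]:
    """Evaluate each rule registered for `event` on its own; rules do not chain."""
    return [rule.action(patient)
            for rule in rules
            if rule.event == event and rule.condition(patient)]


# Hypothetical rule: alert when a newly stored potassium result is low.
low_potassium = EcaRule(
    event="lab_result_stored",
    condition=lambda p: p.get("serum_potassium_mmol_l", 99.0) < 3.0,  # illustrative threshold
    action=lambda p: f"ALERT: low serum potassium for patient {p['id']}",
)
```

Each rule is triggered individually by its event, inspects the current state of the record, and emits at most one message, which mirrors the non-chaining behavior of most deployed systems.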

For example, Arden Syntax became an 
international standard for ECA rules known 
as Medical Logic Modules (MLMs), endorsed 
by HL7 and ANSI in 1999 (Fig. 24.4).
Arden Syntax provides a standard mecha- 
nism for declaring the variables about whose 
values the system will perform its reasoning 
(values that derive from data in the clinical 
information system); the conditions that, if 
true, would predicate specific actions; the 
actions that should be taken, and the kinds of 
events that would invoke or trigger the rule. 
The standard was created with the hope that 
the informatics community would develop 
whole libraries of MLMs, all written in Arden 
Syntax, that could operate in any clinical envi- 
ronment where an information system could 
interpret the standard format. 

A significant obstacle to the sharing of 
MLMs, however, is that Arden Syntax is, in 
fact, just a syntax. What is missing from the 


standard is any notion of the semantics of the 
data on which the MLMs operate. When an 
MLM executes, the variables that are used 
in the logic of the rule are bound to values 
that derive from the patient database of the 
information system in which the MLMs oper- 
ate. Arden Syntax specifies that the individual 
database queries needed to determine the val- 
ues of the variables should appear within the 
“curly braces” of variable definitions in the 
portion of the MLM known as the “data slot” 
(see Fig. 24.4). What a developer should
include within the curly braces depends on 
the particular schema of the relevant patient 
database and mechanism for performing que- 
ries. EHR information models and the way in 
which elements are coded differ from system 
to system. Thus, all system-specific aspects of 
MLM integration need to be provided within 
the curly braces. To adapt an MLM for use 
in a new environment, a programmer needs 
to consider the variables on which the MLM 
operates, determine whether those variables 
have counterparts in the local patient data- 
base, and write an appropriate query that will 
execute at runtime. 

The curly braces problem is compounded 
because there may be assumptions regarding 
the semantics of the variables themselves that 
may not be obvious to the local implementer: 
If the MLM refers to serum potassium, should 
the logic be executed if the original specimen 
was grossly hemolyzed?⁵ If a serum potassium
value is not available in the database, but
there is a value for a whole-blood potassium,
should the MLM be executed using that value
instead?⁶ If there is no serum potassium value
available for today, but there is one from last 
night, should the logic execute using the most 
recent value? Decision rules cannot simply be 
dropped from one system into another and 
be shared effortlessly; rather, considerable 
thought, analysis, and computer skill need to
go into writing the appropriate database queries
that go within the curly braces to make
such rules operational.


5 If the red blood cells in a specimen hemolyze (burst),
they release potassium, which can cause an inaccurate
elevation in the measured potassium value.

6 The serum is the liquid that is left when the cells are
removed from whole blood.


MAINTENANCE:
    Title: Diabetic Foot Exam Reminder;;
    Mlmname: Diabetic_Foot_Exam.mlm;;
    Arden: Version 2.8;;
    Version: 1.00;;
    Institution: Intermountain Healthcare ;;
    Author: Peter Haug (Peter.Haug@imail.org) ;;
    Specialist: Peter Haug (Peter.Haug@imail.org) ;;
    Date: 2011-11-28;;
    Validation: testing;;

LIBRARY:
    Purpose: Alert for Diabetic Foot Exam Yearly;;
    Explanation: This MLM will send an alert if the patient is a diabetic (diabetes in problem list or discharge diagnoses)
        and no Foot Exam is recorded within the last 12 months.;;
    Keywords: diabetes; Foot Exam;;
    Citations: Boulton AJM, Armstrong DG, Albert SF, Frykberg RG, Hellman R, Kirkman MS, Lavery LA,
        LeMaster JW, Mills JL, Mueller MJ, Sheehan P, Wukich DK. Comprehensive Foot Examination
        and Risk Assessment. Diabetes Care. 2008 August; 31(8): 1679-1685.;;
    Links: http://en.wikipedia.org/wiki/Diabetic_foot_ulcer;;

KNOWLEDGE:
    Type: data_driven;;
    Data: Problem_List_Problem := object [Problem, Recorder];
        Problem_List := read as Problem_List_Problem {select problem, recorded_by from Problem_List_Table};
        Patient_Dx_Object := object [Dx];
        Diabetic_Dx := read as Patient_Dx_Object {ICD_Discharge_Diagnoses};
        Foot_Examination := object [Recorder, Observation];
        Observation := object [Abnormality, Location, Size, Units];
        Foot_Exam := read as Foot_Examination latest {select Recorder, Observation.Abnormality,
            Observation.Location, Observation.Size, Observation.Units from PE_Table};
        Registration_Event := event { registration of patient };
        ICD_for_Diabetes := (250 , 250.0, 250.1, 250.2 , 250.3 , 250.4, 250.5 , 250.6 , 250.7,
            250.8 , 250.9 );;;
    Evoke: Registration_Event;;
    Logic: if (Diabetic_Dx.Dx is in ICD_for_Diabetes or (exist Problem_List and "Diabetes" is in
            Problem_List.Problem)) then Diabetes_Present := true ;
        endif;
        if (Diabetes_Present and exist Foot_Exam and Foot_Exam occurred not within past 12 months) then
            conclude true;
        endif;
        conclude false;;
    Action: write "Patient is a diabetic with no Diabetic Foot Exam in last 12 months. Please order or perform one.";;


Fig. 24.4 This medical logic module (MLM), written
in the Arden syntax, prints a warning for health-care
workers whenever a patient who has diabetes is registered
for a clinic visit and has not had a documented
foot examination in the past year. The evoke slot defines
a situation that causes the rule to be triggered; the logic
slot encodes the decision logic of the rule; the action slot
defines the procedure to follow if the logic slot reaches a
positive conclusion. The data slot defines the variables
that are to be used by the MLM; the text between curly
braces must be translated into queries on the local
patient database when the MLM is deployed locally.
(Courtesy of P. J. Haug, Intermountain Healthcare,
with permission)
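
The curly braces problem described above can be illustrated with a small Python sketch. It is not Arden Syntax and makes several assumptions: the two site schemas, the local code `POTASSIUM_SERUM`, the `hemolyzed` flag, and the threshold of 3.0 are hypothetical. The point is only that the decision logic is portable, whereas the query bound to each variable must be rewritten for every site and silently encodes local choices, such as whether to exclude hemolyzed specimens.

```python
from typing import Any, Callable, Dict, List, Optional

SiteDb = List[Dict[str, Any]]
Binding = Dict[str, Callable[[SiteDb], Optional[float]]]


# Portable part: the logic refers only to abstract variable names.
def low_potassium_logic(values: Dict[str, Optional[float]]) -> bool:
    k = values["serum_potassium"]
    return k is not None and k < 3.0   # illustrative threshold only


# Site-specific part: the equivalent of what goes inside the curly braces.
def site_a_potassium(db: SiteDb) -> Optional[float]:
    """Hypothetical site A stores rows as {"test": ..., "val": ...}."""
    rows = [r for r in db if r["test"] == "K"]
    return rows[-1]["val"] if rows else None


def site_b_potassium(db: SiteDb) -> Optional[float]:
    """Hypothetical site B uses a local code and excludes hemolyzed specimens."""
    rows = [r for r in db
            if r["code"] == "POTASSIUM_SERUM" and not r["hemolyzed"]]
    return rows[-1]["result"] if rows else None


def run_mlm(logic: Callable[[Dict[str, Optional[float]]], bool],
            binding: Binding, db: SiteDb) -> bool:
    """Bind each variable by running its local query, then apply the logic."""
    values = {name: query(db) for name, query in binding.items()}
    return logic(values)
```

The same `low_potassium_logic` runs at both sites, but only after someone has written and validated `site_a_potassium` or `site_b_potassium` against the local database.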

In the case of Arden Syntax, developers 
write rules to deal with one clinical problem 
at a time. There may be one MLM to deal 
with the problem of administering a drug 
like penicillin to a patient with a history of 
penicillin allergy; another MLM may report 
that a patient has a dangerously low serum 
potassium value. Unlike the rules in MYCIN, 
MLMs are generally not intended to interact 
with one another or to be chained together 
to generate complex inferences. MLMs may 
be coerced to chain together when one MLM 
posts to the patient database a value that can 
trigger another MLM. This mechanism also 
allows one MLM to set up information in the 
database that might invoke another MLM in 
the case of some future event, thus enabling 
the recommendation of actions that unfold 
over time, as in the case of many clinical prac- 
tice guidelines for chronic diseases. Although 
this approach allows developers to program 
complex problem-solving behavior, the tech- 
nique has the same disadvantages that came 
to light with chaining rule-based systems such 
as MYCIN: When the rule base grows to a 
large size, interactions among rules may have 
unanticipated side effects. Furthermore, when 
rules are added to or deleted from a previously 
debugged knowledge base, there may be unex- 
pected system behaviors that emerge as a result 
(Clancey 1983; Heckerman and Horvitz 1986). 

For MLMs to work well in practice, more- 
over, the rules need to be tailored to the par- 
ticular clinical environment— triggered by 
appropriate workflow events, interacting with 
particular kinds of participants, customiz- 
ing logic to account for various business and 
workflow processes, and notifying the user in 
setting-specific ways. To customize an MLM 
to account for such considerations requires 
that it become less portable. Much of the 
effort required to introduce CDS systems 
into the health-care enterprise involves pre- 
cisely such adaptations. To accelerate porta- 
bility, MLM developers must seek a balance 
between a generic specification of logic that 
is widely agreed upon, and site-specific cus- 
tomizations that will facilitate the use of that 


logic. Achieving the right balance will always 
remain an elusive target (see also Sect.
24.5.4).


More General Representations of Knowledge

The variety of approaches for building CDS 
systems that we have described so far suggests 
that each method has significant strengths 
and weaknesses. Hardcoded branching-logic 
systems can be very easy to build but difficult 
to update and maintain when the program 
code becomes complicated. Belief networks 
and influence diagrams offer precision in 
probabilistic reasoning when the goal is to 
make a classification of some clinical phe- 
nomena, but they offer limited capabilities if 
the goal is to generate a plan for medical ther- 
apy on the fly or to simulate some biomedi- 
cal situation. Rule-based systems can help to 
decompose decision problems into tractable 
IF-THEN chunks, but they typically are lim- 
ited in what these chunks can express about a 
clinical problem. All of these approaches are 
constrained in their inability to reason about 
clinical abstractions (e.g., knowing that ele- 
vated serum potassium is a kind of electrolyte 
abnormality) and by their inability to make 
inferences about situations that entail nuance 
or complexity. 

In recent years, developers of CDS sys- 
tems have become increasingly interested in 
the use of more general representations of 
clinical knowledge to provide more sophisti- 
cated capabilities for decision support. These 
interests have paralleled the surge of enthusi- 
asm in the World Wide Web community for 
the notion of developing knowledge graphs 
that encode facts about the world in a lattice 
of nodes that represent the kinds of entities 
in the world and links between them that rep- 
resent relationships among those entity types 
(Noy et al. 2019). Unlike belief networks, 
which typically contain dozens of nodes, 
knowledge graphs often contain hundreds 
or thousands or even millions of nodes—or 
more. Whereas belief networks store numeric,
probabilistic relationships among the entities
in the graph, knowledge graphs store symbolic,
logical connections among the entities
in addition to various properties of each
entity. There currently is no standard way of
creating a knowledge graph and there is not
even consensus on what features a knowledge
graph should support. Nevertheless, virtually
all well-known e-commerce sites, from
Google to Facebook to Bing to eBay, use a
graph-based representation of knowledge to
provide key functionality for users hoping to
perform a variety of tasks.

Google, for example, has a knowledge
graph that comprises more than one billion
classes of entities and instances of those entities,
and more than 70 billion facts about those
entities (Fig. 24.5) (Noy et al. 2019). The
graph makes it possible for Google to know
that, when a user searches for some abstraction,
such as “medications for diabetes,” the
user may be interested in different kinds of
medications for diabetes, such as insulin and
hypoglycemic drugs. Thus, the knowledge
graph causes the search engine to bring up
links to related entities when a user performs
a search, and the graph helps to disambiguate
the user’s query when the objective of the
search may not be clear.


Fig. 24.5 Google depicts its knowledge graph artistically
as a vast collection of nodes representing entities
in the world, linked to other entities in a rich network.
The graph allows Google’s search engine to highlight
attributes of entities for which users might search, to
expand searches by including synonyms, and to collect
information about entities related to the subject of a
search for presentation to the user.


Many commercial CDS systems are 
adopting knowledge-graph technology as a 
component of their software architecture. 
Because there is no standard knowledge- 
graph approach, these systems adopt different 
knowledge-graph formalisms and perform 
different tasks using their graphs. “IBM 
Watson,” for example, is a name that encom- 
passes a family of CDS systems that have 
been deployed for a range of clinical domains 
in recent years. Although these IBM Watson 
systems have had a variety of capabilities and 
evolving computational architectures, a com- 
mon theme has been their use of knowledge 
graphs to implement precise information 
retrieval and to assist with natural language 
understanding. Thus, a knowledge graph can 
allow an IBM Watson system to interpret a 
user’s natural language query and to locate 
a document or a fact that is itself stored in 
a knowledge graph to respond to that query. 
Indeed, the use of knowledge graphs in gen- 
eral search engines such as Google and Bing 
provides such technologies with capabili- 
ties that make them useful for decision sup- 
port. When reporting search results, both 
Google and Bing display the contents of their
knowledge graphs for the indicated item in 
a “knowledge box” at the upper right of the 
page. These knowledge boxes often can be 
very effective in offering specific information 
to address user queries. 

The simplest form of knowledge graph 
is one that encodes an enumeration of the 
kinds of entities in an application area and 
that places them in an abstraction hierarchy, 
indicating when elements of one entity form 
a subclass (or superclass) of another. This 
type of data structure is often referred to as 
an ontology. Ontologies are like controlled 
terminologies (see Chap. 7), in that they
provide a standard means for referring to the 
types of entities that comprise a domain, but 
they organize those entities using a graph that 
makes explicit the semantics of the relation- 
ships among the entities. SNOMED CT and 
the NCI Thesaurus are examples of com- 
monly used controlled terminologies that are 
represented using knowledge graphs in a man- 
ner that also makes them ontologies. 
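
An abstraction hierarchy of this kind can be sketched as a small graph of is-a links. The sketch below is illustrative only: the terms are taken from examples in the text, the dictionary `IS_A` is a hypothetical toy structure, and it does not reflect how SNOMED CT, the NCI Thesaurus, or any search engine stores its graph. It shows the two inferences that the hierarchy supports: recognizing that a specific finding belongs to a more general class, and expanding a general query into its more specific members.

```python
from typing import Dict, Set

# child -> set of direct parents (is-a links); contents are illustrative only
IS_A: Dict[str, Set[str]] = {
    "elevated serum potassium": {"electrolyte abnormality"},
    "low serum sodium": {"electrolyte abnormality"},
    "electrolyte abnormality": {"laboratory finding"},
    "insulin": {"medication for diabetes"},
    "hypoglycemic drug": {"medication for diabetes"},
}


def ancestors(term: str) -> Set[str]:
    """Collect every class reachable by following is-a links upward."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        for parent in IS_A.get(stack.pop(), set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(term: str, cls: str) -> bool:
    """True if `term` is `cls` or a direct or indirect subclass of it."""
    return term == cls or cls in ancestors(term)


def descendants(cls: str) -> Set[str]:
    """Expand a general class into all of its more specific members."""
    return {term for term in IS_A if cls in ancestors(term)}
```

With this structure, a rule written against "electrolyte abnormality" can apply to a patient with "elevated serum potassium", and a search for "medication for diabetes" can be expanded to insulin and hypoglycemic drugs.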

Ontologies are important in biomedical 
informatics for their explicit representation 
of abstraction relationships, thus facilitating 
interpretation of high-throughput experi- 
ments, aiding information retrieval and 
natural language processing, and index- 
ing data (Bodenreider and Stevens 2006). 
CDS systems can use ontologies that encode 
knowledge about different application areas 
in a manner that aids reuse of that knowl- 
edge in new settings and that makes it easy for 
developers to update the knowledge as their 
understanding of the application area evolves. 
Such systems use ontologies (or knowledge 
graphs more generally) to encode clinical 
knowledge in a manner that may overcome 
some of the limitations of the more prevalent 
CDS architectures. These systems make an 
explicit distinction between the static knowl- 
edge of the clinical domain (e.g., knowledge 
of the specifications entailed by a clinical 
practice guideline) and the problem-solving 
knowledge needed to apply the static knowl- 
edge to a particular patient (e.g., the means to 
generate specific prescriptions for medications 
based on the general guideline recommen- 
dations and the particular clinical situation 


that the patient is experiencing). This distinc- 
tion makes it possible for system builders to 
address different elements of the knowledge 
needed to be represented in the computer 
using tailored approaches and tools (Musen 
1998; de Clercq et al. 2004).

The ATHENA-CDS system exemplifies 
this component-oriented approach (Goldstein 
et al. 2000). ATHENA-CDS is a computer 
system that is integrated with the HIS that has 
been used by the U.S. Department of Veterans 
Affairs (VA), known as VistA.⁷ ATHENA-CDS
has been installed at several VA medical
centers and has remained in continuous use 
at the Palo Alto VA medical center since the 
1990s. ATHENA-CDS offers advice regard- 
ing patients who have certain chronic dis- 
eases, whose physicians would like to treat 
those patients in accordance with recognized 
evidence-based clinical practice guidelines 
(Fig. 24.6). ATHENA-CDS draws on several
electronic knowledge bases, each one constituting
a knowledge graph that encodes the
knowledge of a particular guideline (e.g., for 
hypertension, for hyperlipidemia, for diabetes, 
and so on). Each time that a patient with a rel- 
evant diagnosis (e.g., hypertension) is seen in 
the outpatient clinic, ATHENA-CDS takes as 
input the corresponding guideline knowledge 
base and patient-specific data from the VistA 
EHR and generates as output suggestions to 
the clinician for treating the patient to ensure 
that the treatment is consistent with the care 
that the guideline would recommend. Because 
the standard documents that define clinical 
practice guidelines can be long and compli- 
cated, it is extremely helpful for the computer 
to focus the clinician’s attention on precisely 
which interventions should be considered to 
guarantee that the patient’s care is consonant 
with the medical evidence captured by a given 
guideline (Fig. 24.7).

ATHENA-CDS was engineered using
an approach that separates out static knowledge
about the clinical application area from
knowledge about problem solving (i.e., knowledge
about generating a situation-specific clinical
recommendation; Musen et al. 1996). To
construct ATHENA-CDS, it was necessary
first to define an ontology of clinical practice
guidelines (Fig. 24.8). The guideline ontology
makes it clear that all guidelines must
include eligibility criteria that indicate which
patients should be treated in accordance with
the guideline, a clinical algorithm that specifies
the sequence of treatments recommended by
the guideline, and guideline drugs that represent
all the medications that patients might be
given when their provider follows the guideline.
Because the guideline ontology is general,
the graph does not contain information
about any particular clinical algorithm, any
particular eligibility criteria, and so on. The
ontology merely states that all guidelines for
management of chronic diseases have such
characteristics.


7 The VA is in the process of replacing VistA with a
commercial HIS offered by Cerner.


Fig. 24.6 An example of the ATHENA-CDS system
interface. ATHENA-CDS provides decision support
for the management of hypertension and several
other chronic diseases by using a declarative knowledge
base created as an instantiation of a generic guideline
ontology. In the screen capture, the provider has entered
the patient’s most recent blood pressure, and is offered
advice about possible alterations in therapy based on
the relevant clinical-practice guideline. The screen image
depicts only simulated patient data (Courtesy of M. K.
Goldstein, VA Palo Alto Healthcare System)


Developers of ATHENA-CDS used 
the Protégé ontology-development system 
(Musen 2015) to create the ontology of clini- 
cal practice guidelines, which constitutes a 
knowledge graph. The developers then used 
Protégé to create subgraphs that represent dis- 
tinct knowledge bases that define how to man- 
age patients in accordance with particular 
guidelines. The developers created a knowl- 
edge base for management of hypertension 
reflecting the guideline that is used by the VA 
and the Department of Defense (DOD), sup- 
plemented with recommendations from the 
Joint National Commission on Hypertension 
(National High Blood Pressure Education 
Program 2004; Fig. 24.9). They instantiated
the ATHENA-CDS guideline ontology
to build a knowledge base for management of
congestive heart failure based on the guideline
developed by the American Heart Association
and the American College of Cardiology.
The developers built a knowledge base for
management of chronic pain, based on the
guideline promoted by the VA and the DOD
(Trafton et al. 2010). Other knowledge bases
for guideline-based care of diabetes, hyperlipidemia,
and chronic kidney disease were created
in a similar manner.


Fig. 24.7 Professional societies, health-care practices,
private foundations, and other organizations are all
working to capture “best practices” for managing
patients in accordance with scientific evidence in terms
of clinical practice guidelines. Unfortunately, nearly all
these guidelines are published initially as large paper
documents. Here is a high-level, paper-based flowchart
from the guideline developed by Joint National Commission
on Hypertension. The flowchart summarizes
detailed recommendations that the guideline document
specifies in many pages of text


The ontology-driven approach makes it
possible to start with a particular ontology (in
this case, one for clinical practice guidelines
for management of chronic disease) to create
multiple knowledge bases, each one instantiating
the ontology to specify the knowledge
required for particular guidelines. Similarly,
the different knowledge bases can be mapped 
to different problem-solving programs, such 
that each problem solver automates a differ- 
ent task associated with guideline-based care 
(therapy planning, eligibility determination, 
and so on). The ability to “mix and match” 
knowledge bases and problem solvers offers 
considerable flexibility, and it enables devel- 
opers to reuse elements of previous solutions 
to address new CDS problems that require 
different domain knowledge or different 
problem-solving procedures. 
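
The separation of knowledge bases from problem solvers can be sketched as follows. This is a hedged illustration, not the ATHENA-CDS architecture or the Protégé data model: the `Guideline` class borrows only its three field names from the guideline ontology described in the text, and the step names, drug names, patient fields, and both problem-solving functions are hypothetical placeholders with no clinical meaning.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

Patient = Dict[str, Any]


@dataclass
class Guideline:
    """One instantiation of a generic guideline ontology."""
    name: str
    eligibility_criteria: Callable[[Patient], bool]
    clinical_algorithm: List[str]   # ordered treatment steps
    guideline_drugs: List[str]


# Problem solver 1: eligibility determination.
def determine_eligibility(kb: Guideline, patient: Patient) -> bool:
    return kb.eligibility_criteria(patient)


# Problem solver 2: therapy planning.
def plan_therapy(kb: Guideline, patient: Patient) -> Optional[str]:
    """Suggest the first algorithm step the patient has not yet reached."""
    if not kb.eligibility_criteria(patient):
        return None
    reached = set(patient.get("current_steps", []))
    for step in kb.clinical_algorithm:
        if step not in reached:
            return step
    return None


# Two knowledge bases built on the same ontology (placeholder content).
hypertension_kb = Guideline(
    name="hypertension",
    eligibility_criteria=lambda p: "hypertension" in p["diagnoses"],
    clinical_algorithm=["first-line therapy", "add second agent"],
    guideline_drugs=["drug A", "drug B"],
)
diabetes_kb = Guideline(
    name="diabetes",
    eligibility_criteria=lambda p: "diabetes" in p["diagnoses"],
    clinical_algorithm=["lifestyle counseling", "first-line therapy"],
    guideline_drugs=["drug C"],
)
```

Either problem solver accepts either knowledge base, so adding a new guideline requires only a new `Guideline` instance, and adding a new task requires only a new function.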
Fig. 24.8 A small portion of the ontology of clinical
guidelines used by ATHENA-CDS as entered into the
Protégé ontology-editing system. The hierarchy of
entries on the left includes entities that constitute building
blocks for constructing guideline descriptions. The
panel on the right shows the attributes of whatever
entity is highlighted on the left. Here, goal, eligibility_criteria,
and clinical_algorithm, for example, are attributes
of the entity known as Management_Guideline. The
ontology entered into Protégé reflects concepts believed
to be common to all guidelines, but does not include
specifications for any guidelines in particular. The complete
domain model is used to generate automatically a
graphical knowledge-acquisition tool, such as the one
shown in Fig. 24.9


24.3.3 Coda


There is no standard way to build a
decision-support system. Developers need
to make choices, based on the nature of the
decision task to be performed, the data and
knowledge that are available, and the software
tools with which they are most familiar.
There are a variety of approaches from
which to choose, and each one entails different
kinds of trade-offs. A CDSS for providing
advice about something potentially as
complicated as a clinical practice guideline
can be hardcoded in software, implemented
as a rule-based system, or driven by
knowledge-graph technology. In cases when there
is no appropriate evidence-based guideline,
a CDSS can use probabilistic approaches
or machine learning to suggest reasonable
treatment decisions. Although interoperability
standards such as SMART-on-FHIR
are offering the opportunity to embed
custom-tailored CDS technology within proprietary
Health IT systems (see Chap. 7), the
monolithic nature of most installed systems
makes it challenging to achieve this kind of
flexibility in the real world. Nevertheless,
new IT standards are improving the landscape
considerably, as we discuss in the next
section.
Fig. 24.9 A screen from a Protégé-generated
knowledge-acquisition tool for entry of clinical-practice
guidelines. The tool is generated automatically from a
domain ontology, part of which appears in Fig. 24.8.
The entries into the tool specify the knowledge required
to treat patients in accordance with the guideline for
chronic hypertension adopted by the Department of
Veterans Affairs


24.4 Translating CDS to the Clinical Enterprise


Over the past several decades, advanced CDS
systems have been developed and deployed in
a number of academic medical centers. The
technology has subsequently diffused into
commercial EHR systems and into routine
practice (Chaudhry et al. 2006). The uptake
has been greater in medium-to-large hospitals
and in medical-center-based networks, including
affiliated practices, and has been much less
in smaller hospitals, clinics, and independent
practices. Although these trends had been
sluggish, the “meaningful use” regulations
for HIT and other regulatory and incentive-based
approaches in the United States as well
as elsewhere accelerated the adoption of CDS
technology (Blumenthal and Tavenner 2010;
Blumenthal 2010).

In general, the CDS systems deployed to 
date in vendor EHR systems are quite var- 
ied, and relatively limited in scope, and their 
capabilities in knowledge management are 
also varied (Wright et al. 2009). The greatest 
uptake has been in the form of simple alerts 
and reminders, standard physician order sets, 
CPOE-based prescription templates with dose 
checks, allergy checks, identification of drug-lab
and drug-drug interactions, and some use
of info-buttons or access to context-specific 
knowledge resources. In some specific settings, 
rule-based systems have been used to drive the 
intelligent collection of clinical information 
in a comprehensive, structured clinical docu- 
mentation form (Schnipper et al. 2008). 
A few vendors have been successful at
distributing knowledge resources, making 
available shareable clinical knowledge in the 
form of drug-interaction databases, order 
sets for common indications, rule-based 
knowledge, documentation templates, and 
information resources for infobutton-based 
queries (Middleton et al. 1998). More recent 
research has examined opportunities for cre- 
ating knowledge repositories to make these 
resources more readily available in both the 
public and private sector (Osheroff et al. 2007; 
Kawamoto et al. 2013). 

Despite growing demand, many years of 
research and development, and the broad 
adoption of EHRs in recent years, CDS has 
had relatively limited adoption to date. There 
are several reasons for the slow uptake of 
CDS (Wright et al. 2009), which need to be 
addressed by approaches such as those enu- 
merated in the sections that follow. 


24.4.1 Standard Patient Information Model


CDS rules and other problem-solving
approaches need to operate on specific patient
data with a clear understanding of the patient 
data model and semantics of the terms. If 
those data are stored in a proprietary format 
and with non-standard encodings, then a set 
of rules needs to be customized to use data in 
that form, or the data need to be translated to 
the information model of the CDS rules, qual- 
ity measure, or other uses. The customization 
of rules for each EHR (or EHR implementation)
has been the prevailing mode, typified by 
the “curly braces problem” of Arden Syntax 
rules, described previously. As a result, vendor 
EHR systems tend to have libraries of rules 
that operate only in their own systems, using 
their proprietary data dictionaries and data 
models. Knowledge sharing across platforms 
and systems has been limited, and consider- 
able work is required to integrate vendor HIT 
products with external CDS systems. 

One approach is to develop a canonical 
information model that both the knowledge
source systems and the knowledge artifacts
can use. This strategy can facilitate a common
understanding of the semantics of the data
and the semantics of the knowledge artifact,
and it reduces the need to map CDSS variables to
local data or to develop a set of custom rules
to access the local data in each and every setting.
This approach does not eliminate what
some describe as an “irreducible mapping 
problem” in Health IT, but it does distribute 
it in an effective manner such that, as data 
representation evolves more toward a canoni- 
cal form, we can approach iso-semantic map- 
pings of data from data source to knowledge 
artifact—that is, where the semantics of the 
source data are identical to the semantics of 
the data expected in the CDS analytic. This 
approach has been pursued in the very suc- 
cessful Observational Health Data Sciences 
and Informatics (OHDSI) project in research 
informatics (Hripcsak et al. 2015), which
adopts a Common Data Model for source 
systems to map to, and for analytics (and 
federated queries) to run against (Jiang et al. 
2017a, b). 
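
The mapping idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two site dictionaries, the local codes, the canonical concept name, and the threshold are invented and are not taken from OHDSI or from any vendor system. Each site translates its local records into the canonical form once, and a single rule then runs unchanged on data from both sites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """A canonical-form observation: one concept name, one unit."""
    concept: str
    value: float
    unit: str


# Hypothetical site dictionaries: local code -> (canonical concept, unit, scale factor)
SITE_A_MAP = {"LAB_0417": ("hemoglobin_a1c", "%", 1.0)}
SITE_B_MAP = {"HBA1C_FRAC": ("hemoglobin_a1c", "%", 100.0)}  # site B stores a fraction


def to_canonical(local_code, local_value, site_map):
    """Translate one local record into canonical form (maintained once per site)."""
    concept, unit, factor = site_map[local_code]
    return Observation(concept, local_value * factor, unit)


def a1c_above_goal(observations, threshold=9.0):
    """A single rule written against the canonical model only.

    The threshold is an illustrative placeholder, not clinical guidance.
    """
    return any(o.concept == "hemoglobin_a1c" and o.value > threshold
               for o in observations)


site_a = [to_canonical("LAB_0417", 9.4, SITE_A_MAP)]
site_b = [to_canonical("HBA1C_FRAC", 0.072, SITE_B_MAP)]
print(a1c_above_goal(site_a), a1c_above_goal(site_b))
```

The mapping work does not disappear in this sketch; it moves into the per-site dictionaries, which is the sense in which the canonical model distributes the mapping problem rather than eliminating it.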

As discussed below in Sect. 26.4.2, in
2012, a virtual medical record (vMR) based
on the HL7 version 3.0 Reference Information 
Model (RIM) was an initial effort to arrive at 
the notion of a canonical information model. 
It was approved by HL7 as a draft standard 
for trial use for linking dynamically at runtime 
the arbitrary data elements available in the 
patient database of an EHR to CDS systems 
that assume the standard vMR data model for 
data encoding (Kawamoto et al. 2010). The 
vMR was designed to serve as an intermedi- 
ary data model—a canonical form—between 
proprietary database formats and standards- 
based CDS systems that developers might 
plug into any EHR that can make its data 
available in a vMR-compliant manner. HL7
supported work to map the vMR to standard 
terminologies and clinical data element defi- 
nitions. So, for example, data elements in an 
ECA rule could be referred to in an interoper- 
able manner, rather than, say, the individual 
mapping of data elements to access meth- 
ods inside the curly braces of Arden Syntax 
MLMs (see Rule-Based Systems in Sect.
26.3.2).

Newer work focuses on the development
of the Quality Data Model (QDM) to
represent clinical data and concepts used in 
specifying quality measures, and ultimately 
clinical decision support logic. The QDM is 
an information model that describes the rela- 
tionships between patients and clinical con- 
cepts in standardized formats. QDM allows 
the definition of a clinical concept used in a 
specification to measure quality of patient 
care via defined data elements, and it provides 
the vocabulary needed to relate concepts to 
each other. For example, QDM, at the highest 
level of abstraction, defines categories such 
as Medication, Procedure, Condition, and so 
on. Within a category, the Datatype definition 
gives the context of the clinical care process 
being assessed such as “medication - active,” 
or “medication - administered.” Further
Datatype details are defined with Attributes 
that can further define the Datatype, or define 
the expected source for the data. For example, 
a diagnosis of “Diabetes Mellitus Type I” is 
an active diagnosis datatype, a “Metformin 
prescription” is a medication prescribed, and 
a “Hemoglobin A1c value” is a laboratory
result. Each of these elements can be related to 
terms in controlled terminologies via specifi- 
cation of corresponding code sets. By relating 
attributes between data elements, the QDM 
provides a method to construct complex clini- 
cal representations both for electronic clinical 
quality measures and for clinical decision- 
support logic. Because of this ability to share 
common logic elements (expressions, value 
sets, terminology) between quality measure 
and CDS specifications, QDM has become 
popular for specifying quality measures 
among a wide variety of measure developers 
and CDS implementers (Pathak et al. 2013; 
Hong et al. 2016). 
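
The layering of category, datatype, and value set can be sketched as follows. This is a simplified illustration in Python, not the QDM specification itself: the class names, the record layout, and the codes (placeholders such as DX-0001) are assumptions introduced here. The point of the sketch is that one data element definition is reused by both a quality-measure function and a decision-support function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueSet:
    """A named set of codes drawn from a controlled terminology."""
    name: str
    codes: frozenset


@dataclass(frozen=True)
class DataElement:
    """A category plus a datatype (context of care) bound to a value set."""
    category: str
    datatype: str
    value_set: ValueSet

    def matches(self, record):
        return (record["datatype"] == self.datatype
                and record["code"] in self.value_set.codes)


# Placeholder codes, not real terminology codes.
DIABETES_T1 = ValueSet("Diabetes Mellitus Type I", frozenset({"DX-0001"}))
HBA1C = ValueSet("Hemoglobin A1c", frozenset({"LAB-0001"}))

active_diabetes = DataElement("Diagnosis", "diagnosis - active", DIABETES_T1)
a1c_result = DataElement("Laboratory Test", "laboratory test - result", HBA1C)


def has(element, records):
    return any(element.matches(r) for r in records)


def in_measure_population(records):
    """Quality-measure use: is the patient counted in the denominator?"""
    return has(active_diabetes, records)


def remind_a1c(records):
    """Decision-support use of the same elements (illustrative logic only)."""
    return has(active_diabetes, records) and not has(a1c_result, records)


chart = [{"datatype": "diagnosis - active", "code": "DX-0001"}]
print(in_measure_population(chart), remind_a1c(chart))
```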

More recently, the Fast Healthcare 
Interoperability Resources (FHIR) standard
from HL7 has emerged from a multi-ven- 
dor collaboration in the Argonaut Project 
(HealthMst 2015). FHIR is designed specifi- 
cally for the Web and provides resources and 
foundations based on common methods and 
technologies used in the Web (XML, JSON, 
HTTP, and OAuth) (Bender and Sartipi
2013). FHIR resources are accessed through
what is known as a RESTful API, which can 
be defined as one that uses HTTP requests to 
GET, PUT, POST and DELETE data. FHIR 
frameworks are built around the concept of 
resources—basic units of interoperability 
and modular components that can be assem- 
bled into working systems to try to resolve 
clinical, administrative and infrastructural 
problems in health care. This capability has 
partially addressed the heterogeneity of 
different systems, and it has dramatically 
improved interoperability between systems. 
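
As a minimal sketch of the RESTful pattern, the fragment below builds the URL for a FHIR search (an HTTP GET) and extracts values from a search result. The server address and patient identifier are hypothetical, and the JSON document is a hand-written sample shaped like a FHIR searchset Bundle rather than output from a real server; no network call is made.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://fhir.example.org/r4"  # hypothetical server address


def search_url(resource_type, **params):
    """Build the URL that an HTTP GET would use to search one resource type."""
    return f"{BASE_URL}/{resource_type}?{urlencode(params)}"


def quantities(bundle):
    """Collect (value, unit) pairs from the Observation entries of a Bundle."""
    found = []
    for entry in bundle.get("entry", []):
        resource = entry["resource"]
        if resource.get("resourceType") == "Observation" and "valueQuantity" in resource:
            quantity = resource["valueQuantity"]
            found.append((quantity["value"], quantity.get("unit")))
    return found


# Search for hemoglobin A1c results (LOINC 4548-4) for a hypothetical patient.
url = search_url("Observation", patient="123", code="http://loinc.org|4548-4")

# Hand-written sample of what such a search might return.
sample = json.loads("""
{
  "resourceType": "Bundle",
  "type": "searchset",
  "entry": [
    {"resource": {"resourceType": "Observation",
                  "status": "final",
                  "valueQuantity": {"value": 7.2, "unit": "%"}}}
  ]
}
""")

print(url)
print(quantities(sample))
```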


24.4.2 Adoption of Standard 
Knowledge-Representation 
Models 


Although Arden Syntax has been an HL7 and 
ANSI standard since 1999, only a few ven- 
dor systems manage their libraries of deci- 
sion rules using Arden Syntax. Even when they
do, of course, the rules still need to be customized
for use with vendor-specific patient
databases on a tedious, rule-by-rule basis to 
overcome the curly-braces problem described
in Sect. 24.3.2. HL7 has pursued the development
of more interoperable models not
only for data query but also for defining ele- 
ments of ECA rules. A succession of efforts 
led to a specification for a query language 
called GELLO (Sordo et al. 2004), and then 
the Health eDecisions rule formalism,⁸ and
more recently the Clinical Quality Language (CQL),
discussed below. 


24.4.2.1 Standards for Encoding
Clinical Guideline Models

Particularly challenging in the standards- 
development world is creation of a stan- 
dardized, shared model for representing 
clinical practice guidelines in a form suit- 
able for execution at run time. The Guideline 
Element Model (GEM; Shiffman et al. 2000)
is an XML mark-up specification that is
an American National Standards Institute
(ANSI) standard, now in its third revision,
that guideline authors can use to annotate
their narrative guidelines to identify key ele-
ments for both quality assessment and execu-
tion. GEM allows authors to demarcate the
text that identifies guideline actions or eligibil-
ity criteria, and thus can serve an intermediary
purpose in work to transform a prose guide-
line into a computable specification, but the
standard does not itself provide a mechanism
to translate a marked-up guideline document
into a structure that a computer can interpret
and execute. Such systems may be criticized
as not explicitly representing the underlying
ontology of the guideline components to be
used at run time.

8 HL7 International (2014). Health eDecisions.
Retrieved February 19, 2020: https://wiki.hl7.org/index.php?title=Health_eDecisions

Other efforts have focused on creation of 
a guideline ontology, such as the one adopted 
by ATHENA-CDS (see Fig. 24.8),
that can inform the creation of computer- 
understandable knowledge bases that are able 
to capture knowledge about specific guidelines. 
Such knowledge bases then could allow a CDS 
system to use knowledge about the guideline, 
data from the EHR, and information concern- 
ing patient preferences and available resources 
to offer situation-specific, guideline-directed 
advice. As we have noted, an underlying infra- 
structure known as EON (Musen et al. 1996) 
drives the ATHENA-CDS system. Other 
ontology-based approaches have appeared 
over the years, including GLIF, GUIDE, 
PRODIGY, Asbru, and GLARE. Peleg and 
colleagues (2003) compared many of these 
guideline models, and they showed signifi- 
cant commonalities among them. Despite the 
large degree of agreement, however, work in 
this area has not yet led to anything near a 
standard that is widely adopted. Part of the 
problem is that there is wide variation in the 
structure, granularity, and specificity of exist- 
ing clinical practice guidelines, making it diffi- 
cult to develop a single comprehensive and yet 
readily applicable guideline model. Analysis 
of the use of guidelines also indicates that 
guidelines themselves are rarely “executed” 
without considerable adaptation or localiza-
tion, except in situations such as protocol-
driven care (for example, in clinical trials
or in very specific procedures such as renal
dialysis). ATHENA-CDS thus dispenses
with offering specific guideline-based recom- 
mendations, and instead suggests to the clini- 
cians when certain treatment options might be 
“compellingly indicated” or “relatively con- 
traindicated.” In highly regimented settings 
such as the administration of chemotherapy 
for cancer, however, a CDS system generally 
would need to be much more “prescriptive” in 
offering recommendations to clinicians. 


24.4.2.2 Standards for Encoding 
ECA Rules 


As we have noted, the ability to share decision 
rules is hindered in the absence of standards 
to encode both the logic of the rules (how 
the IF and the THEN components are to be 
evaluated) and the clinical data on which the 
rules depend. Recently, the Clinical Quality 
Language (CQL) has emerged as an expres- 
sion language that addresses these challenges. 
CQL is intended to characterize both quality- 
measure logic and decision-support logic and 
the data that such logic processes (Odigie 
et al. 2019). For example, CQL expressions 
define the expected input data model, library 
resources that may be called, parameters, 
value sets, and code sets used in definitions of 
patient conditions, and additional concepts 
needed to specify and encode the clinical logic 
and data types used in a clinical quality mea- 
sure—or in a CDS rule (Jiang et al. 2015). 
Expression languages are commonly used 
to represent the logic to be used at the presen- 
tation layer of an application, and the meth- 
ods that may be used to interact with standard 
data models. CQL grew out of prior efforts 
such as GELLO and the Health eDecisions 
framework, from work in the U.S. Office of 
the National Coordinator’s Clinical Quality 
Framework Initiative. That initiative focused on
identifying, defining, and harmonizing standards
and specifications that promote integration
and reuse between clinical decision support
and clinical quality measurement (CQM)
knowledge-representation formalisms. CQL
strives to be more clinician friendly and accessible
to subject-matter experts. However, the
standards used for the electronic representa-
tion of CDS and CQM artifacts have not been
developed in consideration of each other, and
the domains use different approaches to the 
representation of patient data and comput- 
able expression logic. Harmonization of these 
approaches now is focused on clearly identi- 
fying the various components involved in the 
specification of quality artifacts, and then 
establishing as a principle the notion that they 
should be treated independently (separation 
of concerns). Broadly, the components of a 
CQL artifact involve specifying: 
- Metadata: information about the knowledge
  artifact (whether CQM or CDS), such
  as its identifier and version, what health
  topics it covers, supporting evidence,
  related artifacts, dependencies, etc.
- Clinical Quality Information: the structure
  and content of the clinical data (data
  model) involved in the artifact
- Expression Logic: the actual knowledge
  and reasoning being communicated by the
  artifact
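
The separation of concerns in the list above can be sketched in a few lines. The sketch is written in Python rather than in CQL syntax, and the class names, the concept name, and the threshold are assumptions introduced for illustration. Each of the three components is held in its own structure, so that any one of them can be revised without touching the others.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Metadata:
    """Information about the artifact itself."""
    identifier: str
    version: str
    topic: str


@dataclass(frozen=True)
class DataRequirement:
    """One item of clinical data the artifact expects."""
    name: str      # variable name seen by the logic
    concept: str   # concept name in the shared data model


@dataclass(frozen=True)
class KnowledgeArtifact:
    metadata: Metadata
    data: Tuple[DataRequirement, ...]
    logic: Callable[[dict], bool]

    def evaluate(self, patient_record: dict) -> bool:
        # Bind each declared requirement to the patient's data, then run the logic.
        inputs = {req.name: patient_record.get(req.concept) for req in self.data}
        return self.logic(inputs)


def a1c_high(inputs):
    """Expression logic only; the 9.0 threshold is an illustrative placeholder."""
    value = inputs["a1c"]
    return value is not None and value > 9.0


artifact = KnowledgeArtifact(
    Metadata("example-a1c-check", "0.1.0", "diabetes"),
    (DataRequirement("a1c", "hemoglobin_a1c_percent"),),
    a1c_high,
)
print(artifact.evaluate({"hemoglobin_a1c_percent": 9.6}))
```

Because the logic sees only the variable names declared in the data requirements, the same function could serve a quality measure or a CDS rule, which is the kind of reuse that the harmonization effort aims for.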


The CQL specification is an approved HL7 
standard endorsed by the U.S. Centers 
for Medicare and Medicaid Services, the 
U.S. Centers for Disease Control and 
Prevention, and others for the representation 
of either CDS or CQM knowledge artifacts. 
It is part of the Clinical Quality Framework 
effort supported by the Standards and 
Interoperability Framework of the U.S. Office 
of the National Coordinator for Health 
IT. As such, it aims to promote the interop- 
erability of knowledge artifacts through 
standardization of the data models, logic 
statements, and controlled medical termi- 
nology (including value sets). Because CQL 
is also fully specified in a machine-readable 
way, it may be programmatically converted 
into what HL7 calls an expression logical 
model (ELM), which can then be interpreted 
in an execution environment. 

The convergence on a common formalism
that accommodates the use of standardized
data models, facilitates clinical query and
computation through a library of methods, and
is machine interpretable dramatically
improves the portability of clinical knowledge
artifacts. In addition, this specification
is designed to be data model-independent,
meaning that CQL has no explicit depen- 
dencies on any aspect of any particular data 
model. Rather, the specification allows for any 
data model to be used, so long as a suitable 
description of that data model is supplied. 

Current efforts now focus on further stan- 
dardization of input data models, as well as 
alignment with the FHIR API standard. 
For example, the QUICK specification and 
the Quality Improvement Core (QICore) are 
being developed concurrently with the CQL 
specification to ensure that the two specifi- 
cations interoperate effectively. QUICK is a 
logical model consisting of clinical objects, 
attributes, and relationships. QUICK pro- 
vides a uniform way for clinical decision 
support and quality measures to refer to clini- 
cal data. This initiative began in 2013 with 
the creation of the Quality Improvement 
Domain Analysis Model (QIDAM), which 
drew on the vMR and QDM as sources of
requirements. Originally, the QUICK data 
model was developed entirely independently 
of FHIR. However, recognizing the broader 
community focus on FHIR, QUICK was 
aligned, structurally and semantically, as 
closely as possible to FHIR. This alignment 
not only creates a common model for qual- 
ity and interoperability, but also it will make 
it easier in the future to leverage other FHIR- 
related efforts, such as Clinical Document 
Architecture (CDA) on FHIR, or CQL on 
FHIR. Authors of future quality measures 
and clinical decision support artifacts may use 
QUICK, together with the Clinical Quality 
Language (CQL), to create interoperable and 
executable knowledge artifacts, thus dramatically
improving the ability to share computable biomedical
knowledge artifacts.


24.4.3 Modes of Deployment 
of CDS 


Even with the emergence of shareable, com- 
putable, biomedical knowledge artifacts, one 
of the key impediments to widespread adop-
tion of CDS, particularly the use of rules
and alerts, is clinician annoyance with pop-ups,
messages, emails, and other notifications
that interrupt workflow. Ideally CDS systems 
should be integrated into the organization 
and presentation of information to facilitate 
workflow and decision making, by anticipat- 
ing what information is needed for a decision, 
pre-fetching it, displaying it in ways that sup- 
port visualization of trends or relationships, 
and tying these analyses to care plans or 
actions that can be offered immediately and 
quickly selected by the user. Order sets, as 
stated in the beginning of this chapter, form a 
good example of use of CDS both to suggest 
appropriate actions in a given setting and to 
make it easy to accomplish those actions, by 
immediately enabling the orders in the set to 
be entered automatically into the EHR, per- 
haps with modification. 

There is much ongoing research to develop 
methods for managing the processes of data 
capture, data presentation, data visualiza- 
tion, and selection of actions, but this work 
is usually being done outside of vendor 
EHRs. Given limited interoperability and 
access to the internals of proprietary sys- 
tems, this kind of experimentation is now 
tending to take place in the form of apps and 
services that operate on externally extracted 
data (Mandl and Kohane 2012). There is 
growing support for the notion of SMART 
apps—Substitutable Medical Applications 
and Reusable Technologies—that use stan- 
dard API methods to access electronic health 
record data, perform clinical inferences exter- 
nal to the EHR, and return insights in a stan- 
dardized container (whether iFrame, Web 
page, or free-standing application; Mandel 
et al. 2016). 


24.4.4 Workflow and Setting- 
Specific Factors 


As noted in Sect. 24.3.2, applications
based on single-step situation-action rules are 
among the most prevalent and useful types of 
CDS systems. Such systems can be invoked in 
many contexts to provide either recommendations
in real time or reminders or alerts that
are processed in batch, based on time-oriented 
triggers or data-evaluation events. Rules can 
invoke other knowledge resources—provid- 
ing new information content, triggering other 
rules, or offering order sets. 

Rule content is ideally based on analy- 
sis of clinical evidence, such as recommen- 
dations or guidelines emanating from the 
U.S. Preventive Services Task Force, or from 
professional-society studies of best practices 
for specific diseases. The job of formalizing 
these recommendations into executable logic 
requires that they be expressed in a specified 
way, but even having done so, such rules are 
not typically ready to execute in a particular 
environment, even if they are expressed in a 
rule execution language “understood” by an 
EHR system, and if they refer to the data ele- 
ments in the EHR in their expected format. 
The reason the rules are not readily execut- 
able is also the reason that rules that work 
well in one environment are often not able to 
be successfully deployed elsewhere without 
substantial modification (even if in the same 
representation format and if using the same 
data model). 

The reason for the failure is lack of adap- 
tation to what we refer to as setting-specific 
factors (SSFs; Greenes et al. 2010). To work 
effectively, rules need to integrate well with 
the clinical setting, workflow, users, applica- 
tion environment, and other factors. These 
requirements are reflected in how and when 
the rule should be triggered—on various 
events such as examination of some ele- 
ment of the EHR, on login to the system, or 
on the availability of laboratory test results. 
Rules may also be developed in the form of
reminders that are triggered when the CDSS 
evaluates on a batch basis a practice’s list of 
patients to be seen on a given day, the patients 
who have a birthday in a given month, the pas- 
sage of a specific interval of time since a previ- 
ous comparison event, and so on. The rules 
additionally may vary based on the practice 
setting (e.g., the emergency department, an 
office practice, or an inpatient unit); particu-
lar inclusion or exclusion criteria or threshold
modifications that may be site-specific; how
the recommendation should be transmitted 
(e.g., via electronic mail, popup windows, or 
sidebar messages); whether the recommenda- 
tion requires acknowledgment by the recipi- 
ent; whether it can be overridden; whether 
the alert should be escalated to supervising 
clinicians, and so on. Rules that have been 
custom tailored in such ways by means of exe- 
cutable code naturally are less sharable than 
are generic rules. Failure to capture the kinds 
of customizations that are needed, however, 
makes it time consuming for individual sites 
to adapt generic medical recommendations to 
their particular requirements or to capitalize 
on the experiences of others. What is needed 
is a way to represent useful experience in 
terms of SSF combinations that work, with- 
out needing to do so at the level of detailed 
code that is difficult for users to visualize and 
modify. 
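
One way to picture such a representation is to hold the setting-specific factors as declarative data that sits beside a generic rule, rather than inside its code. The sketch below is only an illustration of that idea: the field names, the trigger and delivery labels, and the threshold are assumptions, not part of any published SSF model.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SettingFactors:
    """Setting-specific factors kept as data, separate from the rule logic."""
    trigger: str        # e.g., "lab_result_available", "nightly_batch"
    delivery: str       # e.g., "popup", "sidebar", "email"
    requires_ack: bool
    can_override: bool
    threshold: float


# A generic starting point, plus two site adaptations expressed as small diffs.
GENERIC = SettingFactors("lab_result_available", "sidebar", False, True, 9.0)
EMERGENCY_DEPT = replace(GENERIC, delivery="popup", requires_ack=True)
OFFICE_PRACTICE = replace(GENERIC, trigger="nightly_batch", delivery="email")


def should_fire(event, value, factors):
    """The generic rule: the same code everywhere, parameterized by the factors."""
    return event == factors.trigger and value > factors.threshold


print(should_fire("lab_result_available", 9.5, EMERGENCY_DEPT))
print(should_fire("lab_result_available", 9.5, OFFICE_PRACTICE))
```

Because each adaptation is a short list of changed fields, a site could inspect or share a combination that works without reading the rule's executable code.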


24.4.5 Sharing of Best-Practice 
Knowledge for CDS 


The methods described above for sharing 
computable biomedical knowledge are now 
gaining momentum (viz., SMART, CQL, 
FHIR). Historically, it generally has fallen on 
each health-care organization, user group, or 
other entity to undertake its own process of 
identifying and managing the best-practice 
knowledge it wants to deploy in its CDS sys- 
tems. Most institutions lack the expertise, or 
the resources, to accomplish this task. Even 
having a national or international reposi- 
tory of such knowledge would not preclude 
the need for customization, but it would 
certainly make it easier for each health-care 
entity to start with a trusted source. Over the 
past decade, the U.S. Agency for Healthcare
Research and Quality has funded efforts to
create such a public repository, known as
CDS Connect (Lomotan et al. 2019). In CDS
Connect’s repository, the goal is to archive 
knowledge artifacts represented in the CQL 
formalism to promote sharing and interop- 
erability. Where such a repository should be 
hosted, how it might integrate public and private
knowledge sources, who would have oversight
over it, how its knowledge would be peer
reviewed and quality-rated, and how it would 
be sustained are among the many questions 
that have not yet been answered, but this is 
an area of intense research and development. 
While many health-care organizations con- 
tinue to perform this kind of knowledge-cura- 
tion work for their own constituencies, several 
initiatives show a clear pathway to becoming 
viable alternatives for knowledge aggregation 
and dissemination. 


24.5 Future Research 
and Development for CDS 


Workers in biomedical informatics have 
studied problems in assisting with complex 
decision making for more than half a cen- 
tury. It seems that it is only now, with the 
very recent adoption of HIT on a wide- 
spread basis, that the foundations are finally 
in place for the rapid advance of CDS tech- 
nology in clinical settings. Although con- 
siderable logistical problems still must be 
surmounted as outlined in > Sect. 24.4, this 
is an exciting time in which to study CDS 
and its translation from the laboratory to 
the point of care. 


24.5.1 Standards Harmonization
for Knowledge Sharing
and Implementation


Many implementation challenges remain 
for the broad adoption and effective use 
of CDS in EHR systems. As mentioned 
above, one of the most active areas of cur- 
rent research focuses on development of 
standard approaches to knowledge sharing 
for CDS. Knowledge sharing may take the 
form of human-readable artifacts, machine- 
interpretable artifacts, or executable Web 
services (Osheroff et al. 2007; Goldberg et al. 
2014; Dixon et al. 2013; Kawamoto et al. 
2013). A capability for CDS sharing, as well 
as CDS functionality itself, would be substantially
facilitated by the continued development
and use of common standards designed
to serve CDS needs in health care. 

As noted, several standards currently exist 
that are aimed at specific areas of CDS and 
types of CDS artifacts, or that could be lever- 
aged to benefit CDS. For example, the Clinical 
Decision Support Consortium, a large col- 
laborative research and development group 
supported by the U.S. Agency for Healthcare 
Research and Quality, adopted an enhanced 
version of the Continuity of Care Document 
(CCD) to serve as the foundation for input 
data for multi-institutional trials of CDS 
technology (Middleton 2009). When taking
advantage of the most current standards, such
as CQL and FHIR, system developers tend
not only to adopt the standards but also to arrive at
more common implementation approaches or
patterns. These patterns promote knowledge sharing and
reuse, and they may decrease the implementation
burden if EHR vendors accommodate
the standardized API and interaction models,
as well as the data model expectations of the
standards.


24.5.2 Context-based Knowledge 
Selection 


Much of the above effort is aimed at over- 
coming the non-portability of event- 
condition-action (ECA) rules, such as alerts 
and reminders, because of the need to local- 
ize and adapt triggering conditions, modes of 
interaction, and specific workflow processes 
of sites (setting-specific factors, or SSFs, as
discussed in Sect. 24.4.4), leading to many
variations that must be laboriously designed 
and tracked. 

One of the possible modes for reducing 
this need for customization and localization is 
to use the context, state, and activity (CSA) of 
a CDSS user to automatically identify appro- 
priate knowledge artifacts to be made avail- 
able. The idea relies on maintaining patient 
state, user role, expertise, setting, and specific 
tasks being performed to determine when the 
knowledge might be pertinent. The latter also 
requires a rich multi-axial set of metadata for 
the knowledge artifact repository that allows
selection based on user CSA.

Other ways in which context could be used
might be to create specialized views or filters 
of available data, based on CSA. For exam- 
ple, selection of data to be viewed could be 
based on a user’s role and domain expertise. 
Associations (and, ideally, typed relations) 
among data items could be highlighted— 
for example, among problems, findings, and 
actions. Such relations could also potentially 
be used to anticipate assessment and plan 
entries in notes based on data items pres- 
ent or being focused on. Such approaches 
undoubtedly will only be scalable in gener- 
alized web-services implementations. For 
example, infobutton managers already use 
context such as user type, app/function being 
performed, and specific item being evaluated 
to identify retrieval keys for external knowl- 
edge resources. The idea behind the expansion 
of context as a mode for invocation of CDS 
more generally is to create a more detailed 
and continually updated model of context, 
state, and activity. Such efforts are just getting 
underway. 
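
The selection idea described in this section can be sketched as a match between a model of the user's context, state, and activity and the metadata attached to each knowledge artifact. The axes, field names, and sample artifacts below are assumptions chosen for illustration; a working repository would need a far richer metadata scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """A snapshot of the user's context, state, and activity (CSA)."""
    role: str
    setting: str
    activity: str
    patient_states: frozenset


@dataclass(frozen=True)
class ArtifactMeta:
    """Multi-axial metadata describing when an artifact is pertinent."""
    title: str
    roles: frozenset
    settings: frozenset
    activities: frozenset
    required_states: frozenset


def select(artifacts, ctx):
    """Return titles of artifacts whose metadata matches the current CSA."""
    return [a.title for a in artifacts
            if ctx.role in a.roles
            and ctx.setting in a.settings
            and ctx.activity in a.activities
            and a.required_states <= ctx.patient_states]


# Hypothetical repository contents.
REPOSITORY = [
    ArtifactMeta("Anticoagulant order set",
                 frozenset({"physician"}), frozenset({"inpatient"}),
                 frozenset({"order_entry"}), frozenset({"atrial_fibrillation"})),
    ArtifactMeta("Diabetes foot-exam reminder",
                 frozenset({"physician", "nurse"}), frozenset({"office"}),
                 frozenset({"visit_planning"}), frozenset({"diabetes"})),
]

ctx = Context("nurse", "office", "visit_planning", frozenset({"diabetes", "hypertension"}))
print(select(REPOSITORY, ctx))
```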


24.5.3 Representation Models 


To date, standards and related efforts address- 
ing CDS have heavily emphasized specific 
CDS execution methods and the represen- 
tation of the clinical context of the patient. 
For example, as we have noted previously, 
a variety of frameworks for working with 
rules, including Arden Syntax, Drools, JESS, 
and CQL, along with several proprietary formats,
have worked their way into vendor offerings. 
This diversity has inhibited the exchange of 
best-practice knowledge to date, but prog- 
ress toward more effective knowledge inter- 
change is being made with CQL in particular. 
The current situation is that both public and 
private repositories of knowledge
artifacts that address specific pieces of the 
CDS problem exist (e.g., NLM Value sets, 
CMS eCQMs, standardized data models, 
professional society clinical pathways, vendor 
implementations of algorithms and analytics, 
and SMART applications) and we see grow- 
ing adoption given the increased pressures on 
clinical reasoning and operations in practice.
In addition to reducing unwarranted clinical
variability in practice, future work needs
to establish a canonical patient information 
model with a formal ontology, an event model 
for triggering conditions, an action model for 
CDS intervention recommendations, a work- 
flow model for appropriately inserting CDS 
interventions into the routines of clinical 
practice, a knowledge-representation schema 
with a standard regular expression language, 
and, ideally, a measurement standard to assess 
CDS performance in use. 


24.5.4 Externalizing CDS 


Standardization of methods for externalized 
CDS (CDS performed outside of the EHR, 
in the cloud) is becoming more commonplace 
due to emerging standards described above 
(CQL, CQM, QUICK, FHIR). Prior research 
and development efforts have demonstrated 
the feasibility of accessing clinical data from 
the electronic health record via standards- 
based (viz., FHIR) and proprietary RESTful 
APIs, and running executable knowledge arti- 
facts — both quality measures, and clinical 
decision support — on the data in the secure 
cloud environment and returning insights into 
the app being used in the clinical workflow 
(Wright et al. 2015; Dixon et al. 2013). Current 
work is focused on representing all of the req- 
uisite knowledge artifacts required to create 
a computable practice guideline fully repre- 
sented in CQL, and on running those knowl- 
edge artifacts via FHIR data-access methods. 
Use of services enables considerable flexibil- 
ity and breadth of kinds of CDS that can be 
made available widely, and ongoing work with 
FHIR resource specifications is improving the 
ability to gather the data needed and deliver 
it, and to integrate with clinical workflow pro- 
cesses more smoothly. 

We have described the many efforts 
underway to standardize data models for 
information exchange, CQL for knowledge 
representation, and functional integration 
with EHR workflows using SMART and
FHIR. A deeper functional integration model
between externalized services and EHR workflow
is provided by CDS Hooks. CDS Hooks
(Dolin et al. 2018) is a method for enabling 
an event in the host system to determine 
when an app (such as provided by SMART-
on-FHIR) should be launched programmatically,
thus complementing the direct launching of
apps by users. The EHR detects an
event such as a physician beginning to write 
an order, and it then can invoke an external 
decision-support service. That service can 
determine what task is being performed and 
return information in the form of a “card” (a 
phrase or text snippet containing an inference 
or assessment, or suggested action) that will 
be displayed within the EHR. CDS Hooks 
may also provide a link to an external app. 
The main downside of these approaches to 
CDS is that, although they have formal meth- 
ods for launching apps, they basically have no 
constraints on what happens within the apps, 
or support for linking to the intrinsic func- 
tionality and workflows of a host EHR. Thus, 
the methods can result in a proliferation of 
SMART apps and CDS Hooks modules in 
an organization’s library, with a number of 
them that may be of limited utility. This situ- 
ation could cause significant challenges for an 
enterprise seeking to manage and update its 
CDS capabilities on a regular basis. 
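
The exchange between the EHR and an external service can be sketched as a function that receives a hook request and returns cards. This is a simplified illustration, not a complete CDS Hooks service: the request is reduced to a hook name and a patient identifier, the card fields (summary, indicator, source, detail) follow our reading of the CDS Hooks specification and should be checked against its current version, and the allergy lookup is a hypothetical stand-in for a data-access call.

```python
def handle_hook(request, allergy_lookup):
    """Answer one hook invocation with zero or more cards.

    `allergy_lookup` is a hypothetical callable that maps a patient
    identifier to a set of recorded allergy names.
    """
    if request.get("hook") != "order-select":
        return {"cards": []}  # this sketch responds only to order entry

    patient_id = request["context"]["patientId"]
    allergies = allergy_lookup(patient_id)
    if not allergies:
        return {"cards": []}

    card = {
        "summary": f"Patient has {len(allergies)} recorded allergies",
        "indicator": "warning",
        "source": {"label": "Example decision-support service"},
        "detail": ", ".join(sorted(allergies)),
    }
    return {"cards": [card]}


# Hypothetical data source standing in for an EHR query.
RECORDED = {"p1": {"penicillin", "latex"}}

request = {"hook": "order-select", "context": {"patientId": "p1"}}
response = handle_hook(request, lambda pid: RECORDED.get(pid, set()))
print(response["cards"][0]["summary"])
```

Nothing in the sketch constrains what the service does internally, which illustrates the governance concern raised above: the interface standardizes the invocation and the card, not the quality or the utility of the logic behind them.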


24.5.5 Usability Research and CDS 


The use of CDS within EHRs, and health IT in 
general, have been identified as double-edged 
swords: technology may provide benefit, but it 
also may cause considerable harm. Clinician 
error when using information systems that 
may result in untoward outcomes and unin- 
tended consequences (Karsh et al. 2010; Sittig 
and Singh 2009) may be an emergent property
that is demonstrated only after system imple- 
mentation or widespread use. Medical errors 
related to use of Health IT are problematic, 
not only for clinical and quality of care rea- 
sons, but technically, since they may represent 
a mismatch between the user’s model of the 
task being performed and the model used in 
a computation (National Research Council 
(U.S.) Committee on Engaging the Computer
Science Research Community in Health Care
Informatics et al. 2009), a mismatch between 
the application’s intended functionality and 
the resulting action or event (Harrison et al. 
2007), or a latent health IT-related error yet 
to happen (Ash et al. 2007). Excessive alert 
fatigue can undermine the efficacy of clinical 
decision support in CPOE (Isaac et al. 2009; 
Strom et al. 2010), and in other IT functions 
(Chused et al. 2008), and result in very high 
user override rates (Shah et al. 2006; van der 
Sijs et al. 2006; Weingart et al. 2003). Critical 
research questions need to focus on the poten- 
tial mismatch between the user’s mental 
model or intent and the application design, 
use case, or workflow model (Zhang and 
Walji 2011; Patel et al. 2010). Further attention
needs to be given to basic principles of
human-factors engineering, such as the use of 
colors and layout within the application inter- 
face. Additional questions remain regarding 
the ideal design of methods and controls with 
which a user might interact to choose a medi- 
cation from a long list, or identify and encode 
patient problems. More advanced research 
will enable visualization and decision mak- 
ing by matching problems with care plans, 
and facilitation of continuity and coordina- 
tion of care based on underlying CDS rules 
and guideline-based workflows. Especially 
challenging is addressing the need for struc- 
tured data to support clinical decision sup- 
port and quality reporting, in a manner that 
is efficient for the end-user. This goal might be 
achieved by combining structured documen- 
tation during data entry, and natural language 
processing for data abstraction from the clini- 
cal narrative. Most important, however, are 
methods to direct CDS to the right user, at 
the right time, in the right workflow, with the 
right level of alerting or intervention, and the 
right information (Osheroff et al. 2012). 
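
The override rates mentioned above can be made concrete with a small calculation. The following Python sketch is illustrative only: the alert log, the action labels, and the 90% review threshold are assumptions introduced here, not data or criteria taken from the studies cited.

```python
from collections import Counter

# Hypothetical alert log of (alert_type, user_action) pairs.
# All names and values are invented for illustration.
ALERT_LOG = [
    ("drug-drug interaction", "overridden"),
    ("drug-drug interaction", "overridden"),
    ("drug-drug interaction", "accepted"),
    ("drug-allergy", "accepted"),
    ("drug-allergy", "overridden"),
    ("duplicate therapy", "overridden"),
]


def override_rates(log):
    """Return {alert_type: fraction of firings that the user overrode}."""
    fired = Counter(alert_type for alert_type, _ in log)
    overridden = Counter(
        alert_type for alert_type, action in log if action == "overridden"
    )
    return {t: overridden[t] / fired[t] for t in fired}


def review_candidates(rates, threshold=0.9):
    """Alert types whose override rate meets an assumed review threshold."""
    return sorted(t for t, rate in rates.items() if rate >= threshold)


rates = override_rates(ALERT_LOG)
for alert_type, rate in sorted(rates.items()):
    print(f"{alert_type}: {rate:.0%} overridden")
print("Candidates for review:", review_candidates(rates))
```

A high override rate does not by itself show that an alert is unhelpful. It only identifies alert types whose targeting, timing, or level of interruption may merit closer study.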


24.5.6 Data-Driven CDS 


As we have discussed, a major area of research 
in informatics concerns methods for deriving 
knowledge from large data sets using a variety 
of techniques. With the adoption of health IT 
broadly, investigators are drawing on large-scale
data-mining methods to provide CDS
for population monitoring, public health sur- 
veillance, and even to offer patient-specific 
recommendations based on cohort data when 
there is no specific evidence that could other- 
wise guide therapy. With the increasing avail- 
ability of data from diverse sources relevant 
to patient care, large data sets may be created 
and used for both discovery of previously 
unknown associations, and novel clinical pre- 
dictions (Frankovich et al. 2011; Longhurst 
et al. 2014). Critical research questions here 
will include how to define like cohorts of 
patients, how to structure and frame the index 
decision, what methods to use to assess the 
likelihood of alternate prediction scenarios, 
and how to model and elicit the patient’s 
preferences for each scenario. The Institute 
of Medicine (2011b) articulated a long-term 
vision for a Learning Health System, in which 
clinical and administrative data of all kinds 
will begin to inform and enhance clinical 
practice on a national level in a wide variety 
of ways. 
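
Two of the research questions above, how to define like cohorts and how to assess the likelihood of alternate scenarios, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the record fields, the matching rule (same diagnosis and sex, age within a fixed window), and the synthetic records are all hypothetical, and raw outcome frequencies of this kind are descriptive rather than causal estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    """Hypothetical, highly simplified patient record."""
    age: int
    sex: str
    diagnosis: str
    treatment: str
    good_outcome: bool


def like_cohort(records, diagnosis, sex, age, age_window=5):
    """Select patients with the same diagnosis and sex, within age_window years."""
    return [
        p for p in records
        if p.diagnosis == diagnosis
        and p.sex == sex
        and abs(p.age - age) <= age_window
    ]


def outcome_rates_by_treatment(cohort):
    """Return {treatment: (number of patients, fraction with a good outcome)}."""
    totals = {}
    for p in cohort:
        n, good = totals.get(p.treatment, (0, 0))
        totals[p.treatment] = (n + 1, good + int(p.good_outcome))
    return {t: (n, good / n) for t, (n, good) in totals.items()}


# Synthetic records, invented for illustration only.
RECORDS = [
    Patient(14, "F", "condition-x", "A", True),
    Patient(16, "F", "condition-x", "A", False),
    Patient(15, "F", "condition-x", "B", True),
    Patient(40, "F", "condition-x", "A", True),   # outside the age window
    Patient(15, "M", "condition-x", "B", False),  # different sex
    Patient(13, "F", "condition-y", "A", True),   # different diagnosis
]

cohort = like_cohort(RECORDS, diagnosis="condition-x", sex="F", age=15)
print(outcome_rates_by_treatment(cohort))
```

Even in this toy example the cohort definition drives the answer: widening the age window or relaxing the match on sex changes both the cohort size and the observed rates, which is why the choice of similarity criteria is itself a research question.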


24.6 Conclusions 


The future of CDS systems inherently 
depends on progress in developing useful 
computer programs and in reducing logisti- 
cal barriers to implementation. Although 
ubiquitous computer-based decision aids that 
routinely assist clinicians in most aspects of 
their practice are currently the stuff of science 
fiction, progress has been real and the poten- 
tial remains inspiring. Early predictions about 
the effects that such innovations would have 
on medical education and practice have not 
yet come to pass (Schwartz 1970), but grow- 
ing successes support an optimistic view of 
what technology will eventually do to assist 
practitioners with the processing of complex 
data and knowledge. The research challenges 
have been identified much more clearly, leg- 
islative mandates are creating not only new 
financial incentives but also the practical sub- 
strate of increased EHR adoption and con- 
vergence toward data interoperability, and 
the implications for health-science education 
are much better understood. The basic computer
literacy of health professional students
can be generally assumed, but health-science 
educators now must teach the conceptual 
foundations of biomedical informatics if their 
graduates are to be prepared for the techno- 
logically sophisticated world that lies ahead. 

Equally important, we have learned much 
about what is not likely to happen. The more 
that investigators understand the complex 
and changing nature of medical knowledge, 
the clearer it becomes that trained practitio- 
ners of biomedical informatics will always be 
required as participants in fostering a coop- 
erative relationship between physicians and 
computer-based decision tools. There is no 
evidence that machine capabilities will ever 
equal the human’s ability to deal with unex- 
pected situations, to integrate visual and audi- 
tory data that reveal subtleties of a patient’s 
problem, to work with patients to incorpo- 
rate their values and priorities in care plans, 
or to deal with social and ethical issues that 
are often key determinants of proper medical 
decisions. Considerations such as these will 
always be important to the humane practice 
of medicine, and practitioners will always 
have access to information that is meaning- 
less to the machine. Such observations argue 
cogently for the discretion of health-care 
workers in the proper use of decision-support 
tools. 


Suggested Readings

Bright, T. J., Wong, A., Dhurjati, R., Bristow, E., 
Bastian, L., Coeytaux, R. R., Samsa, G., 
Hasselblad, V., Williams, J. W., Musty, M. D., 
Wing, L., Kendrick, A. S., Sanders, G. D., & 
Lobach, D. (2012). Effect of clinical decision- 
support systems: A systematic review. Annals 
of Internal Medicine, 157(1), 29-43. This thorough
analysis of studies of CDS systems demonstrates
that there is good evidence that CDS
technology can alter clinician behavior in pos- 
itive ways, but that evidence that CDS systems 
can improve long-term patient outcomes is 
still inconclusive. The paper is also useful for 
its comprehensive bibliography. 

Greenes, R. A. (Ed.). (2014). Clinical Decision
Support, 2nd Edition: The Road to Broad
Adoption. New York: Elsevier. This book
offers a comprehensive discussion of the 
nature of medical knowledge and of informa- 
tion technology to assist with medical decision 
making. It provides detailed discussions of the 
computational, organizational, and strategic 
challenges in the design, development, and 
deployment of CDS systems. 

Institute of Medicine. (2011). Digital infrastruc- 
ture for the learning health system: The foun- 
dation for continuous improvement in health 
and healthcare. Workshop Series Summary. 
Washington, DC: The National Academies 
Press. This monograph summarizes the vision 
for a national Learning Health System and 
offers the perspective of a wide range of 
thought leaders on the work required to 
achieve that vision. 

Ledley, R., & Lusted, L. (1959). Reasoning foun- 
dations of medical diagnosis. Science, 130, 
9-21. This is the paper that started it all. This
classic article provided the first influential 
description of how computers might be used 
to assist with the process of diagnosis. The 
flurry of activity applying Bayesian methods 
to computer-assisted diagnosis in the 1960s 
was largely inspired by this provocative paper. 

Sittig, D. F, Wright, A., Osheroff, J. A.,


Questions for Discussion
1. Some researchers in medical AI have 
argued that CDS systems should rea- 
son from clinical data in a way that 
closely matches the reasoning strate- 
gies of the very best clinical experts, as 
such experts are the most clever diag- 
nosticians and the most experienced 
treatment specialists that there are. 
Other researchers maintain that expert 
reasoning, no matter how excellent, is
at some level inherently flawed, and
that CDS systems must be driven from 
the mining of large amounts of solid 
data. How do you account for the 
apparent difference between these 
views? Which view is valid? Explain 
your answer. 

2. Transitioning CDS systems from one
clinical setting to another has always 
been problematic. The Leeds Abdominal 
Pain System was installed in several 
major clinical settings, and yet the sys- 
tem never performed as well elsewhere 
as it had done in Leeds. The Arden 
Syntax, created expressly to facilitate 
knowledge sharing across institutions, 
failed to meet this goal to a significant 
degree. What kinds of setting-specific
factors make it difficult to transplant 
decision-support technology from one 
environment to another? What kinds of 
research might lead to better methods 
for knowledge sharing in the future? 

3. In one early evaluation study, the
decision-support system ONCOCIN 
provided advice concerning cancer ther- 
apy that was approved by experts in only 
79% of cases (Hickam et al. 1985). In 
another study, the HyperCritic CDS sys- 
tem for the management of hyperten- 
sion offered the same comments that 
were generated by a panel of experts in 
only 45% of cases (Van der Lei et al. 
1991). Even today, such system perfor- 
mance is fairly typical for computer pro- 
grams that suggest patient therapy. Do 
you believe that this performance is ade- 
quate for a computational tool that is 
designed to help physicians to make 
decisions regarding patient care? What 
problems might CDS systems encounter 
as their developers attempt to make the 
systems more comprehensive in the 
advice that they offer? Why might it be 
more difficult for computer systems to 
offer acceptable recommendations for 
patient therapy than seems to be the 
case for diagnosis? What safeguards, if 
any, would you suggest to ensure the
proper use of such systems? Would you
be willing to visit a particular physician 
if you knew in advance that she made 
decisions regarding treatment that were 
approved by expert colleagues less than 
80% of the time? If you would not, what 
level of performance would you con- 
sider adequate? Justify your answers. 

4. A large international organization once
proposed to establish an independent 
laboratory—much like Underwriters 
Laboratory in the United States—that 
would test CDS systems from all ven- 
dors and research laboratories, certify- 
ing the effectiveness and accuracy of 
those systems before they might be put 
into clinical use. What are the possible 
dimensions along which such a labora- 
tory might evaluate decision-support 
systems? What kinds of problems might 
such a laboratory encounter in attempt- 
ing to institute such a certification pro- 
cess? In the absence of such a 
credentialing system for CDS systems, 
how can health-care workers feel confi- 
dent in using a clinical decision aid? 

5. There is considerable untapped potential
for CDS to help in managing
patients with multiple complex condi- 
tions. What are the challenges in dealing 
with such patients, and how can CDS be 
helpful? What are the features required 
of an algorithm that might integrate rec- 
ommendations from the multiple 
clinical-practice guidelines that a CDS 
system could apply? 

6. CDS is often implemented poorly,
resulting in dissatisfaction, if not out- 
right annoyance. What are the human 
factors that need to be taken into con- 
sideration in implementing CDS effec- 
tively? Discuss issues and approaches 
to enhancing usability. What are situa- 
tions in which graphics and visualiza- 
tion might be used? How can CDS be 
used to enhance rather than to impede 
workflow? What are strategies to help 
avoid unintended consequences of 
poorly implemented CDS?


Digital Technology in Health Science Education


Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• How can computers improve the delivery
  of in-class and self-learning, as well
  as in-practice learning?

• How can different approaches to learning
  be implemented using computers?

• How can simulations supplement students’
  exposure to clinical practice?

• What are the issues to be considered
  when developing computer-based
  educational programs?

• What are the significant barriers to
  widespread integration of computer-aided
  instruction into the medical curriculum?


25.1 Introduction 

The application of digital technology to health
science education is a sub-field of biomedical
informatics. It includes the application of all
aspects of information and computer technology
to the content and delivery of education,
as well as to research on the improvement and
efficacy of education. Healthcare requires
constant learning, with its practice in a multidisciplinary
team environment in an information-rich
world. Digital technology offers new
approaches to learning that:

• increase engagement and retention of
  knowledge,
• allow personalization of knowledge
  delivery,
• enhance collaboration through connectivity,
• support learning any time and anywhere,
• make available the increasing volume of
  knowledge,
• support learning of evidence-based
  clinical practice, and
• enhance research through collection and
  analysis of large volumes of learner data.


In this chapter, we first discuss approaches to 
teaching and learning with digital technology. 
That section includes material about theories
of learning, digital technologies for learning
environments, and an overview of learner 
audiences. We then transition to digital learning
systems, which include learning management
systems, learning content creation
systems, just-in-time learning systems and 
performance support, usability and accessi- 
bility, interoperability standards, digital con- 
tent and assessment of learning. Finally, we 
discuss future directions and challenges for 
the application of digital technology in health 
science education. 


25.2 Approaches to Teaching 
and Learning with Digital 
Technology 


The continual rapid increase in health sci- 
ences knowledge requires a shift in learning 
methods by both the health sciences student
and the health professional. Decades
ago, memorization and recall of facts
were a primary, and sufficient, learning goal. 
Current learning approaches require learning 
the basic concepts and methods of a discipline 
but, in addition, emphasize the ability to inte- 
grate knowledge and to solve problems in the 
context of everyday healthcare. 


25.2.1 Theories of Learning 


Understanding how digital technology can 
support learning in the health sciences begins 
with an appreciation of how people learn. 
This understanding provides a foundation for 
thinking about the learning process and how 
it is shaped by context, purpose, goals, com- 
plexity, and the diversity of learners. 

At its most basic, learning involves a 
change in how a learner perceives and under- 
stands some part of their world. The term 
schema is commonly used, in cognitive sci- 
ence, to describe the cognitive frameworks 
that people use to organize the information
they have and their beliefs about a particu- 
lar concept, activity, or experience. Schemas 
help us quickly assess a situation and act
appropriately. When individuals encounter
new information, they try to incorporate it 
into their existing schema to enhance their 
understanding. 

If new information contradicts the learn- 
er’s existing knowledge or beliefs, the learner 
adjusts in one of two ways. They may accept 
the new information as valid, and modify their 
schema accordingly. Alternatively, they may judge
the new information as invalid, unimportant,
or irrelevant, and not adapt their
schema to account for this new information.
When strong existing schemas keep individuals
from accounting for new, valid information,
it can be very difficult for them to make
changes such as altering unsafe procedures or 
adopting new safety protocols. To encourage 
such knowledge and behavioral change, it is 
important to explicitly address existing mis- 
conceptions and support learner motivation 
to change. 

Changes in schema presume that indi- 
viduals actively construct their own meaning 
through the interactions they have with infor- 
mation, other people, and the environment 
around them. This constructivist approach 
argues that individuals are not blank slates 
and bring their own history and experience to 
every learning situation. It is also important 
to recognize that this construction is an inher- 
ently social and situated process. Often learn- 
ers are learning together in classes, groups, or 
teams. The conversations and shared expe- 
riences with others provide interpersonal 
cognitive and affective context for the learn- 
ing process, and can shape the direction and 
scope of knowledge construction. 

Digital technology can support a variety of 
approaches to learning. A common approach 
is didactic teaching, a one-way transfer of 
information through lectures, presented online 
via technologies such as recorded digital video. 
This approach has the advantage that new, as 
well as remedial, material can be made avail- 
able, with additional links to in-depth con- 
tent. A considerable amount of the available 
digital content is didactic, though it is usually 
enhanced with various activities for active 
learning. Active learning approaches, on the 
other hand, focus on engaging learners in the 
learning process by having them interact with
the content, with each other, and in reflection
on what and how they are learning. Research 
has shown that instructional approaches that 
promote active learning consistently outper- 
form transmission-only approaches at a statis- 
tically significant level (Freeman et al. 2014). 

Flipping classes is a relatively recent strategy
for encouraging active learning. Instructors
flip the class when they deliver online the instruction
that is traditionally given through in-class lecture
and devote class time to active learning,
which replaces the majority of traditional
homework. The homework becomes
doing what needs to be done to prepare for the 
in-person class. Flipped classes are similar to 
hybrid or blended classes where the seat time 
that would be used for lecture is focused on 
active learning instead. Faculty engage with 
learners through case discussions, problem 
solving, and deep dives to further understand 
the content learned outside the classroom. 


25.2.2 Digital Technologies 
for Learning Environments 


Much, if not most, of today’s learning con- 
tent is delivered digitally. This is evident in 
the tools that are used within and outside 
of the classroom, for on-the-job training, in 
specialized learning facilities and elsewhere. 
In the classroom, learning content such as 
PowerPoint slides, Prezi presentations, web- 
sites, simulations, games and other digital 
media are often projected using digital pro- 
jectors or delivered directly to the devices of 
learners, such as smartphones, tablets and 
laptops. Audience interaction methods, such 
as Web-based surveys, shared digital white- 
boards or student/audience response systems, 
can help learners interact with the learning 
content, the instructor(s) or fellow learners. 
The classroom technologies and device 
ecology accessible to teachers and learners 
can extend the learning experience seamlessly 
beyond the classroom. For instance, video- 
conferencing allows individuals to attend 
lectures remotely, raise their hand and ask a 
question. Depending on the teaching style, 
this remote participation may approximate 
face-to-face attendance fairly closely or fall
far short of it. Collaborative technologies,
such as instant messaging, group chats or
collaborative editing tools, can help bridge
physical separation, and enable efficient and
effective virtual group work.

Fig. 25.1 A life-size reconstruction of a digital
human is viewed in a horizontal computer screen
or “digital table”. With finger gestures, learners
can identify structures, remove layers of tissue to
make vessels, nerves and bones visible, or rotate
the body. Additional functions at the edges of the
table allow further viewing and measurement
functions. Clinical cases, with anatomy
reconstructed from radiologic images, such as CT
and MRI, allow study of actual cases. (Courtesy of
Anatomage Inc., with permission)

Such collaborative work can also happen 
asynchronously and provide flexibility (within 
limits) to learner schedules. Discussion lists, 
messaging boards and collaboration sites 
enable episodic, time-independent contribu- 
tions from and interactions with learners. 

Some time ago, using advanced technolo- 
gies, such as simulations, virtual reality and 
augmented reality, required a trip to a spe- 
cialized facility, such as a simulation center or 
“virtual reality cave.” However, with devices 
such as Oculus or HTC Vive, such experiences 
are now available almost anyplace. 

Last, social media such as Facebook, 
Google Hangouts, Twitter, Snapchat and 
Instagram have found important uses in edu- 
cation, ranging from real-time updates on 
projects to sharing experiences on a field-trip. 


25.3 Overview of Learner 
Audiences 


As discussed above, education in medicine, 
nursing, pharmacy, dentistry and other health 
professions is shifting from a focus on knowl- 
edge acquisition to competency-based edu- 
cation (Englander et al. 2013). Educational 
technologies are well positioned to support this
transition, but only if they take the specific context
of learners and their goals into account.
We therefore discuss learner audiences and 
their particular needs in the following sections. 


25.3.1 Undergraduate
and Graduate Health Care
Professions Students


Basic science programs in medical schools 
were among the first to implement technology- 
supported learning. Visually rich content in 
anatomy, neuroanatomy and pathology was 
much more accessible on the computer than 
through the microscope or via the cadaver 
dissection room (Fig. 25.1). Excellent 3D
learning programs for anatomy are available, 
such as Netter 3D Anatomy, Primal Pictures, 
VH Dissector, Anatomage Table,¹ and other
products, providing ever more accurate visual- 
ization of the human in three dimensions. The 
use of microscopes in fields such as histology 
and pathology education has virtually disap- 
peared. Interestingly, in many schools, the use 
of cadavers has seen a resurgence, both as an 
important learning tool and as a rite of pas- 
sage into the health care profession. 


1 Netter 3D Anatomy: http://netter3danatomy.com;
Primal Pictures: https://primalpictures.com;
VH Dissector: https://www.toltech.net;
Anatomage: https://www.anatomage.com

Nursing schools have moved quickly to
expand their use of technology in education. 
For instance, digitally enhanced physical
mannequins for simulation of realistic nursing
scenarios are widely used learning tools.
Many nursing schools that used to share a 
simulation center with a medical school have 
found their own demand high enough to 
require building their own simulation centers. 

Dental schools often share a part of their 
curriculum with medical schools, and as a 
result use the same or similar learning con- 
tent. However, they also need specialized ana- 
tomical and simulation content for dental and 
craniofacial topics. 3D software for dental 
anatomy is used widely in pre-doctoral den- 
tistry. Historically, simulation of dental pro- 
cedures was practiced using physical objects 
such as chalk or plastic teeth, a practice that 
is still widespread. More recently, high-fidelity
digital simulators have been developed. 

For many years, teaching hospitals had 
patients with interesting diagnostic problems 
such as “unexplained weight loss” or “fever of 
unknown origin”. This environment allowed 
for thoughtful “visit rounds,” at which the 
attending physician could tutor the students 
and house staff, who could then go to the 
library to research the subject. A patient 
stayed in the hospital for weeks as testing 
was pursued and the illness evolved. In the 
modern era of restricted insurance payments, 
managed care, and reduced length-of-stay, 
this opportunity for learning in the hospital 
environment has vanished for most junior
students. The typical patient in today’s teach- 
ing hospital is multi-morbid, usually elderly, 
and commonly acutely ill. The emphasis is on 
short stays, with diagnostic problems handled 
on an outpatient basis, and diseases evolving 
at home or in chronic care facilities. Thus, the 
medical student is faced with fewer diagnostic 
challenges suited to their level of knowledge, 
and has little opportunity to see the evolution 
of a patient’s illness over time. 

One response of medical educators has 
been to try to move teaching to the outpatient 
setting; another has been to use problem-based 
learning and computer-simulated virtual 
patients. Problem-based learning (PBL) is
a pedagogical approach in which each small
group of students is given a clinical problem
and engages in discussion to develop an
understanding of the problem, identify rel- 
evant knowledge, seek required knowledge 
using online and library research, discuss and 
challenge each other’s interpretations, and set- 
tle on a solution to the problem. PBL is widely 
used in undergraduate clinical learning, teach- 
ing students self-directed learning, reflection, 
and teamwork. An interesting analysis of 
PBL by a student is available in Chang (2016).
Computer-simulated patients allow a full 
range of diseases to be presented and allow 
the learner to follow the course of an illness 
over any appropriate time period. Faculty can 
decide what clinical material must be seen and 
can use the computer to ensure that this core 
curriculum is delivered. Moreover, with the 
use of an “indestructible patient,” the learner 
can take full responsibility for decision mak- 
ing, without concern over harming an actual 
patient by making mistakes. These simu- 
lated patients may be fully virtual, or may be 
computer-enhanced physical mannequins. 


25.3.2 Practicing Health Care 
Providers 


Health sciences education does not stop after
the completion of formal training. The science 
of medicine advances at such a rapid rate that 
much of what is taught soon becomes obsolete,
and it has become obligatory for health
professionals to be lifelong learners, both for 
their own satisfaction and, increasingly, as a 
formal requirement to maintain their profes- 
sional certification. Therefore, online courses 
and online certification examinations have 
become increasingly common for maintenance 
of certification. An additional advantage of 
online certification is the automatic tracking 
of learner performance, and the accompany- 
ing automatic generation of certificates and 
institutional compliance reports. 

Health professionals are also required to 
demonstrate clinical competence through 
their performance in simulated clinical sce- 
narios. Advanced Trauma Life Support and 
Advanced Cardiac Life Support are some of 
the areas in which clinical competence is demonstrated
through actual participation in on-site
scenarios. Online simulation of these and
other scenarios can be used as preparation for
testing in a live, crisis scenario. Some specialties,
such as anesthesiology, have developed
sufficiently rich online simulations in their
specialty, with real time assessment, that they
can reduce the requirement for use of live scenarios
(Fig. 25.2).

Fig. 25.2 A screenshot of the SimSTAT simulation
used for Maintenance of Certification by the American
Society of Anesthesiologists. In a simulation of an operating
room, the anesthesiologist ventilates the patient
who is lying on the operating table. The simulation is
viewed by the learner on a computer screen while the
learner plays the role of the anesthesiologist. The learner
guides the on-screen anesthesiologist to care for the
unconscious patient by clicking on desired interactions,
such as the equipment in the room or the icons at the
bottom of the screen. Through these interactions, the
learner can control the level of sedation, give medications,
fluids, and gases, and monitor the patient’s physiologic
status. The inset screen on the top left is a monitor
for the patient’s vital signs. The inset screen on the top
right is the display from the anesthesia machine. (Courtesy
of CAE Healthcare, Inc., with permission)


25.3.3 Patients, Caregivers,
and the Public


For those outside the health science professions,
there is no systematic way to learn how
to be a knowledgeable patient or home-based
caregiver, or how to effectively communicate
with a health care provider. For the provider
interacting with the patient, the need to understand
the patient accompanies the need to
problem-solve. These changes in health care
delivery are occurring slowly, and technology,
particularly simulation and role playing, will 
be part of this change (Zaharias ). 
Meanwhile, healthcare information is 
widely available to the general public, with the 
most reliable information being on web sites 
affiliated with federal library resources such 
as MedlinePlus, academic organizations,
professional societies or federal health agen- 
cies. Online courses, both free and paid, are 
also available, many from traditional universi- 
ties or other online education organizations. 
Interactive learning applications, however, are 
not widely available to the public, though they
are able to provide more engaging learning.

Improving the patient’s health literacy
has become an important approach to pro- 
viding higher quality health care. Failure to 
comply with medication regimens, exercise
plans, or hospital discharge instructions is
a major cause of return visits to the hospital 
or clinic. Patient and family caregiver edu- 
cation, leading to better understanding of 
clinical instructions, could result in a more 
effective partnership between the patient and 
the healthcare provider (Nelson 2016) (see
Chap. 11). Online learning resources are
one approach to alleviating the compliance 
problem. 

In the next section, we take a close look 
at how digital learning content is created and 
delivered. 


25.4 Digital Learning Systems 


25.4.1 Learning Content
Management Systems


A Learning Content Management System 
(LCMS) is a software platform that allows 
learning content creators, such as faculty, 
to create, manage, host and track changes in 
digital learning content. Prior to the develop- 
ment of LCMSs, educational content creators 
had to assemble several separate and disparate 
items to create rich, engaging learning content. 
LCMSs, on the other hand, are one-stop plat- 
forms that integrate a wide variety of tools for 
content creation. They overlap with Learning 
Management Systems (LMS, described below) 
in that both support content hosting and 
delivery. However, LCMSs specialize in tools 
to create, manage and update content. 
Personally created course content may be 
a web site or a blog, created by a faculty mem- 
ber on any of a range of web site or blog cre- 
ation tools. Other personal content creation 
tools include tools for quiz item development, 
capturing video lectures or demonstrations, 
and creation of interactive animations or 
games. The only requirement is that the con- 
tent files be compatible with the LMS, so that 
the content can be uploaded to the LMS and
deployed to all learners without any need for
special integration programming. 

Collaborative content creation can be 
another powerful approach to learning. As 
opposed to structured content that is created 
by faculty or a similar creator, collaborative, 
learner-generated content is informal and cre- 
ated on the fly, for instance in a discussion 
forum. Structured discussion groups which 
encourage students to provide their thinking 
on the discussion questions, and to comment 
on content from other learners, are useful 
tools to support learning through active par- 
ticipation, argument, and reflection. Video 
conferencing tools, such as Zoom and Skype, 
support real-time discussion and collabora- 
tion, and can be content creation tools if there 
is a repository of the content. 

When the content to be created is 
sufficiently complicated, requires strict adher- 
ence to organizational policy, or requires a 
range of skill sets and significant expense, it 
becomes necessary to approach it at the level 
of the enterprise. For example, the American 
Heart Association has developed courses for 
cardiac life support that are required training 
in the United States and many other countries.³
These programs are created by teams,
with each person providing a different skill, 
such as graphic design, programming, or con- 
tent knowledge, rather than by an informal 
collaboration of learners with similar skills. 


25.4.2 Learning Management 
Systems 


A Learning Management System (LMS) is a 
repository of learning content, an interface 
for delivering courses and content to learn- 
ers, and a platform for the course creator or 
administrator to track learner engagement 
and performance. From the learner’s view- 
point, the LMS provides a single login access 
to all courses that they may need. Once within 


3 American Heart Association’s ACLS, BLS, and PALS courses: https://elearning.heart.org/courses

a course, the learner can access content, such
as text, videos, quizzes, games, handouts and 
assignments. The LMS may include admin- 
istration features for the learner to select 
courses, register or join a wait list, pay for 
each course, and track their grades. It may 
also include resource sharing and collabo- 
ration between learners. From the faculty’s 
point of view, the LMS allows uploading and 
modification of course content, as well as a 
dashboard for viewing the performance of 
learners, groups and classes as a whole. LMS 
features may include various statistics, such 
as usage of course components or the perfor- 
mance of the class on individual test items. 
Higher education institutions provide a rel- 
atively structured curriculum to well-defined 
learner populations. Their needs typically 
are served by educational LMS applications 
such as Blackboard Learn, D2L Brightspace, 
Moodle, or Canvas (Dahlstrom et al. 2014). 
Medical centers and corporations often use 
LMSs that are more suited to corporate needs. 
Typically, the most important requirement of 
such an LMS is tracking learner compliance 
with required training, and export of reports 
for regulatory and accreditation purposes. 
There are numerous enterprise-oriented 
LMSs. Some common ones are Captivate 
Prime, TalentLMS, and Totara, but market 
leadership of LMSs changes continually. 
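
As a rough illustration of the tracking features described in this section, the following Python sketch computes two things an LMS dashboard might report: the proportion of learners answering each test item correctly, and the learners who have no recorded completion of a required course. The record layout and the names `responses`, `item_difficulty`, and `noncompliant` are assumptions made for this example; they do not reflect the data model of any particular LMS.

```python
from collections import defaultdict

# Hypothetical quiz responses: (learner_id, item_id, answered_correctly)
responses = [
    ("ana", "q1", True), ("ana", "q2", False),
    ("ben", "q1", True), ("ben", "q2", True),
    ("cho", "q1", False), ("cho", "q2", False),
]


def item_difficulty(rows):
    """Return the proportion of correct answers for each test item."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for _learner, item, ok in rows:
        total[item] += 1
        correct[item] += int(ok)
    return {item: correct[item] / total[item] for item in total}


def noncompliant(required_course, enrolled, completions):
    """List enrolled learners with no recorded completion of a course.

    `completions` is an iterable of (learner_id, course_id) pairs.
    """
    done = {learner for learner, course in completions
            if course == required_course}
    return sorted(set(enrolled) - done)
```

A corporate or medical-center LMS would typically export the second kind of list as a compliance report, while the first is closer to what a faculty dashboard shows for individual test items.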


25.4.3 Just-in-Time Learning 
Systems and Performance 
Support 


Learning also occurs outside of formal learn- 
ing contexts. This “just-in-time learning” hap- 
pens on demand, for instance when learners 
need instant information at a critical moment, 
or something goes wrong and they need to 
know what to do next. Instant information 
can provide immediate help, or performance 
support, but can also be considered a learn- 
ing opportunity. A particularly powerful 
approach is to make these tools accessible 
when and where they are likely needed, for 
instance in online help areas of electronic 
medical records. 
Examples of performance support or
just-in-time learning tools include:
- job aids, such as check lists, quick reference
  cards, and handouts;
- protocols and templates, such as the SBAR
  (Situation, Background, Assessment,
  Recommendation and Request) technique
  for communicating critical information;
- resource or policy documents;
- video and audio recordings, such as brief
  demonstrations of care procedures,
  particularly helpful for home-based care
  providers; and
- animations, simulations, and learning
  modules that include brief instructions,
  demonstrations, or explanations.


25.4.4 Interoperability Standards 


The education enterprise process includes 
many parts, such as content, curriculum, 
LMSs, learner profiles, assessment, certi- 
fication, and others. These parts are often 
supported by different tools and platforms 
that need to work together seamlessly. To 
enable this seamless environment, tools and 
platforms need to be interoperable, without 
the need for custom programming for integra- 
tion. In this chapter, we discuss some of the 
interoperability standards in education (see 
> Chap. 7). 

Historically, the commonly used stan- 
dard for such learning object interoperability 
was the Shareable Content Object Reference 
Model (SCORM).⁴ It defines how a content
object (a course) should be packaged and 
described, how it should be launched, and how 
data should be communicated between the 
LMS and the content package. A SCORM- 
compatible content object can be published 
to and played back from any SCORM- 
compatible LMS. SCORM can report on 
course completion and time spent. SCORM 
was last updated in 2009 and, as a standard, 
has not kept up with changing technology. 
However, it is still one of the most widely used 
standards for learning object interoperability. 
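
To make the LMS-to-content data exchange more concrete, here is a minimal sketch of the kind of conversation a SCORM package has with its host. In practice this exchange happens through a JavaScript API object in the learner's browser; the Python class below is only a toy stand-in that mirrors the call sequence. The call names (`LMSInitialize`, `LMSSetValue`, `LMSGetValue`, `LMSFinish`) and the element names (`cmi.core.lesson_status`, `cmi.core.session_time`) follow the SCORM 1.2 run-time conventions as best recalled here and should be checked against the specification. `MockScormApi` and `run_course` are hypothetical names.

```python
class MockScormApi:
    """Toy stand-in for the LMS side of a SCORM 1.2 style exchange."""

    def __init__(self):
        self.data = {"cmi.core.lesson_status": "not attempted"}
        self.active = False

    def LMSInitialize(self, _arg=""):
        self.active = True
        return "true"

    def LMSSetValue(self, element, value):
        if not self.active:
            return "false"  # this mock rejects writes outside a session
        self.data[element] = str(value)
        return "true"

    def LMSGetValue(self, element):
        return self.data.get(element, "")

    def LMSFinish(self, _arg=""):
        self.active = False
        return "true"


def run_course(api, seconds_spent):
    """What a content package might report when the learner finishes."""
    api.LMSInitialize("")
    hours, rem = divmod(int(seconds_spent), 3600)
    minutes, secs = divmod(rem, 60)
    # Time spent, reported as hours:minutes:seconds
    api.LMSSetValue("cmi.core.session_time",
                    f"{hours:04d}:{minutes:02d}:{secs:02d}")
    api.LMSSetValue("cmi.core.lesson_status", "completed")
    api.LMSFinish("")
```

The sketch covers only the two items the text mentions, course completion and time spent; packaging and launch are outside its scope.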


4 SCORM: https://scorm.com/

The newer xAPI standard (also known as the Tin Can
API)⁵ is much more robust in terms of analytics
and mobile deployment. The drawback to
xAPI is that it does not integrate with older 
LMSs and tends to be costly to deploy, often 
requiring professional development assistance. 
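
For readers unfamiliar with xAPI, its basic unit is a "statement" that records who did what to which learning activity, expressed as JSON. The sketch below builds one such statement in Python. The overall actor/verb/object/result layout and the ADL verb identifiers follow the published xAPI conventions, but the helper name `make_statement`, the example e-mail address, and the activity URL are assumptions for illustration only.

```python
import json

# Verb identifiers from the ADL vocabulary
VERBS = {
    "completed": "http://adlnet.gov/expapi/verbs/completed",
    "attempted": "http://adlnet.gov/expapi/verbs/attempted",
}


def make_statement(email, name, verb, activity_id, activity_name, score=None):
    """Build an xAPI-style statement as a plain dictionary."""
    stmt = {
        "actor": {"objectType": "Agent", "name": name,
                  "mbox": f"mailto:{email}"},
        "verb": {"id": VERBS[verb], "display": {"en-US": verb}},
        "object": {"objectType": "Activity", "id": activity_id,
                   "definition": {"name": {"en-US": activity_name}}},
    }
    if score is not None:
        # "scaled" is a score normalized to the range -1..1
        stmt["result"] = {"score": {"scaled": score},
                          "completion": verb == "completed"}
    return stmt


if __name__ == "__main__":
    example = make_statement(
        "rn@example.org", "R. Nurse", "completed",
        "https://example.org/courses/sbar-handoff",  # hypothetical activity
        "SBAR handoff", score=0.9)
    print(json.dumps(example, indent=2))
```

In a real deployment the statement would be sent to a Learning Record Store rather than printed; that step is omitted here.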

With the rapid growth in the use of 
learning management systems in higher and 
continuing education and workplace learn- 
ing, it has become critical to have interoper- 
ability standards that provide integration 
of web-based learning objects and applications.
Learning Tools Interoperability (LTI)⁶
is a standard developed by the IMS Global 
Learning Consortium to provide a means for 
seamless and secure pass-through of student 
credentials and grades between the LMS and 
the external application. LTI tools are usually
web-based applications, written in a server-side
language, that can serve a variety of purposes.
These include, but are not limited to, hosting and
serving video with quizzing; providing access to
interactive learning materials from textbook
publishers; and allowing learners to create media,
use specialized programs, and collaborate in
integrated development environments (IDEs),
whiteboarding and mind-mapping applications, or
videoconferencing.
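
The pass-through of learner identity and grades can be pictured as a set of named parameters that the LMS sends to the external tool at launch. The sketch below assembles such a parameter set in the style of the older LTI 1.1 form-based launch, as best recalled here; newer LTI versions carry comparable information in a signed token instead. Real launches are cryptographically signed, which this sketch deliberately omits. The helper names `build_launch` and `missing_fields` are hypothetical.

```python
# Fields treated as mandatory in this sketch
REQUIRED = ("lti_message_type", "lti_version", "resource_link_id")


def build_launch(resource_link_id, user_id, roles,
                 outcome_url=None, sourcedid=None):
    """Assemble unsigned LTI 1.1 style launch parameters."""
    params = {
        "lti_message_type": "basic-lti-launch-request",
        "lti_version": "LTI-1p0",
        "resource_link_id": resource_link_id,
        "user_id": user_id,          # opaque id, not the learner's login
        "roles": ",".join(roles),
    }
    if outcome_url and sourcedid:
        # Where, and under which record, the tool may report a grade back
        params["lis_outcome_service_url"] = outcome_url
        params["lis_result_sourcedid"] = sourcedid
    return params


def missing_fields(params):
    """Return the required launch fields that are absent or empty."""
    return [key for key in REQUIRED if not params.get(key)]
```

A tool receiving a launch would typically reject it if `missing_fields` is non-empty or if the signature check, not shown here, fails.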

Other interoperability standards address 
various education services. For example, the 
Medbiquitous⁷ organization has developed
a Curriculum Inventory Data Exchange 
Standard that is being used by the Association 
of American Medical Colleges to collect and 
collate curriculum data from all its medical 
schools and map these curricula to compe- 
tency requirements. Other Medbiquitous stan- 
dards include the Educational Achievement 
Standard to document learner competency 
used by numerous medical certification orga- 
nizations, and the Virtual Patient Standard to 
enable exchange of virtual patient simulations 
across institutions. 


5 xAPI: https://xapi.com/overview/

6 LTI: https://www.imsglobal.org/activity/learning-tools-interoperability

7 Medbiquitous standards: https://medbiq.org/standards


There has also been an interest in the 
exchange of education components, specifi- 
cally learning content modules, between insti- 
tutions. A number of content collections have 
been developed, with the most well-known 
being MERLOT. Standardized descriptions 
of learning objects, known as Learning Object 
Metadata, were developed, but exchanging 
and repurposing individual learning objects 
did not become commonplace. However, an 
interesting by-product was a standardized 
way to describe items in a content collection, 
to manage a library of learning objects and 
to track learner use of those objects. Another 
example of a repository of shared learning 
resources is the AAMC’s MedEdPORTAL,⁸
which is peer-reviewed and contains both 
patient cases and clinical scenarios. 


25.4.5 Usability and Access 


Usability and access are important consid- 
erations in developing educational software 
(see > Chap. 5). About 1 in 50 adults have 
some form of vision or hearing disability, and 
need alternate or augmented access to digital
learning content. Two standards, Section 508 of
the Rehabilitation Act (508)⁹ and the World Wide
Web Consortium’s Web Content Accessibility
Guidelines (WCAG),¹⁰ address use of digital
content by people with disabilities.

Section 508 specifies that digital informa- 
tion provided by or to the government must 
be accessible if there is “no undue burden”. 
In practice, designing accessible online con- 
tent requires use of techniques available in 
current web design, such as the “alt” text attribute
for graphics, and indications to make the user 
interface more visible or audible. Adherence 
to Section 508 becomes more difficult in more 
complicated applications, such as 3D immer- 
sive environments and dynamic simulations, 


8 AAMC’s MedEdPORTAL: https://www.mededportal.org

9 Section 508: https://www.section508.gov/manage/laws-and-policies

10 WCAG: https://www.w3.org/WAI/standards-guidelines/wcag/

requiring creative solutions to presentation
and interface requirements. 

WCAG is a set of formal guidelines on 
how to develop accessible web content. It does 
not address non-web digital content. 
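
One of the simplest accessibility checks, the presence of alternative text on graphics, can be automated. The sketch below uses Python's standard `html.parser` module to list images in a page that lack usable alternative text. It is a minimal illustration, not a conformance test: the class and function names are hypothetical, and the sketch flags empty alternative text even though an intentionally empty value can be appropriate for purely decorative images.

```python
from html.parser import HTMLParser


class AltTextChecker(HTMLParser):
    """Collect the sources of <img> elements without alternative text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        alt = attr_map.get("alt")
        # Flag images whose alt attribute is absent, valueless, or blank
        if alt is None or not alt.strip():
            self.missing.append(attr_map.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src of every image lacking alternative text."""
    checker = AltTextChecker()
    checker.feed(html)
    return checker.missing
```

Such a check covers only static web pages; as noted above, 3D immersive environments and dynamic simulations need other solutions.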


25.5 Digital Content 


Digital content, unlike a typical textbook
or lecture, can be interactive. Three levels of
interactivity are typically possible:

- Level 1: The content includes text, graphics
  and video but interaction is primarily
  through clicking to move to the next chunk
  of content. This level may include simple
  quizzes such as multiple choice or true/false
  questions. Much of digital learning
  content consists of web pages or applications
  that incorporate this style of expository
  material.

- Level 2: The content includes multimedia
  such as audio, video and animations. The
  interactivity supports simple puzzles and
  games, like sorting and matching. The cost
  of development is higher than for Level 1,
  but the content is more engaging.

- Level 3: The content presented is very rich,
  including realistic three-dimensional
  environments and characters, or even
  immersive virtual reality (Fig. 25.3).
  Interaction happens through games or
  simulations, with the content evolving
  based on the choices and decisions made
  by the learner.

Fig. 25.3 A screenshot of a Level 3 interactive application,
BattleCare. The learner can select tools from the
medical kit on the right, and drag them onto the simulated
patient to clean and compress the wound or to listen
to heart and lung sounds. (Courtesy of Innovation in
Learning, Inc., with permission)


25.5.1 Text/Image/Video Content 


Much of digital learning content consists of 
web pages or applications that incorporate 
expository material, using text, graphics and 
video, and Level 1 interaction. Although much 
of the focus of computer-based teaching is 
on the more innovative uses of technology to 
expand the range of available teaching modal- 
ities, computers can be employed usefully to 
deliver didactic material, with the advantage 
of the removal of time and space limitations. 
For example, a professor can choose to record 
a lecture and to store the digitized video of 
the lecture as well as related slides and other 
teaching material, and upload this content to 
the institution’s learning management system 
(see Section 25.4.2). This approach has
the advantage that relevant background or 
remedial material can also be made available 
through links at specific points in the lecture. 
The ease of creating online video lectures has 
led numerous universities and corporations to 
provide libraries of recorded lectures for study 
by learners at their own convenience. 

Many refinements have been developed 
that use technology to optimize the delivery of 
didactic or expository content. Microlearning 
is the presentation of brief segments of con- 
tent, typically ranging from 5 to 15 minutes 
in duration. Spaced repetition is the repeated 
presentation of select content to optimize its 
retention. Mastery learning is a process of 
testing the learner for competence in a seg- 
ment of content before they are allowed to 
progress to the next. The Khan Academy,¹¹
which includes healthcare content in its catalog,
uses brief videos to teach single or small
groups of concepts. In this microlearning
approach, the learner can select and complete


11 Khan Academy, Health and Medicine: https://www.khanacademy.org/science/health-and-medicine

topics with a limited investment of time, and
can demonstrate mastery. 
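
Spaced repetition and mastery learning are both, at heart, simple scheduling rules. The sketch below shows one possible form of each: a box scheme, loosely in the style of the Leitner system, in which a correctly recalled item is shown again after a longer interval, and a mastery gate that unlocks the next segment only after a passing quiz score. The interval values, the 80% threshold, and all names are assumptions chosen for illustration; they are not taken from any particular product or study.

```python
from dataclasses import dataclass

INTERVALS_DAYS = [1, 3, 7, 14, 30]  # illustrative values only


@dataclass
class Card:
    topic: str
    box: int = 0       # index into INTERVALS_DAYS
    due_day: int = 0   # day number on which the item is next presented


def review(card, today, correct):
    """Move the card up one box when recalled, back to the start otherwise."""
    if correct:
        card.box = min(card.box + 1, len(INTERVALS_DAYS) - 1)
    else:
        card.box = 0
    card.due_day = today + INTERVALS_DAYS[card.box]
    return card


def may_advance(quiz_scores, threshold=0.8):
    """Mastery gate: allow progress only after at least one passing score."""
    return bool(quiz_scores) and max(quiz_scores) >= threshold
```

A learner who keeps recalling an item sees it at widening intervals, while a missed item returns the next day; the mastery gate is independent of the schedule and would be checked before the next content segment is released.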

Massive Open Online Courses
(MOOCs) bring free-to-view, world-class
university courses to a global audience. The
first major MOOC, Introduction to Artificial
Intelligence, launched with an astonishing
160,000 subscribers. The structure of the
first courses was similar to a typical university
course, with lectures released at the same
time as they would be taught to an in-person
class on campus, along with assignments and
final examinations that needed to be turned
in on time. Course support was provided
by peers through student discussion
groups. Some MOOCs now require fees for
certification of completion of courses. Private
companies have sprung up providing support
to students around selected MOOCs, an
indication of the ecosystems that develop
around interesting technologies. EdX is a
MOOC delivery platform launched by the
Massachusetts Institute of Technology and
Harvard University. Coursera, Udacity
and FutureLearn are major private MOOC
platforms.¹²


25.5.2 Interactive Content 


Teaching programs differ in the degree to 
which they impose structure on a teaching ses- 
sion. In general, drill-and-practice systems are 
highly structured. The system’s responses to 
students’ choices are specified in advance; stu- 
dents cannot control the course of an inter- 
action directly. In contrast, other programs 
create an exploratory environment in which 
students can experiment without guidance or 
interference. For example, a neuroanatomy 
teaching program may provide a student 


12 Mulgan G, Joshi R. Clicks and mortarboards: how
can higher education make the most of digital technology?
November 2016. https://media.nesta.org.uk/documents/higher_education_and_technology_nov16_.pdf


with a fixed series of images and lessons on 
the brainstem, or it may allow a student to 
select a brain structure of interest, such as a 
tract, and to follow the structure up and down 
the brainstem, moving from image to image, 
observing how the location and size of the 
structure changes. 

Each of these approaches has advantages 
and disadvantages. Fixed path learning pro- 
grams ensure that no important fact or con- 
cept is missed, but they do not allow students 
to deviate from the prescribed course or to 
explore areas of special interest. Conversely, 
programs that provide an exploratory envi- 
ronment and allow students to choose any 
actions in any order encourage experimenta- 
tion and self-discovery. Without structure or 
guidance, however, students may waste time 
following unproductive paths and may fail to 
learn important material, resulting in ineffi- 
cient learning. 

An example is the Tooth Atlas, used in 
dentistry. Understanding the three-dimen- 
sional structure of teeth is important for clini- 
cal dentistry. The key instructional objective 
of the program is to help students learn the 
complex external and internal anatomy of 
the variety of teeth in three dimensions. The 
rich, interactive 3D visualizations show teeth 
as they would be visually perceived as well 
as through very high resolution computed 
tomography scans, radiographs and physical 
cross-sections. The learners can rotate and 
section the computed models, and can con- 
trol the transparency of each structure so as 
to study inter-relationships. While the visu- 
alization is highly exploratory, the embedded 
pedagogy is very structured, consisting of 
detailed textual quizzes with multiple-choice 
answers. 


25.5.3 Games 


A learning game places learning content 
within a digital video game. The game play 
experience engages and entertains the learner 
while certain steps in the game instill desired
content knowledge. In a learning game, the
learning content is embedded in the game.
Gamification, on the other hand, uses elements
such as a score, achievement badges,
or a “leader board” to add excitement to an
otherwise routine learning experience.¹³

A game has the following components: a 
goal, such as finding the best treatment for a 
patient; a setting, such as a three-dimensional 
rendition of an emergency department; game 
play, such as the information, tests, proce- 
dures, and medications available; and game 
mechanics, such as accessing game play ele- 
ments by drawing up medication in a syringe 
or selecting a medication dosage from a menu. 
Successful resolution of a clinical problem 
can give the same satisfaction as an enjoyable 
game. However, to be considered a game, there 
need to be challenges, such as conquering 
“enemies” or accomplishing “quests”, evolv- 
ing clinical problems, or a restricted avail- 
ability of supplies and personnel, that must 
be overcome, as well as a clear criterion of 
success. To be a learning game, actions during 
game play should result in learning, either by 
exposure of a nugget of information, by feed- 
back from a mentor embedded in the game, 
or by trying alternative medical approaches to 
find an effective treatment. 

Numerous learning games have been devel- 
oped for all aspects of healthcare education 
but the evidence for their efficacy is not clear 
(Gorbanev et al. 2018). Funding for efficacy 
research is limited, and is one reason for the 
paucity of rigorous studies (Reed et al. 2007). 
Game development that has a clear learning 
goal, and has been informed by research dur- 
ing the design stage as well as during devel- 
opment of the game play, has been shown 
to be both engaging and an effective learning
tool (Kato et al. 2008) (Box 25.1 and
Fig. 25.4a and b).


13 A leader board is a list of players with the highest 
scores. Players compete to be among the high 
scorers. 
25.5.4 Cases, Scenarios
and Problem-Based Learning 


The learner is presented with a story that 
includes a clinical problem. The presentation 
may be only in text, with text and graphics, in a 
near-realistic three-dimensional environment, 
or even in an immersive virtual environment, 
with correspondingly varying levels of inter- 
activity. The learner’s role may be constrained 
such that the learner knows who they represent, 
what resources are available, and what prob- 
lem must be solved. Alternatively, the learner 
may be required to investigate the situation 
(examine the patient), define the problem, find 
any supporting resources (what imaging and 
laboratory tests are available or what learning 
resources are at hand) and guide the scenario 
to an end goal. As the learner proceeds, the 
scenario evolves on the computer, based on the 
actions taken and the progress of time. 

Prognosis is a case-based program with 
over 500 cases covering most specialties 
(Fig. 25.5). Each case begins with a brief
story of the clinical presentation. The learner 
must choose among available tests, diagnoses 
and treatments, and then receives feedback on 
the choices made, as well as the preferred or 
optimal choices. The presentation and inter- 
activity are very simple, and the cases brief, 
but the engagement provided by the clini- 
cal puzzle has made this a popular program 
among medical students and residents. 

Box 25.1 “Re-Mission: Fighting Cancer with Video Games” (http://www.re-mission2.org/)
Re-Mission 2 games help kids and young adults with cancer take on the fight of their lives. Based
on scientific research, the games provide cancer support by giving players a sense of power and
control, and encouraging treatment adherence. Each game puts players inside the human body to
fight cancer with an arsenal of weapons and super-powers, like chemotherapy, antibiotics and the
body’s natural defenses. The game play parallels real-world strategies used to successfully destroy
cancer and win.

Re-Mission 2 games are designed to:

- Motivate young cancer patients to stick to their treatments by boosting self-efficacy, fostering
  positive emotions and shifting attitudes about chemotherapy. These factors were key
  drivers of the positive health behavior seen with the original Re-Mission game;

- Appeal to a broad audience by offering a variety of gameplay styles; and

- Tap into the popularity of casual games, playable in short bursts or at length, to provide
  cancer treatment support through fun, engaging play.

The games incorporate key insights from years of scientific studies and qualitative user research
with adolescent and young adult cancer patients. An outcomes study showed that the original
Re-Mission game improved treatment adherence and boosted self-efficacy in young cancer
patients. The Re-Mission Attitudes Study in the Brain used fMRI technology to show how
interactive gameplay impacts the brain to motivate positive behavior change (Kato et al. 2008).

Fig. 25.4 Screenshots from opening screens of
Re-Mission 2. a In “Nanobot’s Revenge,” players use
an ever-increasing arsenal of powerful chemo attacks
to crush the cancerous forces of the Nuclear Tyrant,
firing targeted treatments on a growing tumor to prevent
cancer from escaping into the blood stream. b In
“Nano Dropbot”, the player continues to kill cancer
cells but also learns to recruit healthy cells in the
fight

An approach that combines the benefits
of exploration with the constraint of a linear
path through the material is one that breaks
the evolving scenario into a series of short
vignettes. A situation is presented, information
and action options are available, and a decision
must be made. Each decision triggers the
presentation of the next vignette. This could lead
to a branching story line but, usually, the next
vignette presents the result of the best actions
from the previous vignette. A scenario about
a virtual patient could have vignettes that lead
the learner through the steps of diagnosis, tests,
and the course of treatment. This approach is
commonly used in computer-based testing of
clinical knowledge where assessment of learner
performance would be extremely difficult if the
interactions were completely unconstrained.

The ability of the computer to track
and store the learner’s actions allows
post-processing and analysis of the tracked data.
An interesting analytic capability is one that
compares the performance of novice learners
and experts to detect features that define
expert information gathering or action
sequences. Stevens et al. (1996) compared the
information gathering and the conclusions of
novices and experts on a set of immunological
cases. Using neural nets to process the tracking
data, they detected consistent differences
in the problem solving approach of novices
compared to experts. In particular, novices
exhibited considerably more searching and
lack of recognition of relevant information,
while experts converged rapidly on a common
set of information items. The potential
of using such expert patterns of performance
to educate novice learners has not been widely
explored in the health sciences.
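
The vignette-based scenario described in this section, together with the tracking of learner actions, can be sketched as a small data structure. In the sketch below, each vignette offers a set of options and names the best one; whatever the learner chooses, the run moves on to the next vignette (the linear path), while every choice is logged for later analysis. The class names, fields, and scoring rule are assumptions made for illustration and do not describe any specific product.

```python
from dataclasses import dataclass, field


@dataclass
class Vignette:
    prompt: str
    options: dict   # option text -> feedback shown to the learner
    best: str       # the option the next vignette assumes was taken


@dataclass
class ScenarioRun:
    vignettes: list
    position: int = 0
    log: list = field(default_factory=list)  # tracked learner actions

    def choose(self, option):
        """Record a decision, return its feedback, and advance."""
        vignette = self.vignettes[self.position]
        if option not in vignette.options:
            raise ValueError(f"unknown option: {option}")
        self.log.append((self.position, option, option == vignette.best))
        self.position += 1  # linear path: always go to the next vignette
        return vignette.options[option]

    def finished(self):
        return self.position >= len(self.vignettes)

    def score(self):
        """Fraction of vignettes in which the best option was chosen."""
        best_choices = sum(1 for _, _, was_best in self.log if was_best)
        return best_choices / len(self.vignettes)
```

Because every run produces a log with the same shape, logs from novices and experts can be compared directly, which is the precondition for the kind of analysis described above. A branching story line would replace the fixed `position += 1` step with a lookup keyed on the chosen option.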
Fig. 25.5 Screenshot from the case-based program,
Prognosis. The learner is presented with a summary of
the case. The simple graphic presents physical examination
information in a similar format for each case. The
following screen offers laboratory, imaging
and other investigation options. The learner selects the
management plan and receives summary feedback on
the success or failure of the plan. The case ends with a
review of the disease background and optimal management,
including relevant references. (Courtesy of Medical
Joyworks, LLC, with permission)


25.5.5 Simulations - Virtual
Patients


Clinical training has been shown to benefit from
the use of simulations to engage the learner
(Gaba 2004; Aebersold 2018; Jeffries 2005).
Learning is most effective when the learner is
engaged and actively involved in decision making.
The use of a simulated patient presented
by the computer can approximate the real-world
experience of patient care and focuses
the learner’s attention on the subject being presented
(Huang et al. 2007). The Association of
American Medical Colleges has prepared an
informational summary of the value of, and
the issues around, using simulation for
education.¹⁴

14 https://www.aamc.org/download/373868/data/technologynowsimulationinmedicaleducation.pdf

Talbot et al. (2012) present an analysis of the
range of presentation and interactivity available
in Virtual Patients. These simulations may
be either static or dynamic. Under the static 
simulation model, each case presents a patient 
who has a predefined problem and set of char- 
acteristics. At any point in the interaction, the 
student can interrupt data collection to ask the 
computer-consultant to display the differential 
diagnosis (given the information that has been 
collected so far) or to recommend a data col- 
lection strategy. The underlying case, however, 
remains static. Dynamic simulation programs, 
in contrast, simulate changes in patient state 
over time in response to students’ diagnostic 
or therapeutic decisions. Thus, unlike in static 
simulations, the clinical manifestations of the 
dynamic simulation can be programmed to 
evolve as the student works through them. 
These programs help students to understand 
the relationships between actions (or inac- 
tions), and patients’ clinical outcomes. To 
simulate a patient’s response to intervention, 
the programs may explicitly model underlying
physiological processes and use mathematical
models. An example of a dynamic
simulation of a virtual patient is SimSTAT
(see Fig. 25.2), an operating room simulation
that is used by the American Society of
Anesthesiologists to train practicing anesthesiologists
in the diagnosis and management of
crises in the operating room.

Virtual Patients can be as simple as
Prognosis, described above, or can be richly
complex, simulating a complete encounter
with a patient in a clinic or hospital room.
Simulation of a conversational interaction
with the patient or another character can be
an important aspect of learning using a virtual
patient (Fig. 25.6).

Fig. 25.6 Screenshot from Timeout Training, a
mobile training application. This illustrates a time-out
dialog between the learner (a resident) and the nurse,
prior to initiating a thoracocentesis intervention. The
learner selects from one of the presented dialog options.
Careful design of dialog options can help in learning
nuances of dialog possibilities as well as in proper
sequencing of a dialog. (Courtesy of Innovation in
Learning, Inc., with permission)
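
The difference between static and dynamic simulation discussed in this section comes down to whether the patient state changes over time in response to what the learner does. The sketch below is a deliberately simple dynamic model: each action or scripted event shifts a target heart rate, and the simulated heart rate moves a fixed fraction of the way toward that target at each time step (a first-order response). All numbers, event names, and function names are invented for illustration and are not physiologically validated; real simulators use far richer models.

```python
from dataclasses import dataclass


@dataclass
class PatientState:
    heart_rate: float = 80.0    # beats per minute
    target_rate: float = 80.0   # value the heart rate drifts toward


# Hypothetical effects of learner actions or scripted events
# on the target heart rate (beats per minute).
EVENT_EFFECTS = {"give_fluids": -15.0, "no_action": 0.0,
                 "simulate_bleed": 30.0}


def step(state, event, minutes=1.0, rate_constant=0.2):
    """Advance the simulated patient by one time step."""
    state.target_rate += EVENT_EFFECTS[event]
    # First-order response: close a fixed fraction of the gap per minute
    gap = state.target_rate - state.heart_rate
    state.heart_rate += gap * min(1.0, rate_constant * minutes)
    return state
```

In a static simulation the equivalent of `PatientState` would never change; here, inaction after a scripted bleed lets the heart rate keep climbing toward its new target, which is exactly the action-outcome relationship a dynamic simulation is meant to teach.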


25.5.6 Simulations - Procedures 
and Surgery 


Procedure trainers or part task trainers have 
emerged as a new method of teaching, par- 
ticularly in the teaching of surgical skills. This 
technology is still under development, and 
it is extremely demanding of computer and 
graphic performance. Early examples have 
focused on endoscopic surgery and laparo- 
scopic surgery in which the surgeon manipu- 
lates tools and a camera inserted into the 
patient through a small incision. In the simu- 
lated environment, the surgeon manipulates 
the same tool controls, but these tools control
simulated instruments that act on computer-graphic
renderings of the operative field.
Feedback systems inside the tools return pres- 
sure and other haptic sensations to the sur- 
geon’s hands, further increasing the realism of 
the surgical experience. 

Commercially available trainers are now 
in use for many surgical procedures. For 
example, the 3D Systems company provides 
a line of Simbionix simulators for training in 
laparoscopy, endoscopy, and hysteroscopy.¹⁵
Other simulators are now available for all lev- 
els of surgery, beginning with training in the 
basic operations of incision and suturing, and 
going all the way to robotic surgery. 


25.5.7 Simulations - Mannequins 


Physical simulations of a patient, in an 
authentic environment such as an operating 
room, have evolved into sophisticated learn- 
ing environments. The patient is simulated by 
an artificial mannequin with internal mecha- 
nisms that produce the effect of a breathing 
human with a pulse, respiration, and other 
vital signs (Fig. 25.7). In high-end simulators,
the mannequin can be given blood transfusions
or medication, and its physiological
response changes based on these treatments.
These human patient simulators are now used
around the world both for skills training and
for cognitive training such as crisis management
or leadership in a team environment
(Fig. 25.8). The environment can represent
an operating room, a neonatal intensive care
unit, a trauma center, or a physician’s office.
Teams of learners play roles such as surgeon,
anesthetist, or nurse, and practice teamwork,
crisis management, leadership, and other
cognitive exercises.

15 https://www.3dsystems.com/healthcare/medical-simulators

Fig. 25.7 This plastic mannequin simulates many of
the functions of a living patient, including eye opening
and closing, breathing, heart rate and other vital signs.
Gases, medications, and fluids can be administered to
this mannequin, with resulting changes to its simulated
vital signs. (Courtesy of Parvati Dev, with permission)

A seminal study by Hayden et al. (2014)
showed that as much as 50% of the clinical
training in a Bachelor’s nursing program could be
replaced by training on mannequin simulators. This is
particularly important both because of the 
range of cases that can be presented on the 
simulator, and because of the difficulty in 
obtaining clinical training time in hospitals. 


25.5.8 Virtual Worlds 


An extension of the physical human patient 
simulator is the virtual world simulation, with 
a virtual patient in a virtual operating room 
or emergency room. Learners are also present 
virtually, logging in from remote sites, to form 
a team to manage the virtual patient. Products
such as 3DiTeams and Health TeamSpaces are
being used to construct and deliver team training
in such virtual medical environments.16


Fig. 25.8 Three-dimensional computer-generated
virtual medical environments are used to present clinical
scenarios to a team of learners. Each learner controls a
character in the scenario and, through it, interacts with
devices, the patient, and the other characters. Learning
goals may include medical goals such as stabilization of
the patient, communication goals such as learning to
point out a problem to senior personnel, or team goals
of leadership and delegation. (Courtesy of Innovation
in Learning, Inc. with permission)


Fig. 25.9 A combined image depicting a learner
wearing virtual reality glasses, and the scene
visible to the learner. The learner feels she is inside
the operating room, viewing the procedure.
(Courtesy of SimTabs, LLC., with permission)


25.5.9 Virtual Reality 


The use of virtual reality glasses, along with 
spatially accurate sound and virtual hands, 
creates an immersive experience that surpasses 
the experience of a three-dimensional world 
as seen on a computer screen on the learner’s 
desk. The learner feels truly “inside” the expe- 
rience. The resulting immediacy is so real that 
it manifests itself through physiologic changes 
such as a speeding of the learner’s pulse and a 
total lack of awareness of the actual physical 
surroundings (Fig. 25.9).

There are two types of virtual reality in use 
at present. One is reality as represented by a 
completely synthetic three-dimensional envi- 
ronment, within which the learner navigates 
and acts. The other is represented as a 360° 
video of a real environment within which the 
clinical action has been recorded. The video 
virtual reality is useful for didactic training 
about procedures, such as new surgical meth- 
ods, where the learner has a front row view 
as though they were actually present in the 
operating room. 


16 https://anesthesiology.duke.edu/?page_id=825623, https://healthteamspaces.simtabs.com


Examples of simulations using virtual real- 
ity have been demonstrated by many universities 
and organizations. VR simulations of surgery 
(https://ossovr.com) and patient examination 
(https://oxfordmedicalsimulation.com) are in 
use in medical and nursing curricula. There are 
a few studies examining the learning efficacy of 
VR simulations (Kyaw et al. 2019). It is likely 
that the realism of virtual reality, its “face valid- 
ity”, will result in its use in education even if 
rigorous efficacy studies are not available. 


25.5.10 Augmented Reality 


Augmented reality (AR) differs from virtual 
reality in that a real world is seen through the 
AR glasses while other information is over- 
laid on the world by the glasses. Information 
can be textual, such as heart rate and blood 
pressure data when looking at a physical man- 
nequin. It can be graphic, such as an open 
wound seen overlaid on a person on a bed, 
simulating an injured patient. The AR over- 
lay information changes depending on what is 
being viewed, creating a world of information 
on top of the real visible world. 

Learning possibilities using AR are end- 
less. A new nurse can walk into an empty 
operating room and “see” the contents of 
closets and drawers, thus being trained on the 
location of OR supplies. A nursing student 
can see a “pressure sore” evolve on the heel 
of a real person because of pressure on the
heel in the bed. A medical student can “scroll” 
through the electronic medical record as they 
talk with a simulated patient. 

AR in medical education is in its infancy 
but its applications are expected to be wide- 
ranging. 


25.5.11 3D Printed Physical Models 


Three-dimensional (3D) printing is a novel
application of printing. Slice data from an
object, such as a CT image of a bone, is used to
print a layer of solid material, such as plastic. 
Subsequent image slices are used for printing 
a cumulative stack of plastic slices until the 
entire object has been printed. The advan- 
tage of sequential printing of slices, instead 
of carving the shape from a solid block of 
plastic, is that hollow portions of the origi- 
nal object can be printed as holes in the slice 
data. A second advantage is that very complex 
objects can be printed using this technology. 

3D models are beginning to be used for 
learning. For undergraduate students, cadaver 
dissection, plastinated specimens, and dried 
bone have provided the physical specimens. 
3D models add a new option. For healthcare 
practitioners, patient-specific 3D models help 
in planning procedures, but they also help 
in educating the patient about their upcom- 
ing surgery. In a recent systematic review on 
surgical planning for congenital heart dis- 
ease, the authors found that 25% of the stud- 
ies showed 3D printed models were useful in 
medical education for healthcare profession- 
als, patients, caregivers, and medical students 
(Lau and Sun 2018). 


25.6 Assessment of Learning 


Assessment of student learning compares 
educational performance with educational 
goals. Ideally, the content used for assessment 
resides within a curriculum and an educational 
program that has a clear set of educational 
goals. Therefore, student learning is measured 
(assessed) against these overall goals as well 
as against the goals of the specific learning
module. Without these goals, any assessment
is without direction and its purpose may be a
mystery to the learner.17

Assessment may be formative (guiding
future learning and promoting reflection) or 
summative (making a judgment about compe- 
tence or qualification before being allowed to 
advance to the next level of study). Assessment 
can also be used by instructors and educa- 
tion program designers to identify whether 
the learning content can be improved, and 
brought closer to the identified learning goals 
(Epstein 2007). 

Digital technology supports rich assess- 
ment both because of its ability to present 
many types of assessment tools, and because 
of its ability to track learner actions in great 
detail. Selected assessment methods are pre- 
sented below. 


25.6.1 Quizzes, Multiple Choice Questions, Flash Cards, Polls


Quizzes test the learner’s knowledge and, 
depending on the quiz format, the learner’s 
ability to solve problems. Quizzes can be pre- 
sented as questions with single or multiple 
correct answers, or may require sorting and 
matching two sets of items. Digital technol- 
ogy simplifies the process of preparing, pre- 
senting and scoring quizzes, and can make 
them engaging and fun by adding imagery, 
animations and game-like success states. 

Flash cards that present the question on 
one side of the card and the answer on the 
other can also be simulated using technol- 
ogy. The learner types their answer. Through 
simple word or phrase matching, the learner’s 
answer is matched to the expected answer, and 
scored based on the level of match achieved. 
For more complex answers, some level of nat- 
ural language processing is required. 
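
The word-matching approach described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the 0.8 threshold, and the scoring rule (the fraction of expected words found in the learner's answer) are hypothetical choices, not the method of any particular flash card product.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def match_score(learner_answer: str, expected_answer: str) -> float:
    """Fraction of the expected answer's words found in the learner's answer."""
    expected = tokenize(expected_answer)
    if not expected:
        return 0.0
    return len(expected & tokenize(learner_answer)) / len(expected)


def grade(learner_answer: str, expected_answer: str, threshold: float = 0.8) -> str:
    """Map the match score to a coarse grade (threshold is an assumed value)."""
    score = match_score(learner_answer, expected_answer)
    if score >= threshold:
        return "correct"
    if score > 0:
        return "partial"
    return "incorrect"
```

A scheme like this ignores word order, synonyms, and negation, which is why longer free-text answers call for natural language processing.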

17 AAHE, https://www.oxy.edu/sites/default/files/asset/AAHE9Principles.pdf


Polls are particularly useful for an instructor
to assess, in real time, the current status of
learner understanding in the classroom. The
poll question, and multiple answers, are displayed
on the classroom screen. Each learner
selects an answer on a smartphone or on a
polling device. The poll responses are immediately
collated and presented as a bar graph.
If all or most of the learners select the correct
answer, the instructor can assume that the
topic has been understood. If the responses
are distributed over two or more answers, then
the instructor can pause to review the topic
and clear up learner misconceptions or lack
of understanding.
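
As a rough sketch of the collation step, the following code counts poll responses, draws a text bar graph, and flags whether the topic may need review. The function names and the 70% cut-off for "most learners" are assumptions made for illustration; a real classroom response system would define its own rule.

```python
from collections import Counter


def collate_poll(responses, correct, review_threshold=0.7):
    """Count responses and flag the topic for review if too few are correct.

    review_threshold is an assumed cut-off, not a published standard.
    """
    counts = Counter(responses)
    total = len(responses)
    share_correct = counts[correct] / total if total else 0.0
    return {
        "counts": dict(counts),
        "share_correct": share_correct,
        "needs_review": share_correct < review_threshold,
    }


def bar_chart(counts):
    """Render the counts as a simple text bar graph, one option per line."""
    return "\n".join(
        f"{option}: {'#' * n} ({n})" for option, n in sorted(counts.items())
    )
```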


25.6.2 Branching Scenarios 


A branching scenario is a structured approach 
to assessment using simulation. A mini- 
scenario or a choice of data resources (such as 
laboratory tests) is presented at each branch 
point, and the learner chooses one out of a 
set of available decisions or responses. One or 
more of the decisions may be correct. Based 
on each consecutive decision, the learner 
moves through a branched scenario and 
achieves a final outcome to the scenario. The 
learner can be assessed on each decision or 
on the final outcome. If the same material is 
presented in a learning mode, the learner may 
receive feedback about each decision. 
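
One way to represent such a scenario is as a small graph of decision nodes. The sketch below is a minimal illustration: the class names, the point values, and the chest-pain content are all hypothetical and are not taken from any authoring tool or clinical guideline.

```python
from dataclasses import dataclass, field


@dataclass
class Choice:
    label: str
    next_node: str
    points: int = 0
    feedback: str = ""


@dataclass
class Node:
    prompt: str
    # A node with no choices is a final outcome of the scenario.
    choices: dict[str, Choice] = field(default_factory=dict)


# Hypothetical two-step scenario, for illustration only.
SCENARIO = {
    "start": Node("Patient reports chest pain. What do you do first?", {
        "ecg": Choice("Order an ECG", "ecg_result", 1, "Appropriate first step."),
        "discharge": Choice("Send the patient home", "bad_outcome", 0,
                            "Chest pain needs a workup."),
    }),
    "ecg_result": Node("The ECG is abnormal. Next step?", {
        "escalate": Choice("Call the senior clinician", "good_outcome", 1,
                           "Escalation was timely."),
        "wait": Choice("Repeat the ECG tomorrow", "bad_outcome", 0,
                       "The delay put the patient at risk."),
    }),
    "good_outcome": Node("Outcome: patient treated promptly."),
    "bad_outcome": Node("Outcome: treatment was delayed."),
}


def run(scenario, decisions, learning_mode=False, start="start"):
    """Follow the learner's decisions; return (outcome id, score, feedback)."""
    node_id, score, feedback = start, 0, []
    for key in decisions:
        node = scenario[node_id]
        if not node.choices:  # already at a final outcome
            break
        choice = node.choices[key]
        score += choice.points  # assessment of each decision
        if learning_mode:       # feedback is given only in learning mode
            feedback.append(choice.feedback)
        node_id = choice.next_node
    return node_id, score, feedback
```

The returned score supports assessment on each decision, while the final node identifier supports assessment on the outcome alone.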


25.6.3 Simulations 


Simulations for assessment may use stan- 
dardized patients (actors trained to represent 
patients), realistic interactive mannequins, 
on-screen simulations, or simulations pre- 
sented in virtual reality. In all cases, tech- 
nology can be used to track learner actions 
and to assess their performance (Ryall et al. 
2016). In all except standardized patients, this 
tracking is built into the simulation, and can 
be extracted and analyzed for performance 
reporting. These more complex, scenario-based
simulations differ from branching
simulations in that a large number of decision
choices are available to the learner at every 
moment. Thus the simulation is a more real- 
istic representation of a clinical situation but 
is also correspondingly more difficult to score 
for assessment (Dillon et al. 2002). 


25.6.4 Intelligent Tutoring, Guidance, Feedback


Intelligent Tutoring Systems (ITS) differ from 
other technology-based learning systems in 
that they offer individualization of the learn- 
ing experience based on the learner’s per- 
formance while using the system (VanLehn 
2011). Because of their architecture, continu- 
ous assessment of the learner is essential to 
the operation of ITSs, with guidance provided 
as needed, placing ITSs in the domain of for- 
mative assessment. Typically, an ITS is built to 
replicate one-on-one, personalized tutoring. 

Modern ITSs use natural language for 
dialog between the learner and the tutor 
(Fig. 25.10). Conversational dialog is more
likely to uncover learner misconceptions or 
gaps in knowledge. As the student responds 
to the tutor’s questions, the response is com- 
pared to the expected response using statistical 
methods that compare the conceptual similar- 
ity of the two pieces of text. An example of 
a conversational tutoring system is AutoTutor
(Graesser et al. 2004), which has been used for 
learning domains ranging from physics and 
mathematics to training nurses for mass casu- 
alty triage (Shubeck et al. 2016). 
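
To make the idea of statistical comparison concrete, the sketch below scores a learner response against an expected response using cosine similarity over word counts. This is a deliberately simplified stand-in: conversational tutors typically rely on richer semantic models, and the function names here are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Count the lowercase alphabetic words in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(response: str, expected: str) -> float:
    """Cosine of the angle between the two word-count vectors (0.0 to 1.0)."""
    va, vb = term_counts(response), term_counts(expected)
    dot = sum(va[term] * vb[term] for term in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0
```

A tutor built on this measure could accept a response above some similarity threshold and otherwise ask a follow-up question; the threshold would need to be tuned on real learner answers.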


25.6.5 Analytics 


Understanding and improving the process and
outcome of education requires applying metrics
to many facets of the educational process.
With digital technology, measurement
and the resulting data availability have increased
steadily. At the same time, educational institutions
and businesses are beginning to develop
methods to unlock potential uses of this vast
amount of data.


Fig. 25.10 Screenshot of a conversation with an
intelligent simulated tutor. The learner, a first
responder, converses with a tutor who uses natural
language to guide a student to give detailed
answers using their own words. (Courtesy of
Innovation in Learning, Inc., with permission)

A particularly desirable outcome is per- 
sonalizing learning to each learner’s needs. 
The many assessment methods described 
above can be applied to generate a profile of 
the learner’s current knowledge state and to 
create a detailed list of topics to be learned. 
To implement such an adaptive system, the 
content itself must be itemized and tagged so
that the learner’s state can be mapped to the 
desired learning goal state, and content items 
can be delivered in an appropriate sequence to 
achieve optimal learning. 
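
A minimal sketch of this mapping follows, assuming each content item is tagged with its prerequisite topics. The topic names, the `CONTENT` table, and the `learning_path` function are hypothetical; the code simply collects the goal topics and any unmastered prerequisites, then orders them so that prerequisites come first.

```python
from graphlib import TopologicalSorter

# Hypothetical tagged content: topic -> set of prerequisite topics.
CONTENT = {
    "cardiac_anatomy": set(),
    "cardiac_cycle": {"cardiac_anatomy"},
    "ecg_basics": {"cardiac_cycle"},
    "arrhythmias": {"ecg_basics"},
}


def learning_path(content, mastered, goals):
    """Return unmastered topics needed for the goals, prerequisites first."""
    needed = set()
    stack = list(goals)
    while stack:
        topic = stack.pop()
        if topic in mastered or topic in needed:
            continue  # already known, or already scheduled
        needed.add(topic)
        stack.extend(content[topic])
    # Keep only the prerequisite links among topics still to be learned.
    graph = {t: {p for p in content[t] if p in needed} for t in needed}
    return list(TopologicalSorter(graph).static_order())
```

In practice the learner's mastered set would come from the assessment methods described earlier, and the ordering rule could weigh difficulty or recency rather than prerequisites alone.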

Data analytics can also be applied to indi- 
vidual courses, to identify topics most sought 
by students, and areas in which testing shows 
that students consistently fail. Such failure 
may indicate students’ lack of knowledge, but 
it could also provide a clue to areas in which 
teaching could be improved. 
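
Course-level analysis of this kind can start from something as simple as a per-topic failure rate. The sketch below assumes test results are available as (student, topic, passed) records; the record layout, function names, and 50% threshold are illustrative assumptions.

```python
from collections import defaultdict


def failure_rates(results):
    """Compute the failure rate per topic from (student, topic, passed) records."""
    attempts = defaultdict(int)
    failures = defaultdict(int)
    for _student, topic, passed in results:
        attempts[topic] += 1
        if not passed:
            failures[topic] += 1
    return {topic: failures[topic] / attempts[topic] for topic in attempts}


def flag_topics(results, threshold=0.5):
    """List topics whose failure rate meets or exceeds the assumed threshold."""
    rates = failure_rates(results)
    return sorted(topic for topic, rate in rates.items() if rate >= threshold)
```

A flagged topic says only that students are failing, not why; deciding between a gap in student knowledge and a weakness in teaching still requires review by the instructor.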

At the institutional level, analytics is used 
extensively to match community and business 
needs to the design of degree and certificate 
programs by universities. Businesses also use 
similar analytic approaches to discover knowl- 
edge gaps among their workers, and to design 
programs to develop or upgrade worker skills 
as changes occur in their industries. 


25.7 Future Directions and Challenges


As this chapter has shown, computers have 
played, and will continue to play, an increas- 
ingly important role in health sciences edu- 
cation. How will the rapid change and fluid 
nature of innovation influence how we use 
technology in education in the future? As we 
increasingly “digitize” almost all aspects of 
our lives, we can expect information technol- 
ogy to continue to weave itself more and more 
into the essential fabric of how we teach and 
learn. 

How can digital technology help advance 
teaching and learning? Most faculty have 
embraced, or at least accepted, technology’s 
growing role in education. Students often 
have higher expectations of technology use 
than most health sciences schools can fulfill. 
How computers can help improve education 
is a key question of interest to faculty and 
students alike. Faculty members are keenly 
interested in finding out how technology can 
help them become better teachers. Students 
want to know how computers can help them 
learn more efficiently and effectively. Current 
trends in digital learning indicate how some 
of these questions will be answered (Adams
Becker 2017).18 The following are examples of
some of the challenges that we can expect to 
encounter. 

- Digital content production and verification
remains an ongoing challenge. Digital
learning content can range from inexpensive
recording and streaming the video of
a single lecture to very expensive and time-consuming
creation of a rich and dynamic
simulation of a disease process. Effective 
curation and distribution of high-quality 
content remains a challenge, with some 
healthcare faculty being reluctant to use 
content developed at other universities. An 
emerging trend that may increase use of 
existing content is to apply the methods of 
the “flipped classroom” to MOOC-based 
online courses. In this method, selected 
online content, from MOOCs or other 
sources, is assigned for study at home, and 
group time is used for instructor-led con- 
tent discussion and problem solving. Such 
approaches can combine the best of online 
content with the strengths of classroom 
teaching by faculty, and it is possible that 
such hybrid or blended classes will become 
increasingly common. 

- Learning analytics is a direct outcome of
digital learning content and learning man- 
agement systems. An immediate challenge 
is to utilize the available data to improve 
the healthcare education process at the 
level of the individual, the course, the cur- 
ricula, and the institution, and to match 
this education process to the needs of 
today’s healthcare. A more far-reaching 
challenge is to use data as evidence to 
understand what works and why. In par- 
ticular, we need to understand the best 
approaches to blended online and face-to- 
face learning, the uses of collaborative and 
project-based learning, and the role of 
simulations and experiential learning. 


18 Adams Becker S, Cummins M, Davis A, Freeman
A, Hall Giesinger C, and Ananthanarayanan V.
(2017). NMC Horizon Report: 2017 Higher Education
Edition. Austin, Texas: The New Media Consortium.
http://cdn.nmc.org/media/2017-nmc-horizon-report-he-EN.pdf


- Real-time feedback. Significant portions
of pre-clinical training in healthcare
require use of simulators. With embedded 
sensing and compute capability, and inter- 
net access, these simulators will become 
capable of real-time monitoring of learner 
performance. Display of this data on a 
performance dashboard will allow both 
learner and teacher to observe flaws in per- 
formance and for the teacher to provide 
appropriate guidance at the time it is 
needed. With the addition of intelligent 
tutors that are built into the simulator, the 
learner can receive needed feedback by 
using the simulator at any time of the day. 
Similarly, we can challenge ourselves to 
understand and implement intelligent, 
real-time feedback into all aspects of 
healthcare learning. 

- Artificial intelligence and adaptive learning.
Understanding and engaging with each 
student’s success at the course level is the 
domain of the individual faculty, and 
remains a challenge for the application of 
appropriate digital technology. 
Implementation of adaptive learning, that 
is, adapting the presentation of learning 
content in response to continuous assess- 
ment of learner performance, will be an 
essential next step. We can expect student 
performance to be tracked, and personal- 
ized exercises and assessments presented, 
so that they can understand their strengths 
and weaknesses, and can request digital or 
in-person help they need for success. 

- Learning Management Systems will see significant
evolution. Currently they are narrowly
focused on the administration
aspects of learning, ensuring that learners 
are aware of courses needed for their pro- 
gram, delivering course material with the 
appropriate sequence and timing, and 
checking when these courses have been 
completed. In the future, the challenge will 
be for LMSs to go beyond administration, 
and to support student learning. In partic- 
ular, for healthcare education, LMSs will 
be required to support mastery- and com- 
petency-based education, with detailed 
tracking of concept and skill acquisition. 

The topics presented above are only a small
selection of the interesting challenges in
future healthcare education. Journals such as
“Academic Medicine” and “Computers and
Education”, and websites such as Educause.edu,
periodically discuss these and other challenges
in more depth.


25.8 Conclusion 


Digital learning is widespread in healthcare 
education and has proven to be both effective 
and engaging. Digital content ranges from 
basic web pages to highly immersive inter- 
active 3D virtual spaces. Digital support of 
learning uses learner tracking to assess per- 
formance and to advise and guide the learner 
towards optimal learning outcomes. Artificial 
intelligence and adaptive learning methods 
have the potential to personalize learning, and 
to provide the institution with detailed under- 
standing of how to support each learner as 
well as how to align educational approaches 
with institutional goals. Simulators, for hands- 
on procedures and for diagnosis and commu- 
nication, will provide a learning environment 
that parallels the student’s progress through 
the real clinical environment, providing safe, 
realistic practice before learners must use 
that knowledge on real patients. Virtual and 
augmented reality will make these simulated 
environments and tools appear and feel real- 
istic, while providing the content scaffolding 
and mentoring that may not be available in 
the real clinical environment. Next generation 
learning management systems will provide the 
administrative infrastructure to support the 
student as they progress through their educa- 
tional program, deliver personalized learning 
to each student, and offer detailed dashboard 
information to both faculty and institutional 
administration. 

To realize the full potential of digital 
learning, there must be significant invest- 
ment in further development of digital learn- 
ing technology and content. There must also 
be effort to develop faculty and staff so that 
they move beyond simply using technology to 
understanding how to make each technology 
elicit the desired learning outcome. 

This is an exciting time in digital learning
capabilities. It is an even more exciting time to 
solve the many challenges ahead so as to move 
towards high performance learning systems. 


Suggested Readings

Bligh, D. A. (2000). What’s the use of lectures? 
San Francisco: Jossey-Bass. In this book, the 
author analyzes the best use of the lecture as a 
teaching method, and what lectures fail to 
teach. 

Bransford, J. D., Brown, A. L., & Cocking, R. R.
(2000). How people learn: Brain, mind, experience,
and school. Washington, DC: The
National Academies Press. This National 
Research Council book synthesizes many find- 
ings on the science of learning, and explains 
how these insights can be applied to actual 
practice in teaching and learning. 

NMC Horizon Report: 2017 Higher Education
Edition. Available at:
https://library.educause.edu/resources/2018/8/2018-nmc-horizon-report.
This annual report highlights issues,
trends and technologies in education.

Talbot, T. B., Sagae, K., John, B., & Rizzo, A. A. 
(2012). Sorting out the virtual patient: How to 
exploit artificial intelligence, game technology 
and sound educational practices to create 
engaging role-playing simulations. 
International Journal of Gaming and
Computer-Mediated Simulations, 4(3), 1-19. 
This paper is a good overview and analysis of 
the many methods of simulating a patient. 


Questions for Discussion

1. In developing effective educational 
interventions, you are often faced with 
a choice of instructional methods. 
Which of the instructional methods 
listed below would best match the 
instructional goals listed? Please justify 
your selection. Note: For some 
instructional goals, more than one 
instructional method might be 
appropriate. 


Instructional Goal 

1. Be able to intubate an unconscious 
patient 

2. Memorize the terminology used in 
neuroanatomy 
3. Recognize the symptoms of a patient
with probable mental illness 

4. Describe the pathophysiological pro- 
cess of hypertension 

5. Detect histopathologic variations 
on histology slides 


Instructional Method 

1. Case-based scenarios that include 
video 

2. Physical simulation with computer- 
based feedback 

3. Didactic material that includes text, 
images and illustrations 

4. Intelligent tutoring system 

5. Drill-and-practice program 


2. You are developing a software
application for interprofessional
education to teach participants about 
managing patients with advanced Type 
2 Diabetes. Your audience includes 
students representative of the clinicians 
who are typically involved in the care 
of such patients: primary care 
physicians, specialists such as 
ophthalmologists and podiatrists, and 
nurses. Your software application is 
focused on the care of individual 
patients, and you have put together a 
set of clinical case studies as a basis. 
How could you leverage current 
collaborative technologies to help the 
team manage each case in a way that 
resembles what they would do in real 
life? 

3. Select a topic in physiology with which
you are familiar, such as arterial blood-gas
exchange or filtration in the kidney,
and construct a representation of the 
domain in terms of the concepts and 
sub-concepts that should be taught for 
that topic. Using this representation, 
design a teaching program using one of 
the following methods: (1) a didactic 
approach, (2) a simulation approach, or 
(3) a game approach. 

4. You are a junior faculty member at a
major medical center and you just were
appointed director for a course on clinical
patient examinations. You decide to
check out several sharing sites for curricular
material, such as MedEdPORTAL,
to try to find relevant teaching materials. 
What kind of issues/problems would you 
expect in integrating material from those 
sites in your course? 

5. As Chief of Quality Improvement at 
the Veterans Administration, you are 
attempting to improve fairly poor out- 
comes of patients with Post-traumatic 
Stress Disorder (PTSD). You would like 
to develop a computer-based educa- 
tional tool for patients and caregivers to 
help them cope with PTSD. Most of the 
patients and caregivers are quite unfa- 
miliar with the disorder, and health 
literacy varies widely in your target 
audience. In conceptualizing your 
approach, you are focused on the fol- 
lowing questions: 


(a) What are the instructional goals of 
the program? 

(b) What kind of digital content should 
you use? 

(c) How do you assess baseline 
knowledge of patients and 
caregivers about PTSD, and how 
do you measure knowledge gains 
after they have used the program? 
Translational Bioinformatics


Learning Objectives

After reading this chapter, you should know
the answers to these questions:

- How does translational bioinformatics
differ from the more general field of
bioinformatics?

- What do T1 and T2 refer to in the context
of translational research?

- What is a biomarker, and why is it
important in medicine?

- What is precision medicine, and how
does it differ from traditional medical
practice?

- What is the difference between pharmacokinetics
and pharmacodynamics?

- What is the difference between statistical
significance and clinical significance?

- How are genomic data being used today
in research, clinical care, and consumer
health?

- What are some ethical issues surrounding
genomic medicine?

- How are ontologies useful in translational
bioinformatics?


26.1 What Is Translational Bioinformatics?


Chapter 9 described the field of bioinformatics,
or the study of how information from
biological systems is represented and analyzed.
Translational Bioinformatics (TBI) is
bioinformatics applied to human health and
disease. It uses and extends the concepts and
methods from bioinformatics to facilitate the
translation of biological (“bench”) discoveries
into actual impact on clinical care (“bedside”)
and ultimately on population health
(Fig. 26.1). Translational bioinformatics
lies at the intersection of bioinformatics and
clinical informatics, applying informatics
methods to increasingly voluminous omics
data (genomics, transcriptomics, epigenomics,
metabolomics, and proteomics data) to
improve clinical care and health outcomes
through the advancement and practice of precision
medicine (see Chap. 28). In this chapter,
we describe key concepts and methods in
TBI, summarize TBI data-related resources,
and introduce the concept of precision medicine,
which is enabled by TBI and covered in
greater depth in Chap. 28. We conclude
with a discussion of challenges and future
directions for the field.


...ics on the “bench” side of the T1 barrier and health
informatics on the “bedside” end of the spectrum. Novel
methods for storage, analysis, and interpretation span
the spectrum from data to knowledge. (Adapted from
Sarkar et al. (2011). Creative Commons CC BY-ND
License)


26.1.1 Differences from “Traditional” Bioinformatics


TBI differs from the larger field of bioinfor- 
matics in a number of key ways. As described 
above, the focus of TBI is human health. As 
such, the discipline centers primarily, though 
not exclusively, around human data. This 
fact has a number of implications from an 
informatics perspective. First, one encounters 
a range of data management, regulatory, and 
privacy issues that do not arise in handling 
data from model organisms such as mice, yeast, 
or Escherichia coli. Laws such as the Health
Insurance Portability and Accountability
Act (HIPAA)1 (see Chap. 12) dictate how
patient data must be handled and safeguarded
to protect patient privacy. Title 21
of the Code of Federal Regulations Part 11
(21 CFR part 11)2 mandates how data must
be managed if they are to be included as
part of a submission to the Food and Drug
Administration. In addition, institutional
review boards (IRBs) typically require measures
to ensure safety and confidentiality of
human subjects before they will approve a
research protocol. Making complete datasets
publicly accessible for a mouse experiment is
good scientific citizenship. Making the same
type of data accessible for a human study,
without approval, could be a serious violation
of privacy and confidentiality.


1 http://aspe.hhs.gov/admnsimp/p1104191.htm
(Accessed 30/11/2012).


870 J.D. Tenenbaum et al.

Another difference is that while experimental
perturbations through small molecule
agonists or antagonists, siRNA, or knockout
genes are straightforward and common
in yeast or E. coli, such approaches would 
be neither feasible nor ethical in human sub- 
jects. This has significant implications for data 
generation and collection in translational 
research. Phase I clinical trials are the notable 
exception to this rule, but they are performed 
only on ostensibly therapeutic agents. They 
also require a number of preliminary steps, 
are very expensive, and are performed in a 
very small number of subjects. Other fac- 
tors that differentiate research with human 
subjects include genetic and environmental 
heterogeneity, which can be controlled in 
model organisms. Instead, much translational 
data from human beings comes from in vitro 
experiments on cell lines and observational 
inquiries regarding factors such as genotype, 
environmental factors, and outcomes. With so 
much inherent noise, very large sample sizes 
are typically required for new discoveries. 
Novel approaches to data integration, mining, 
and re-use are thus particularly important in 
translational research. 


2 http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/cfrsearch.cfm?cfrpart=11
(Accessed 30/11/2012).


26.2 The Rise of Translational Bioinformatics


26.2.1 Promise of the Human Genome Project


In February of 2001, two different groups
published their draft sequences of the
human genome (see Chap. 9). The public
project, published in Nature, was based on 
multiple individuals (Lander et al. 2001). The 
other genome, published in Science, was a pri- 
vate venture, performed on the DNA of biol- 
ogist and entrepreneur Craig Venter (Venter 
et al. 2001). The vision for the human genome 
was that once all the genes were identified, 
they could be assigned functional annotations, 
and we would thus be able to understand what 
goes wrong when human beings succumb to 
disease. Additionally, this knowledge would 
help us to understand exactly which pathways 
and molecules needed to be targeted in order 
to prevent or cure disease. Of course, bio- 
logical reality is not quite so straightforward. 
To begin with, the “central dogma” of biol- 
ogy (Crick 1970)—DNA is transcribed into 
mRNA, which is then translated into pro- 
tein—is overly simplistic. Variations in regula- 
tory regions can affect when the gene is turned 
on, and to what degree. Most genes have a 
number of different splice variants, producing 
a number of different proteins. In addition, 
proteins undergo post-translational modifications,
which impact their structure and function.
Finally, additional complexity is added
through epigenetics, or heritable traits that
are not encoded in the DNA sequence
alone. An example is methylation of the DNA
molecule, which has been shown to affect 
transcription (Cedar 1988). Despite this, the 
sequencing of the complete human genome 
marked a decisive turning point in biomedical 
research. The parts list had been assembled 
and researchers could move on to the more 
interesting aspects of the genome: what each
part does, how the parts differ among individuals,
and what it all means. The impact
this would have on the field of medical informatics
was recognized immediately, reflected
in the theme for the 2002 AMIA³ Annual
Symposium: “Bio*Medical Informatics: One
Discipline” (Tarczy-Hornoch 2007).


26.2.2 What Is Translational 
Research? 


In the early 2000s, there was growing 
acknowledgement that the population at 
large, and patients in particular, were not 
reaping the full benefits of the considerable 
amount of research money being devoted to 
scientific discovery. It was recognized that 
researchers do not do a good job translating 
their discoveries “from bench to bedside,” 
i.e. bridging biological discoveries in the 
lab and clinical application of the findings 
(Lenfant 2003). Two significant roadblocks 
were initially identified (Fig. 26.2): one
in translating discoveries into clinical care
guidelines (dubbed T1 translation), and the
other in translating clinical care guidelines
into actual practice (T2 translation) (Sung
et al. 2003). Additional “T’s” have been 
devised more recently, and definitions refined 
to be more granular, e.g. splitting out early 
and late phase trials and knowledge dissemi- 
nation vs. knowledge application (Waldman 
and Terzic 2010). In 2004, the National
Institutes of Health (NIH) launched the
Roadmap for Medical Research, aimed at
transforming life science research in the
twenty-first century. Biomedical informatics
plays a strong role across all three of the
major Roadmap themes: New Pathways to
Discovery, Research Teams of the Future,
and Reengineering the Clinical Research
Enterprise (Zerhouni 2006). As part of this
Roadmap, the NIH embarked on a major initiative
to break down translational barriers
through a new funding mechanism known as
the Clinical and Translational Science Award
(CTSA). This award was aimed at major academic
medical centers and their partners with
the goal of improving translational research
to get treatments to patients quickly.

Fig. 26.2 Translational continuum (figure labels: scientific
discoveries into changes in guidelines; adoption of new clinical
practice guidelines by providers, regulators, funders, etc.)

3 AMIA is the American Medical Informatics Association,
Bethesda, MD: > http://www.amia.org

It was in this context, with newfound 
attention to translational research, that Butte 
and Chen coined the term “translational 
bioinformatics” at the AMIA annual sym- 
posium in 2006 in a paper entitled “Finding 
Disease-Related Genomic Experiments 
Within an International Repository: First 
Steps in Translational Bioinformatics” (Butte 
and Chen 2006). AMIA added TBI as one 
of its key supported domains and in 2008 
held its first annual Summit on Translational 
Bioinformatics. Later that year, the Journal of 
the American Medical Informatics Association 
(JAMIA) published a perspective on TBI’s 
“Coming of Age” that enumerated several 
reasons why the time was right for TBI to 
come into its own as a field (Butte 2008). In 
2009, the editors of the Journal of Biomedical 
Informatics published an explicit Editorial to 
announce a change in the journal’s editorial 
policy to “focus its bioinformatics attention 
on innovations in the area of translational bio- 
informatics” (Shortliffe et al. 2009). 




872 J.D. Tenenbaum et al. 


26.2.3 Precision Medicine 
as a Driving Force 


Precision medicine (and its cousins: personalized
medicine, genomic medicine, stratified
medicine, individualized medicine, etc.) is
health care that is based on an individual’s
unique clinical, genetic, omic, and environmental
profile, in addition to his or her specific
values and preferences. In 2004, Lee
Hood coined the term “P4 medicine”: predictive,
personalized, preventive, and participatory
(Weston and Hood 2004). Based on an
individual’s specific risk factors, interventions
or changes in lifestyle could be adopted
before the person falls ill, improving quality
of life and saving significant costs in health
care spending. Armed with this individualized
knowledge, patients would be empowered
to play an active role in their own health
and medical care. Quality medical care has
never been one-size-fits-all; precision medicine
acknowledges this fact and seeks to
change the practice of clinical care accordingly.
Precision Medicine is discussed further
in > Chap. 28.


26.3 Key Concepts for Translational
Bioinformatics


As noted in the definition above, TBI involves
the development of novel methods for the
storage, analysis, and interpretation of
molecular data to guide clinical care. In this
section we elaborate on these different levels
of informatics methodologies, which can be
framed as falling along a spectrum from data
to knowledge (Fig. 26.1). Data represent
specific values; at the simplest level, they can
be reduced to ones and zeros. In the middle
of the spectrum is information, which ascribes
new meaning to the data at hand through analysis.
Finally, we arrive at knowledge: the ability
to interpret information in a specific context,
and for that interpretation to guide actions
and behavior.


26.3.1 Data Storage
and Management


Data are stored at a number of different lev- 
els corresponding to different stages along the 
translational pipeline. At the “bench” end of 
bench-to-bedside, there is the need to store 
massive files of raw data generated through 
omics technologies (Stein 2010). In the case 
of genome sequencing, these files can be so 
large that it has been suggested (though not
necessarily concluded) that for easily regenerated
samples, it might be more cost-effective
to discard the raw data and, if necessary,
re-sequence at a later time (Hsi-Yang Fritz et al.
2011). For each raw data file type, one can 
generally choose among several different pro- 
cessing tools or algorithms. Thus, in addition 
to the raw data, a researcher or core facility 
may want to store one or more versions of 
processed data files, still frequently very large 
in size. In addition to the actual data, experi- 
mental metadata are needed in order to under- 
stand how the data were generated and how 
they were processed or analyzed. Annotation 
facilitates both comprehension and data 
provenance. Unfortunately, that information 
is rarely standardized, and frequently stored 
only in the researcher’s head, paper notes, or 
hard drive. Standards and tools such as the 
Ontology of Biomedical Investigations (OBI) 
(Brinkman et al. 2010), Minimum Information 
lists (Taylor et al. 2008) and the Investigation/ 
Study/Assay (ISA) infrastructure (Rocca-Serra
et al. 2010) (see > Chap. 9) have been
developed to address this issue. Guidelines 
and best practices were formalized in the 
“FAIR” framework: making data Findable,
Accessible, Interoperable, and Reusable
(Wilkinson et al. 2016). A website called
> biosharing.org, which had evolved out of
the MIBBI initiative (Minimum Information
for Biological and Biomedical Investigations),
further evolved into > FAIRsharing.org.⁴
This online resource contains a manually
curated collection of data and metadata standards,
data repositories, and data sharing
policies, as well as the relationships between
these entities. For example, it includes a
page for the ArrayExpress data repository,
with a link to the page for the MINSEQE
(Minimal Information about a high throughput
SEQuencing Experiment) data standard it
adopts, which links to the page for the Journal
of Clinical Investigation by which that standard
is endorsed.

4 > https://fairsharing.org/ (Accessed 10/22/2018).

For translational research purposes, there 
is also an increasing need to store informa- 
tion related to participant consent. As DNA 
Biobanks (described below) become more 
common, researchers will have greater access 
to tissue samples of participants whom they did
not themselves recruit. It will no longer suffice 
to have consent information stored on a paper 
form, locked away in a file drawer. Researchers 
and biobank administrators will need the 
ability to know to what each participant has 
consented, and to perform electronic que- 
ries to determine consent status on demand. 
May John Doe’s tissue be used for research 
beyond the study for which he was enrolled? 
May the blood collected as a byproduct of 
care be used for Genome-Wide Association 
Studies (GWAS)? May Jane Doe be contacted 
for enrollment in a follow-up study? In par- 
allel with work being done to address issues 
of ethics and governance for this type of data 
capture and management, researchers are 
working to develop tools and terminologies to 
facilitate research permissions management 
(Obeid et al. 2010; Grando and Schwab 2013). 
Researchers at the University of California 
San Diego created iCONCUR, a web-based 
informed consent tool to enable tiered prefer- 
ences for use of de-identified data (Kim et al. 
2017). 
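
The kind of electronic consent query described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the permission categories, class names, and participant identifiers are hypothetical and are not taken from iCONCUR or from any of the cited permissions-management systems, which use richer vocabularies and persistent storage.

```python
from dataclasses import dataclass, field
from enum import Enum


class Permission(Enum):
    """Hypothetical consent tiers; real consent vocabularies are richer."""
    PRIMARY_STUDY = "use in the study of enrollment"
    SECONDARY_RESEARCH = "use beyond the study of enrollment"
    GWAS = "use of samples in genome-wide association studies"
    RECONTACT = "contact about follow-up studies"


@dataclass
class ConsentRecord:
    participant_id: str
    granted: set = field(default_factory=set)


class ConsentRegistry:
    """In-memory stand-in for a research permissions database."""

    def __init__(self):
        self._records = {}

    def grant(self, participant_id, *permissions):
        record = self._records.setdefault(
            participant_id, ConsentRecord(participant_id))
        record.granted.update(permissions)

    def withdraw(self, participant_id, permission):
        record = self._records.get(participant_id)
        if record is not None:
            record.granted.discard(permission)

    def is_permitted(self, participant_id, permission):
        # Unknown participants are treated as not having consented.
        record = self._records.get(participant_id)
        return record is not None and permission in record.granted

    def eligible(self, permission):
        """All participants who have granted the given permission."""
        return sorted(pid for pid, rec in self._records.items()
                      if permission in rec.granted)


registry = ConsentRegistry()
registry.grant("john-doe", Permission.PRIMARY_STUDY)
registry.grant("jane-doe", Permission.PRIMARY_STUDY,
               Permission.GWAS, Permission.RECONTACT)

# May John Doe's tissue be used beyond the study for which he enrolled?
john_secondary = registry.is_permitted("john-doe", Permission.SECONDARY_RESEARCH)
# Whose samples may be used for GWAS?
gwas_cohort = registry.eligible(Permission.GWAS)
```

Treating an unknown participant as not having consented is a deliberately conservative default in this sketch; an actual biobank would set such policies through its governance process.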

At the bedside end of the translational 
spectrum, clinicians do not have the time, 
nor often the training, to analyze the under- 
lying data. They need easy access to what a 
patient’s genotype, protein biomarker pattern, 
or metabolite profile means, without having 
to wade through volumes of sequence and 
biomarker data. Even summary information 
about test results is not likely to be sufficient; 
rather, the clinician needs to be provided with
knowledge of what the test results mean for 
subsequent treatment decisions. They may 
also want to know some type of confidence 
or quality score for the data provided. HL7’s 
Clinical Genomics Workgroup is work- 
ing to develop an HL7 standard in this area 
based on HL7’s FHIR API (Alterovitz et al. 
2015). Incorporating omic data into the EHR 
(> Chap. 14) will not improve clinical care 
without the incorporation of these data types 
into clinical guidelines and tools for clinical 
decision support as discussed in > Chap. 24
(Hoffman 2007). 


26.3.2 Biomarkers 


Fundamentally, advancements in the ability
to analyze and interpret high-throughput
molecular datasets advance the discovery of
biomarkers. The term biomarker has been 
used for decades, referring to any observa- 
tion that could be used as an indication of an 
underlying physiological state. One commonly 
accepted definition is “a characteristic that is 
objectively measured and evaluated as an indi- 
cator of normal biological processes, patho- 
genic processes, or pharmacologic responses 
to a therapeutic intervention” (Atkinson 
et al. 2001). Exactly what constitutes a bio- 
marker has historically depended in part on 
what types of observations could be made. 
Early biomarkers would have included fever, 
increased respiratory rate, or a rash. As our 
ability to probe living organisms increased, the 
domain of biomarkers expanded to the pres- 
ence or concentration of specific molecules 
in the blood. For example, increased levels of 
glucose are indicative of diabetes. Omics-era 
methodologies give us new types of markers 
to which we can apply novel analytic methods 
to anticipate disease and monitor progression. 
In the genomic era, biomarkers may consist 
of not just one but many different character- 
istics, which together give insight into under- 
lying states or processes. Gene expression 
signatures are a common example of this type 
of multi-dimensional biomarker. 

One important distinction to be made 
is that of predictive versus mechanistic 
biomarkers. Predictive biomarkers are essentially
correlative markers of a given observation
or outcome. They may or may not be
causal for that outcome, but they can assist 
both clinicians and researchers by anticipat- 
ing outcomes or suggesting new focus areas 
for research. Mechanistic biomarkers, on the 
other hand, can help shed light on what is 
happening at the molecular level that causes, 
for example, pathology, disease progression, 
or sensitivity to a given drug. Understanding 
a mechanism allows researchers to try to 
modify it through the activation or inhibition 
of specific molecules or pathways. 


= Predictive Biomarkers for Clinical Use 

Predictive biomarkers can facilitate decision 
making in a number of ways. A biomarker 
indicating poor prognosis might suggest a 
more aggressive course of therapy than if that 
biomarker were not present. A signature indi- 
cating that lifestyle changes are likely to offer 
significant benefit to a patient could provide 
the motivation needed to follow through. For 
example, a signature indicating that weight 
loss is likely to improve insulin resistance 
could identify individuals for whom an intensive
lifestyle change is likely to have the most
impact. Shah et al. were able to identify a 
metabolomic profile in subjects who had lost 
weight that, while independent of the amount 
of weight lost, was correlated with changes 
in insulin resistance (Shah et al. 2009b). On 
the flip side, a signature indicating that life- 
style changes alone are unlikely to confer the 
desired benefits may suggest that pharma- 
ceutical intervention should be considered 
as well. Even if a biomarker is in no way 
actionable yet, it can be useful for biomedi- 
cal research. As an example, osteoarthritis is 
a debilitating disease that is treated primarily 
through palliative measures to alleviate symp- 
toms, but for which no disease-modifying 
therapeutic agents exist. One reason for this is 
the time and cost required to carry out a clini- 
cal trial. Without knowledge of which sub- 
jects are likely to progress, studies must enroll 
large numbers of participants in order to be 
adequately powered. Identifying biomarkers
to predict progression would enable cohort 
enrichment for individuals in whom disease
progression is more likely, thus cutting the
total number of subjects required and hence 
the cost of the trial (Kraus et al. 2011). 
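
The effect of cohort enrichment on trial size can be illustrated with a standard sample-size approximation for comparing two proportions. The sketch below is not taken from Kraus et al.; the progression rates and the assumed halving of progression by the drug are hypothetical numbers chosen only to show the direction and rough size of the effect.

```python
import math
from statistics import NormalDist


def subjects_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for comparing two proportions
    (two-sided test, normal approximation, equal allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    effect = p_control - p_treated
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical progression rates over the trial period; the drug is
# assumed to halve the rate in both settings.
n_unselected = subjects_per_arm(p_control=0.10, p_treated=0.05)
n_enriched = subjects_per_arm(p_control=0.40, p_treated=0.20)
```

Under these assumed rates, the required enrollment falls from 432 to 79 subjects per arm when the cohort is enriched for likely progressors.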


Biomarkers that are not clinically action- 
able may be personally actionable. For exam- 
ple, relapsing-remitting multiple sclerosis 
(RRMS) is a form of multiple sclerosis in 
which the patient experiences exacerbations 
or relapses of neurologic symptoms, followed 
by periods of partial or complete recovery. If 
a test could be developed to enable RRMS 
patients to know in advance if relapses were 
likely to occur within an upcoming span of 
weeks or months, it could enable them to 
make more informed personal or professional 
decisions, such as when to plan a vacation, or 
whether to take a new job (Gregory 2011). 

One major area for biomarker use is that 
of pharmacogenomics, described in Sect. 26.5 
below. In many cases, a therapeutic gold stan- 
dard exists, but only a fraction of patients 
respond to the given therapy. Knowing in 
advance who is likely not to respond to ther- 
apy, or who needs a higher or lower dose than 
the standard guidelines suggest, can be use- 
ful for tailoring therapeutic interventions. 
Interestingly, while the success of genetic 
biomarker discovery for common disease 
has been limited, genotypic biomarkers for 
response to drugs may be more promising 
because these variations would not have been 
selected against through evolution (Cirulli 
and Goldstein 2010). This may explain why, 
among published GWAS findings to date, the
pharmacogenetic associations tend to have 
much higher odds ratios than those of genes 
associated with common diseases. 


= Molecular Mechanism for Therapeutic
Targeting 

Biomarkers may also be used for elucidation 
of disease mechanism which can then enable 
therapeutic targeting toward a specific mole- 
cule or pathway. Comparative analysis of high 
dimensional molecular signatures in patients 
versus healthy volunteers, tumors versus nor- 
mal tissue, responders versus non-responders, 
etc., can reveal a set of molecules that are 
differentially expressed among these groups. 
One can then study those specific molecules 
more closely, or the pathways in which those
molecules are involved, for example through 
gene ontology (GO) enrichment (see Sect. 
26.6.2) or analysis using a curated pathway 
database such as Reactome (Fabregat et al. 
2018), Ingenuity’s IPA, or Thomson Reuters’
MetaCore (Nikolsky et al. 2005). These types 
of tools also help to address a major chal- 
lenge with pattern detection in high through- 
put data. Particularly in human data sets 
where differences are observational and not 
perturbation-based, it can be difficult, if not 
impossible, to know what is causal and what is 
simply correlated. Systems biology, described 
in > Chap. 9, attempts to address this. 


26.4 Biomarker Discovery 


One of the most common uses of biomarkers 
is to categorize samples or patients: cancerous 
samples versus normal tissues, good versus 
poor prognosis, bacterial versus viral infec- 
tion. There are a number of ways to approach 
this problem, all of which fall under the head- 
ing of supervised learning. Fundamentally, 
supervised learning entails taking a set of 
inputs and corresponding outputs to try to 
learn a model that will enable one to predict 
output when faced with a previously unseen 
input. One is trying to predict one value, the 
dependent variable, based on some number of 
other values, also called features (in computer 
science), independent variables (in statistics), 
or risk factors (in clinical practice). If the 
dependent variable is categorical, typically 
one is actually predicting the probability of 
belonging to one class or the other. For exam- 
ple, one might want to predict whether a per- 
son will have a heart attack based on age, race, 
gender, weight, and cholesterol level. Or, in 
the context of TBI, one might want to predict 
the likelihood of a heart attack based on gene 
expression. Note that this latter approach is 
useful only if the gene expression signature 
increases the predictive capabilities beyond 
that offered by the clinical variables, which 
are typically easier to collect. Algorithmic 
approaches to classification and prediction 
are described in > Chap. 9. 
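
As a minimal sketch of supervised learning with a categorical dependent variable, the following Python code fits a logistic regression by gradient descent and returns the predicted probability of class membership for a previously unseen input. The two features (a standardized clinical risk score and a standardized expression-signature score) and all data values are hypothetical, and plain gradient descent is used only for transparency; it is not the method of any study cited here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Predicted probability of belonging to the positive class."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def fit_logistic(X, y, learning_rate=0.5, epochs=5000):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for features, label in zip(X, y):
            error = predict_proba(weights, bias, features) - label
            for j in range(d):
                grad_w[j] += error * features[j]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Hypothetical training set: (clinical score, expression score) -> outcome.
X = [(-1.2, -0.8), (-0.9, -1.1), (-0.5, -0.3), (-0.3, -0.9), (0.1, -0.4),
     (-0.2, 0.6), (0.4, 0.2), (0.8, 1.0), (1.1, 0.7), (1.3, 1.2)]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

weights, bias = fit_logistic(X, y)
p_high = predict_proba(weights, bias, (1.5, 1.5))    # unseen high-risk input
p_low = predict_proba(weights, bias, (-1.5, -1.5))   # unseen low-risk input
```

In practice one would compare a model fitted on the clinical variables alone with one that adds the expression feature, using held-out data, to judge whether the molecular data add predictive value.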
26.4.1 Clinical Relevance Versus
Statistical Significance


Statistical significance (typically conveyed 
via p-values) quantifies whether a difference 
is reliably measurable via a test. With large 
datasets, most differences detected are sta- 
tistically significant in the sense that such a 
difference would not be due to just sampling 
variation. However, the presence of statisti- 
cal significance does not guarantee clinical 
relevance. Clinical relevance is a measure of 
how valuable information provided by the 
test is in guiding clinical care. It incorporates 
not only statistics, but also efficacy, safety, 
and cost. 
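
The dependence of statistical significance on sample size can be made concrete with a two-sample z-test. In this sketch the biomarker, its standard deviation, and the between-group difference are hypothetical, and the difference is assumed to be too small to matter clinically.

```python
import math


def two_sample_z_test(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference in means between two equal-sized
    groups with a common, known standard deviation."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical biomarker: the groups differ by 0.5 units (SD 15).
z_small, p_small = two_sample_z_test(0.5, 15.0, n_per_group=100)
z_large, p_large = two_sample_z_test(0.5, 15.0, n_per_group=1_000_000)
```

The effect size is identical in both calls. Only the sample size changes, yet the same 0.5-unit difference is far from significant with 100 subjects per group and overwhelmingly significant with one million.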

A test may be able to predict with 90% pre- 
cision whether, for example, a patient is likely 
to respond better to a treatment with unpleas- 
ant side effects over another, more innocuous 
therapy. However, if those side effects would 
significantly lower the patient’s quality of life, 
then the test, while statistically significant, 
may not be clinically relevant. Similarly, if 
the cost of a false negative is very high, for 
example if a test predicts with 90% precision 
that a patient will survive without a given 
intervention, that intervention will likely still 
be administered. On the other hand, if a test 
predicts with 90%, or even 100%, precision 
that a patient is likely to live 1 month longer 
with a given intervention but the intervention 
costs $1 million, this highly statistically sig- 
nificant test is still not likely to affect clinical 
care. Thus, incorporation of molecular data 
or improvement in an analytic method may 
make a test’s result statistically significant 
while still not affecting clinical practice. 

There are various ways to convey a test’s
“accuracy”. The most common metric, which
conveys the ability of a test to discriminate
between two classes, is the Area Under the
Receiver Operating Characteristic (AUROC)
curve (see > Chap. 3), also called the C statistic.
The ideal ROC curve goes straight up the y-axis
at x = 0, and then straight across, parallel to
the x-axis, at y = 1, giving an AUC of 1. The more
reliable a test, the closer it comes to that perfect
path. Figure 26.3 shows hypothetical ROC
curves for two tests. Test 2 is a more reliable
test in that it has a statistically significantly
higher C statistic, but as with the examples
above, that may not change any clinical decisions.
Much has been written about the limitations
of the AUROC, which is not a good
measure when the test needs to discriminate
between two outcomes where one is very rare
(Cook 2007). In such situations, the Area
Under the Precision-Recall Curve (AUPRC) may
be more meaningful. Instead of 1-specificity
on the x-axis and sensitivity on the y-axis,
the precision-recall curve plots recall (which is
the same as sensitivity) on the x-axis and
precision (positive predictive value) on the
y-axis. Because neither quantity depends on the
number of true negatives, the curve is not
skewed by a large majority of negatives when
true positives are few.

Fig. 26.3 A comparison between two Receiver Operating
Characteristic (ROC) curves. The area under the curve (AUC)
or C statistic is higher for Test 2 (gray) than for Test 1
(diagonal lines) to a statistically significant degree, but this
increased accuracy does not necessarily imply clinical relevance
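
The contrast between the two summary measures for a rare outcome can be seen in a small worked example. The sketch below computes the AUROC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, and estimates the area under the precision-recall curve by average precision, which is one common estimator among several. The cohort of 5 positives among 1000 subjects and all scores are hypothetical.

```python
def auroc(scores, labels):
    """Probability that a random positive outscores a random negative
    (ties count one half); equals the area under the ROC curve."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values at the rank of each true positive, a
    step-wise estimate of the area under the precision-recall curve.
    Tied scores are taken in input order."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical rare outcome: 5 cases among 1000 subjects. The test scores
# all 5 cases highly, but 20 non-cases score even higher.
labels = [1] * 5 + [0] * 20 + [0] * 975
scores = [0.90] * 5 + [0.95] * 20 + [0.10] * 975

roc_area = auroc(scores, labels)
pr_area = average_precision(scores, labels)
```

Here the AUROC is about 0.98 because the 975 easy negatives dominate the comparison, while the average precision is about 0.13 because the 20 false alarms outrank every true case.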

It has been proposed that a better measure
than area under a curve is needed for judging
the incremental value of novel biomarkers and
analytical approaches (Pencina et al. 2008).
Alternative methods include net reclassification
improvement (NRI), a measure of the net
fraction of reclassifications made in the correct
direction using the given biomarker or
method over a method without the designated
improvement (Steyerberg et al. 2011). This
concept is illustrated in Table 26.1. Rows
represent the risk level predicted by the hypothetical
Test 1 for 1000 subjects; columns represent
the risk level predicted by Test 2. Subjects
along the diagonal were predicted to have the
same risk by both tests. Subjects in the black
cells (30 + 40 = 70), those moved by Test 2 from
the 5-20% category into the 0-5% or >20% category,
were correctly reclassified (i.e., the actual
rate in parentheses matches the appropriate
risk category). Subjects in the light gray cells
(10 + 20 = 30), those moved by Test 2 into the
5-20% category, were reclassified incorrectly.
The resulting net reclassification improvement
is (70-30)/1000, or 4%.

Table 26.1 Hypothetical reclassification of disease risk between
two prognostic tests. Cells give the number of individuals
(actual rate)

Predicted 5-year      Predicted 5-year risk for test 2
risk for test 1       0-5%         5-20%        >20%
0-5%                  300 (3%)     20 (2%)      0
5-20%                 30 (3%)      300 (11%)    40 (37%)
>20%                  0            10 (35%)     300 (42%)
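
The arithmetic behind the 4% figure can be reproduced directly from the cell counts and actual rates of Table 26.1. The sketch follows the simplified calculation used in the text, in which a reclassification counts as correct when the observed rate falls inside the category assigned by Test 2. Published formulations of the NRI typically tally upward and downward moves separately for subjects with and without events; that refinement is not shown here. The function and variable names are illustrative.

```python
# Risk categories shared by both tests, as (lower, upper) bounds; the
# upper bound is treated as exclusive.
CATEGORIES = [(0.00, 0.05), (0.05, 0.20), (0.20, 1.00)]

# counts[i][j]: subjects placed in category i by Test 1 and j by Test 2.
counts = [[300, 20, 0],
          [30, 300, 40],
          [0, 10, 300]]

# observed[i][j]: actual event rate in that cell (None for empty cells).
observed = [[0.03, 0.02, None],
            [0.03, 0.11, 0.37],
            [None, 0.35, 0.42]]


def net_reclassification(counts, observed, categories):
    correct = incorrect = 0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if i == j or n == 0:
                continue  # not reclassified, or empty cell
            low, high = categories[j]
            if low <= observed[i][j] < high:
                correct += n
            else:
                incorrect += n
    total = sum(sum(row) for row in counts)
    return correct, incorrect, (correct - incorrect) / total


correct, incorrect, nri = net_reclassification(counts, observed, CATEGORIES)
```

Running this on the table yields 70 correct and 30 incorrect reclassifications, for a net improvement of 0.04.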

One final characteristic of a good test is its 
calibration, the extent to which a test correctly 
measures absolute risk. That is, do the risk 
values predicted by the test reflect the actual 
risk observed in the population? Calibration
may differ across the population at different 
levels of predicted risk, which may in turn 
affect the test’s utility. 
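
One simple way to inspect calibration is to group subjects by predicted risk and compare the mean predicted risk in each group with the observed event rate. The following sketch assumes equal-width bins and uses a small hypothetical cohort; other binning schemes (for example, deciles of predicted risk) are equally common.

```python
def calibration_table(predicted, outcomes, n_bins=5):
    """Group subjects into equal-width bins of predicted risk and return
    (mean predicted risk, observed event rate, n) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(predicted, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[index].append((p, y))
    table = []
    for members in bins:
        if members:
            n = len(members)
            mean_pred = sum(p for p, _ in members) / n
            event_rate = sum(y for _, y in members) / n
            table.append((mean_pred, event_rate, n))
    return table


# Hypothetical cohort: low-risk predictions match what is observed, while
# predictions of 50% risk understate an observed rate of 80%.
predicted = [0.10] * 10 + [0.50] * 10
outcomes = [1] + [0] * 9 + [1] * 8 + [0] * 2
table = calibration_table(predicted, outcomes)
```

In this example the test is well calibrated at low predicted risk but poorly calibrated at higher predicted risk, which is the kind of risk-dependent miscalibration described above.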


26.4.2 Biomarkers for Drug 
Repurposing 


One very promising area for use of biomark- 
ers is in drug repurposing, or drug reposition- 
ing. That is, identifying existing drugs that 
may be useful for indications other than those 
for which they were initially approved. Doing 
so avoids early clinical trials for toxicity as 
those have already been performed. A num- 
ber of different approaches have been used to 
identify candidates for repositioning. In some 
cases, overlapping symptoms may suggest a 
potential match between one disease area and 
another. In other cases, empirical observation 
of unexpected positive effects may suggest 
alternative uses for a given drug. With omic- 
scale biomarker discovery, it is possible to use 
underlying molecular pathway signatures to 
suggest new uses for existing drugs. 

One of the prominent early examples of 
this approach came from the Broad Institute 
in the form of the “Connectivity Map,” a 
resource intended to enable researchers to 
identify functional connections between 
drugs, genes, and diseases (Lamb 2007). The 
general idea was to identify a gene expression 
signature in a state of interest, e.g. a disease, 
and then compare that signature to the gene 
expression patterns observed upon expo- 
sure to a number of different compounds. 
Correlated signatures suggested pathways that 
were similarly perturbed between a disease 
state and an intervention. More importantly, 
anti-correlated signatures suggested poten- 
tial utility for a given compound in trying to 
reverse the underlying molecular mechanisms 
of a given disease. A similar approach was
used by Sirota et al. to identify the anti-ulcer
drug cimetidine as a candidate agent to treat
lung adenocarcinoma. They were then able to
validate this alternate use in vivo using an animal
model of the disease (Sirota et al. 2011).
Their approach is illustrated in Fig. 26.4.

Fig. 26.4 A computational approach to candidate
selection for drug repurposing. Sirota et al. first generated
genomic signatures representing both diseases and
drug exposure. They compared each disease signature
to the panel of drug signatures and assigned a
drug-disease score based on profile similarity. Drugs
whose patterns were most significantly dissimilar to the
disease state were ranked as lead candidates to treat the
disease of interest (figure labels: drugs; disease gene profile;
drug effects opposite to disease profile; drug effects similar
to disease profile)
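
The idea of ranking drugs by how strongly their expression signatures oppose a disease signature can be sketched as follows. Pearson correlation is used here purely for simplicity and should not be read as the scoring method of the Connectivity Map or of Sirota et al., whose published approaches rely on rank-based similarity scores. The gene-level values and drug names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of values."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def rank_candidates(disease_signature, drug_signatures):
    """Sort drugs from most anti-correlated to most correlated."""
    scored = [(name, pearson(disease_signature, signature))
              for name, signature in drug_signatures.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical log fold changes for the same five genes in each profile.
disease = [2.0, 1.5, -1.0, -2.0, 0.5]
drugs = {
    "drug_A": [-1.8, -1.2, 0.9, 2.1, -0.4],   # roughly opposite to disease
    "drug_B": [1.9, 1.4, -1.1, -1.8, 0.6],    # roughly mimics disease
    "drug_C": [0.1, -0.2, 0.3, 0.0, -0.1],    # little effect on these genes
}

ranking = rank_candidates(disease, drugs)
```

The drug at the top of the ranking (here the hypothetical drug_A) is the one whose effects most nearly reverse the disease profile and would be the lead candidate for experimental follow-up.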


26.4.3 Genomic Data Resources 


Fundamental to advancement in biomarker 
discovery are high-throughput genomic 
measurements that have been enabled since 
the human genome draft was published. 
Fortunately, the genomic community has
been moving toward a culture of data sharing
(NIH broadens genomic data-sharing
policy 2014), making experimental data available
via public data repositories (Kaye
et al. 2009, 2014). Resources for genomic
data include:


= Genetic variation 
According to the Policy for Sharing of Data 
Obtained in NIH Supported or Conducted 
Genome-Wide Association Studies (GWAS),⁵
genotypic data must be deposited to the 
NIH database of Genotypes and Phenotypes 
(dbGaP). Genomic variation data is also 
available through a number of other online 
resources (see > Sect. 9.3; Sherry et al.
2001; WTCCC 2007; Altshuler et al. 2010).
The HapMap projects catalog variation over a 
wide variety of ethnic populations, in order to 
define the occurrence and frequency of com- 
mon genetic variations (Rusk 2010). The 1000 
Genomes project is taking HapMap further to 
categorize the occurrence of more rare varia- 
tions (changes in single DNA bases, as well as 
insertions/deletions, segmental duplications, 
and larger scale inversions and translocations) 
(Via et al. 2010). There are also resources 
about copy number variations.⁶

dbSNP, the database of Single Nucleotide
Polymorphisms, is a publicly available catalog
of genome variation (Sherry et al. 2001).
Contents primarily represent single nucleotide 
substitutions, but also include a small num- 
ber of other types of variation, for example 
microsatellite repeats and small insertions 
and deletions (Homerova et al. 2002). The 
PharmGKB resource specifically annotates 
genetic variations relevant to drug response 
(Altman 2007). The increase in exome and 
genome sequencing has led to powerful 
resources for assessing human genome varia- 
tion across diverse populations. Key resources 
include the Exome Aggregation Consortium 
(Lek et al. 2016), the Exome Sequencing 
Project (ESP) (Auer et al. 2016) and others.
There are also pharmacogenomics-specific
studies of key pharmacogenes, such
as the Pharmacogenetic Research Network
Sequence Project (PGRNSeq) (Bush et al.
2016).


5 > http://grants.nih.gov/grants/guide/notice-files/NOT-OD-07-088.html
(Accessed 12/6/2012).

6 > http://humanparalogy.gs.washington.edu/structuralvariation/general/intro.html
(Accessed 12/3/2012).


= Gene expression information 

The Gene Expression Omnibus (GEO) 
(Barrett et al. 2013) contains an extremely 
large and diverse collection of high through- 
put gene expression experiments which 
allow one to evaluate whether a disease (or 
drug exposure) leads to up- or down-regula- 
tion of gene expression (Edgar et al. 2002). 
A particularly useful example of gene expression
data for drug response is the Connectivity
Map data set, in which gene expression in
response to 164 drugs was measured (Lamb 
et al. 2006). Similarly, the NCI 60 is a set of 
60 cancer cell lines that have been exposed to 
hundreds of drugs in order to determine their 
sensitivity (Ross et al. 2000). Other efforts 
have looked at genetic variations that corre- 
late with gene expression in order to associ- 
ate these genomic regions with the function 
of the correlated genes (Gamazon et al. 2010; 
Nicolae et al. 2010). ArrayExpress, developed 
by EMBL-EBI (European Molecular Biology 
Laboratory-European Bioinformatics
Institute), is a European analog of GEO, con- 
taining microarray and sequencing data for 
functional genomics (Kolesnikov et al. 2015). 


= Gene associations 

The Genetic Association Database (Becker 
et al. 2004) provides curated information 
about the results of genetic association stud- 
ies, including those studies that relate genetic 
variation to variation in drug response. 
The Human Genome Mutation Database 
(HGMD) also provides this information in 
a highly curated form (Stenson et al. 2009, 
2017). dbGaP, the database of Genotypes and
Phenotypes, is a resource to archive and distribute
information about the interaction between
genotype and phenotype (Mailman et al.
2007). The PharmGKB resource is devoted
entirely to providing information about asso- 
ciations between human genetic variation and 
drug response phenotypes (Altman 2007). 
The GWAS Catalog (MacArthur et al. 2017)
is a very useful database of genome-wide association
study (GWAS) hits. ClinVar aggregates
information about genomic variants and their relationship
to human phenotypes (Landrum et al. 2016).
Finally, ClinGen (Clinical Genome Resource)
represents a manually curated collection of
genetic variants and their clinical relevance
(Rehm et al. 2015).


= Genetic pathways 

Understanding drug action requires under- 
standing the pathways and networks of drug 
action and drug metabolism. The PharmGKB 
provides curated drug pathways for both 
drug action and drug metabolism, and has 
links to relevant external pathways created 
by the National Cancer Institute Pathway 
Interaction Database (Schaefer et al. 2009), 
Reactome (Joshi-Tope et al. 2005), and others. 


26.5 Pharmacogenomics 


Pharmacogenomics, a prominent subtype 
of biomarker discovery, is the study of how 
genes and genetic variation influence drug 
response. The primary challenges for pharma- 
cogenomics are to (1) identify the key genes 
that influence drug response, and (2) under- 
stand how specific variations in these genes 
modulate drug response. The term pharma- 
cogenetics generally refers to drug-gene rela- 
tionships that are dominated by a single gene, 
whereas the more general term pharmacogenomics refers to drug
responses that result from a combination of 
interacting gene products. In this section, we 
use the word “gene” loosely to refer not only 
to the DNA coding regions for proteins and 
RNAs but also the protein and RNA prod- 
ucts themselves. In many cases, the gene-drug 
relationship is really a relationship between 
the drug and the gene’s protein product (or 
even its non-coding RNA product). 
Pharmacogenomics is a prototypical 
TBI activity because it involves clinical enti- 
ties such as drugs, diseases, and symptoms 
as well as molecular entities such as genes, 
proteins, DNA, RNA, and small molecules. 
Because drug response is the key phenotype 
of interest, it is useful to review the basis for 
drug response. When a drug is administered, 
there are two distinct genetic “programs” that 
are relevant. The first is the pharmacokinetic
program or PK, which describes the absorption,
distribution, metabolism and excretion
of the drug in the body. Genes implement 
this program (they encode transporter mol- 
ecules that move the drug across membranes 
and the liver enzymes that transform the drug 
and prepare it for elimination via the kidney 
or liver) and variation in these genes can lead 
to a different blood level of drugs or a differ- 
ent timing of these levels. The second is the 
pharmacodynamic program or PD, which
describes how the drug works, its protein tar- 
get, and the mechanism by which it impacts 
cellular physiology in order to alleviate or 
cure disease. Genes are clearly also involved 
in this program (they encode the drug’s pri- 
mary targets, and the other proteins that 
interact with these targets to create the cel- 
lular response to the drug), and variation in 
these genes can lead to a different response to 
the drug. In short, PK is “what the body does 
to the drug”, and PD is “what the drug does 
to the body.” The goal of pharmacogenomics 
is to understand, for every drug, the key PK 
and PD genes, and which variations in them affect
drug response. This will allow us to realize the
vision of using the genome to choose drugs 
based on maximizing their likely efficacy and 
minimizing their likely toxicity. 


26.5.1 Key Entities and Associated
Data Resources


The key computational entities in pharma- 
cogenomics are genes, drugs, and drug-related 
phenotypes (indications and effects, includ- 
ing side effects). There exist good informatics 
resources for all of these: 


= Genes 

These are typically specified using the HUGO
Gene Nomenclature Committee (HGNC)
standard (Seal et al. 2011). They are typically 
situated within the genome as a series of exons 
that are spliced together to create a mature 
mRNA transcript that is then translated into 
a protein. This basic concept is made more 
complex because the strategy for splicing the 
exons may be variable (alternative splicing)
thereby leading to several proteins, the RNA
transcript may be degraded before it is trans- 
lated, and the proteins may be modified after 
they are created. There are many resources
on the web for gene information, and many
aggregators of this information. These create
a remarkably powerful network of associations
that can be used creatively to make new
associations.

Fig. 26.5 PharmGKB gene pages are organized by
tabs and the “Downloads/LinkOuts” tab shown here
has links to many other sites with valuable information
about human genes. (Copyright PharmGKB, used with
permission from PharmGKB and Stanford University)
[Figure: PharmGKB page for VKORC1 with link-outs to Ensembl, Gene Ontology, GeneCards, Genetic Testing Registry, RefSeq, and UniProtKB]

For example, Fig. 26.5 shows the links on
PharmGKB (Pharmacogenomics Knowledge Base)
for the gene VKORC1 and includes links to:

= Entrez Gene: summarizes the sequence, 
variations, homologs across species 

= OMIM: provides information about rare 
genetic diseases involving this gene 

= UniProt: provides mapping information to 
relate this gene to its protein products 

= GeneCards: provides aggregated informa- 
tion about function, tissue localization, 
expression levels, literature references, and 
more 


= Drugs and small molecules 

RxNorm is a terminology standard for speci- 
fying drugs (Parrish et al. 2006). DrugBank 
provides information about drugs, their tar- 
gets, pharmacology, uses, and many other 
characteristics (Knox et al. 2011). There 
are only around 2000 approved drugs in the 
United States, and so this list is relatively short. 
The list of small molecules that are not drugs 
is much larger and includes the contents of 
PubChem (Wang et al. 2009, 2010), an NIH- 
built resource with basic information about 
the structure, function and literature on small 
molecules. The Zinc Database (Irwin and 
Shoichet 2005) lists 13 million commercially 
available compounds that can be purchased 
for use in research. Much drug information 
is contained within the “package insert” that 
is included in most drug packaging. This is 
information created by the drug company, but 
approved by the FDA. The FDA makes these 
available on a drug information site called 
DailyMed. For patients, the National Library 
of Medicine’s MedlinePlus resource provides 
basic drug information as well. 


= Drug indications and drug effects 

Drugs are used to treat particular diseases, 
and so controlled terminologies of drug indi- 
cations and drug effects are useful for com- 
putational efforts. At the organism level, 
indications and effects are often diseases (dia- 
betes is an indication) or side effects (hyper- 
glycemia is a side effect). The UMLS and 
MeSH terminologies are often used to characterize
such disease phenotypes (Bodenreider
2004). Of course, other disease terminologies
such as SNOMED are also useful (Spackman 
et al. 1997). For side effects, there are special- 
ized terminologies, including the MedDRA 
terminology used by the FDA in adverse 
event reporting (MedDRA replaced a pre- 
vious terminology called COSTART), and 
the WHOART (World Health Organization 
Adverse Reactions Terminology) diction- 
ary for adverse reactions (Brown et al. 1999; 
Alecu et al. 2006). The SIDER database 
(Kuhn et al. 2010) provides information 
mined from drug package inserts about drug 
indications and side effects. The Anatomic 
Therapeutic Chemical classification (ATC) 
from the World Health Organization provides 
a high level classification of drugs organized 
hierarchically by the anatomical location of 
target, the therapeutic category, the pharma- 
cological subgroup, chemical subgroup and 
precise chemical substance (Miller and Britt 
1995). 


= Pharmacological properties of drugs

There are resources on the web that pro- 
vide molecular level assay data related to 
small molecules, including many drugs. The 
ChEMBLdb resource provides the ability 
to find targets, binding affinities, inhibition 
concentrations and information about other 
drug-oriented assays (Overington 2009). 
BindingDB also provides binding affinities for 
small molecules and proteins (Liu et al. 2007). 
In addition, RxNorm includes an RxClass 
API for accessing drug classes and members 
of a given class (Bahr et al. 2017). 


= Population-based data on drug effects 

The FDA maintains information about all 
reports of adverse events in the FDA Adverse 
Event Reporting System (FDA AERS). These 
reports include demographic information, 
indications for treatment, drugs administered, 
side effects experienced and a summary of 
clinical outcomes. They are freely available at 
the FDA website. The equivalent Canadian
system, which also makes its data freely
available, can be accessed through the Health
Canada website. These data are very noisy and
have many confounding variables, but nonetheless
can be useful for discovering “signals” suggesting
dangerous side effects or drug-drug interactions
(Tatonetti et al. 2011b).


26.5.2 TBI Applications 
in Pharmacogenomics 


The network of data described above is a rich 
potential source of hypotheses about how 
genes combine to create drug response, as well 
as for predicting the particular consequences 
of genetic variation. This is still a new field, 
and there remain many opportunities for 
innovative use of these data. We highlight a 
few here to illustrate how integration of data 
can lead to novel discoveries. 


= GWAS to Discover Drug Response Genes

The most straightforward way to associ- 
ate genes with drug response is to perform 
a genome-wide association study (GWAS) 
in which two groups are compared. (See 
> Chap. 28 for more on GWAS.) One group 
(cases) has a drug response of interest (e.g., 
an adverse event in response to the drug or 
a particularly good response to it) and the 
other group (controls) does not have the drug 
response of interest. It is critical to ensure 
that the phenotype or response is carefully 
defined and measured. From each group,
DNA is collected and typically 500,000 or
1,000,000 SNPs (single nucleotide polymorphisms)
are measured using microarray technology.
Then, for each SNP, an association is
measured between the genotypes in cases and 
controls and the response of interest using a 
simple statistical test such as the chi-squared 
test. The SNPs that are most highly associated 
may represent regions of the genome that are 
involved in the response. These must be care- 
fully vetted statistically, as there are many 
potential confounding variables. For example, 
it is important that the cases and controls are 
drawn from populations with similar ethnic 
origin, and that the significance remains after 
correcting for multiple testing. When one tests
500,000 or 1,000,000 hypotheses, adjustments
such as Bonferroni correction (see Chap. 24)
must be made in order to take into account
the chance that an association is spurious. If
the result is real, then the SNP may be used to 
identify nearby genes in the region that may 
be important for the drug response. For exam- 
ple, Shuldiner and colleagues were interested 
in the ability of the drug clopidogrel to pro- 
tect patients from cardiovascular events. They 
found that a polymorphism, rs12777823,
was associated with a high likelihood of hav- 
ing a cardiovascular event. They noted that 
this SNP was very close to the metaboliz- 
ing enzyme CYP2C19, and in particular the 
“risk” allele for this SNP co-occurred with the 
CYP2C19*2 variant. Thus, they showed that 
CYP2C19 is important for the desired effect 
of clopidogrel, and found a variation of this 
gene that predicted poor response to the drug 
in affected patients (Shuldiner et al. 2009). 
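
The per-SNP test and the multiple-testing adjustment described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis used in the cited study: the SNP names and allele counts are hypothetical, the test is a plain allelic chi-squared test on a 2 x 2 table of allele counts, and the p-value uses the closed form that holds for one degree of freedom. Population structure and other confounders are not handled here.

```python
import math


def allelic_chi_squared(case_counts, control_counts):
    """Pearson chi-squared statistic for a 2x2 table of allele counts.

    Each argument is a (risk_allele_count, other_allele_count) tuple.
    """
    table = [list(case_counts), list(control_counts)]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


def chi2_p_value_1df(chi2):
    """Upper-tail p-value for a chi-squared statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Hypothetical allele counts: (cases, controls), each (risk allele, other allele).
snps = {
    "snp_A": ((320, 680), (200, 800)),  # strong association
    "snp_B": ((255, 745), (250, 750)),  # no association
    "snp_C": ((300, 700), (200, 800)),  # nominally significant only
}

n_tests = 500_000  # number of SNPs tested genome-wide
threshold = bonferroni_threshold(0.05, n_tests)

results = {}
for name, (cases, controls) in snps.items():
    p = chi2_p_value_1df(allelic_chi_squared(cases, controls))
    results[name] = (p, p < threshold)  # (p-value, passes Bonferroni?)

for name, (p, significant) in results.items():
    print(f"{name}: p = {p:.3g}, genome-wide significant = {significant}")
```

With 500,000 tests the Bonferroni threshold is 0.05 / 500,000 = 1e-7. In this toy data, snp_C would look convincing in a single-hypothesis setting yet does not pass the corrected threshold, which is exactly the situation the correction is meant to guard against.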


= Mining the FDA AERS to Find Drug-Drug 
Interactions 

The FDA Adverse Events database associ- 
ates multiple drugs with multiple diseases as 
indications as well as side effects. This data- 
base shows promise as a way to find new 
associations between single drugs and their 
side effects, as well as multiple drugs and 
their side effects. As mentioned above, the 
SIDER database is a “top down” database of 
side effects derived from the package label of 
drugs. Another approach to getting good lists
of side effects is a more data-driven approach.
One way to do this is to look for patterns of 
side effects associated with certain types of 
drugs using machine learning. For example, 
one may analyze the side effects of drugs that 
alter glucose in order to create a signature of 
the “typical” profile of side effects associated 
with a glucose-altering drug. Then, one can 
search a database of side effects (such as FDA 
AERS) for other drugs that match this profile. 
This was done by Tatonetti et al., who cre- 
ated a profile for glucose-altering drugs and 
found a set of 10 side effects either enriched 
or deficient (compared to background) in 
these drugs: hyperglycemia, diarrhea, hypo- 
glycemia, and pain were higher than others, 
and paresthesia, nausea, pyrexia, abdominal 
pain, and anorexia were less likely than others 
(Tatonetti et al. 2011a). Using this pattern,
more than 93% of drugs that are known to
alter glucose could be recovered. More inter- 
estingly, however, this pattern could be applied 
to patients on pairs of drugs to search for 
pairs that altered glucose. A highly correlated 
combination was the antidepressant parox- 
etine and the cholesterol medication pravas- 
tatin (Tatonetti et al. 2011a). In subsequent 
validation in three independent EHR systems, 
large increases in glucose were observed in 
patients on these two drugs, and in mouse 
studies of these two drugs, glucose was substantially
increased. Thus, adverse event reports
could be used to build profiles and
detect new signals that are not specifically reported
in the database but are implied by the pattern of
other side effects observed.
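
The idea of matching drugs against a side-effect profile can be sketched as follows. This is a deliberately simplified stand-in for the machine learning approach described above: it uses a plain additive score rather than a trained model, the reporting frequencies and drug names are hypothetical, and only the side effects named in the text are used. A drug scores highly when the effects that are enriched in glucose-altering drugs are reported more often than background and the depleted effects are reported less often.

```python
# Side effects named in the text as enriched or depleted among glucose-altering drugs.
ENRICHED = {"hyperglycemia", "diarrhea", "hypoglycemia", "pain"}
DEPLETED = {"paresthesia", "nausea", "pyrexia", "abdominal pain", "anorexia"}

# Hypothetical background reporting frequency for every profile effect.
BACKGROUND = {effect: 0.05 for effect in ENRICHED | DEPLETED}


def profile_score(report_freq, background=BACKGROUND):
    """Additive match score between a drug's reports and the glucose profile.

    report_freq maps side effect -> fraction of the drug's reports mentioning it.
    Effects missing from report_freq are treated as occurring at background rate,
    so they contribute nothing to the score.
    """
    score = 0.0
    for effect in ENRICHED:
        score += report_freq.get(effect, background[effect]) - background[effect]
    for effect in DEPLETED:
        score -= report_freq.get(effect, background[effect]) - background[effect]
    return score


# Hypothetical drugs (or drug pairs) with made-up reporting frequencies.
drug_x = {"hyperglycemia": 0.20, "diarrhea": 0.10, "nausea": 0.01}
drug_y = {"nausea": 0.30, "pyrexia": 0.10}

score_x = profile_score(drug_x)
score_y = profile_score(drug_y)
print(f"drug_x score: {score_x:.2f}")
print(f"drug_y score: {score_y:.2f}")
```

The same scoring function can be applied to reports from patients taking a pair of drugs, which is how a profile built from single drugs can surface candidate drug-drug interactions. Any candidate found this way is only a hypothesis and would still need the kind of EHR and laboratory validation described above.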


= Mining the Literature to Build a Database 
of Gene-Drug Associations 

Biomedical text can also be an important 
source of high quality information about the 
relationships among genes, drugs, and diseases 
(Garten et al. 2010). High fidelity natural lan- 
guage processing techniques (> Chap. 7) can 
be used to extract information about gene- 
drug interactions. In some cases, the associa- 
tion between genes and drugs can be inferred 
simply by their co-occurrence in sentences 
(Garten and Altman 2009). In these cases, 
however, there can be many false positives 
due to sentences in which genes and drugs 
are mentioned, but are not actually interact- 
ing. A more precise method is based on care- 
ful parsing of sentences to find subjects and 
objects that are genes and drugs, and which 
are related by verbs that connect them (e.g. 
“CYP2D metabolizes codeine” has CYP2D 
as the subject, codeine as the object, and the 
verb “metabolizes” establishes their relation- 
ship) (Coulet et al. 2010). The rate of false 
positives is reduced in this case because more 
strict criteria are applied before claiming a 
relationship. These high quality interactions 
can be chained together to infer new knowl- 
edge. For example, drug-drug interactions 
often occur because two drugs share a com- 
mon metabolizing gene and that gene becomes 
saturated in the presence of both drugs, and 
cannot adequately metabolize both of them. 
Thus, the observations that “CYP2D metabolizes
codeine” and “CYP2D metabolizes
metoprolol” might be combined to infer that 
codeine and metoprolol have a potential drug- 
drug interaction. There are a large number of 
similar inferences that could be drawn about 
the relationships between genes, drugs and 
diseases given a high quality database of pair- 
wise interactions drawn from the published 
literature. Of course, some pairwise interac- 
tions may be incorrect, and so evidence for 
interactions should be combined from sev- 
eral sources (including EMR validation, for 
example) and once predictions are made, they 
should be embraced only with skepticism. 
A comprehensive text-based search for rela- 
tionships between genes, drugs and diseases 
over all PubMed abstracts yielded a publicly
available database of more than 2 million 
high quality associations (Percha and Altman 
2015, 2018). 
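
The chaining step described above, in which two extracted gene-drug facts are combined into a candidate drug-drug interaction, can be sketched as follows. This is a minimal illustration under stated assumptions: the (gene, verb, drug) triple format is hypothetical rather than the schema of any cited system, the two CYP2D facts are the ones used in the text, and the third triple is a made-up distractor.

```python
from collections import defaultdict
from itertools import combinations


def infer_shared_enzyme_pairs(triples):
    """Propose drug pairs that share a metabolizing gene.

    triples is an iterable of (gene, verb, drug) relations, as might be
    produced by parsing sentences. Returns a dict mapping each candidate
    drug pair to the set of genes that metabolize both drugs.
    """
    substrates = defaultdict(set)
    for gene, verb, drug in triples:
        if verb == "metabolizes":  # ignore other relation types
            substrates[gene].add(drug)

    candidates = {}
    for gene, drugs in substrates.items():
        for pair in combinations(sorted(drugs), 2):
            candidates.setdefault(pair, set()).add(gene)
    return candidates


triples = [
    ("CYP2D", "metabolizes", "codeine"),     # example from the text
    ("CYP2D", "metabolizes", "metoprolol"),  # example from the text
    ("GENE_X", "inhibits", "drug_z"),        # hypothetical, not a metabolism fact
]

candidates = infer_shared_enzyme_pairs(triples)
for (drug_1, drug_2), genes in candidates.items():
    print(f"{drug_1} + {drug_2}: shared metabolizing gene(s) {sorted(genes)}")
```

Each output is a hypothesis only. Because any single extracted triple may be wrong, a real pipeline would attach evidence counts or sources to each relation before chaining them.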


= Using Drug-Target Interactions to Predict 
New Ones 

Another way to find new uses for old drugs is 
to predict interactions between drugs and new 
potential targets. Many drugs are designed 
to interact with a single target based on a 
detailed understanding of disease pathology. 
Once the drugs are administered, however, 
they may not bind only the original target, 
but they may unexpectedly have effects based 
on their binding to other targets. Most com- 
monly, these “off target” effects are consid- 
ered side effects and are avoided. In some 
cases, however, the “off target” effect may be 
beneficial in the setting of some other disease. 
Thus, both for explaining the molecular basis 
of side effects and for finding new molecu- 
lar evidence for beneficial novel effects, it 
is useful to connect drugs to proteins. One 
way to do this is to build computational and 
visualization methods for docking a 3D rep- 
resentation of a small molecule into the 3D 
structure of a target protein. This can be very 
successful, and has led to the hypothesis that 
a Parkinson’s disease drug may treat tubercu- 
losis (Kinnings et al. 2009)! In that case, the 
3D structure of a tuberculosis protein had 
a pocket that appeared to have high bind- 
ing potential to a known Parkinson’s disease 
drug, and thus the hypothesis arose that the
Parkinson’s drug might inhibit TB growth.
These structure-based methods are powerful 
but limited because we have the 3D structure 
of only a subset of human proteins. Another 
approach, therefore, is based on looking for 
similarities in the list of drugs that have been 
shown experimentally to bind a protein. In this 
case, all that is needed are data from chemical 
assays showing which drugs bind which pro- 
teins. These are routinely collected in large 
screening experiments, and are available at the 
ChEMBL resource (Heikamp and Bajorath 
2011), for example. Given two proteins with 
two lists of interacting drugs, we can compare 
the list of drugs to look for commonalities. If 
there are many commonalities between pro- 
tein A and protein B, then one might conclude 
that the drugs that bind protein A may also 
bind protein B. This was the approach taken 
in the Similarity Ensemble Approach (SEA),
in which the lists of drugs binding two proteins
are compared using a measure of chemi- 
cal similarity (Keiser et al. 2009). When the 
chemicals on the two lists are statistically sim- 
ilar (more than would be expected by chance), 
then the SEA method predicts cross-binding 
of ligands for the two structures. When this 
was applied to a large set of proteins, the 
authors found that the antidepressant fluox- 
etine (Prozac) had high potential binding to 
the beta-adrenergic receptor, and this was 
found experimentally to block the beta-1 
receptor—demonstrating that Prozac is a type 
of beta-blocker! 
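
The comparison of two ligand lists by chemical similarity can be sketched as follows. This is not the published SEA implementation: it assumes each ligand is represented by a hypothetical set of fingerprint feature identifiers, uses the Tanimoto coefficient as the similarity measure, and sums the pairwise similarities above an arbitrary cutoff. The step that makes such a comparison statistical, namely comparing the raw score with what random ligand sets would produce, is omitted here.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient between two sets of fingerprint features."""
    union = len(features_a | features_b)
    return len(features_a & features_b) / union if union else 0.0


def raw_set_similarity(ligands_a, ligands_b, cutoff=0.5):
    """Sum of pairwise Tanimoto similarities at or above the cutoff.

    ligands_a and ligands_b are lists of feature sets, one per ligand
    known to bind protein A and protein B respectively.
    """
    score = 0.0
    for features_a in ligands_a:
        for features_b in ligands_b:
            similarity = tanimoto(features_a, features_b)
            if similarity >= cutoff:
                score += similarity
    return score


# Hypothetical fingerprints: integers stand in for substructure features.
protein_a_ligands = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 4, 5})]
protein_b_ligands = [frozenset({1, 2, 3, 5}), frozenset({10, 11, 12})]
protein_c_ligands = [frozenset({20, 21}), frozenset({22, 23})]

score_ab = raw_set_similarity(protein_a_ligands, protein_b_ligands)
score_ac = raw_set_similarity(protein_a_ligands, protein_c_ligands)
print(f"A vs B raw score: {score_ab:.2f}")
print(f"A vs C raw score: {score_ac:.2f}")
```

A high raw score between the ligand sets of proteins A and B suggests that ligands of one may bind the other. Because the raw score grows with the size of the ligand sets, it would have to be calibrated against a background distribution before being interpreted as significant.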


= Identifying Drug Targets Using Side-Effect 
Similarity 
A critical goal in pharmacogenomics is to 
associate drugs with their target proteins (and 
thus their coding genes) in order to know 
where to look for variation that may affect 
drug response. Determining drug targets can 
involve a difficult and lengthy experimental 
program. Thus, it would be very useful to 
have computational methods for determin- 
ing targets. One way to do this is to associate 
drugs to their side effects, and to look for side 
effect profiles that are similar across drugs. If 
one drug has a known target, and if another 
drug has a similar pattern of side effects, then 
the two drugs may share that target. This is
based on the assumption that side effects
arise from a few common mechanisms, and so
genes involved in these mechanisms may be targeted
by multiple drugs or drug classes. In one
study, Campillos et al. showed that they could 
create 1018 drug-drug relationships based on 
shared side effects (Campillos et al. 2008). The 
side effects were taken from the SIDER data- 
base, and the drugs came from a list of 746 
marketed drugs. Twenty of these drug-drug 
relationships were tested experimentally, and 
13 of them were shown to bind common tar- 
gets. Thus, a relatively straightforward asso- 
ciation of drugs based on side effects allowed 
the definition of molecular targets. In related 
work, Hansen et al. showed that genes could 
be ranked by their likelihood of interacting 
with a drug based on looking at the degree 
of similarity between chemical structure and 
indications-of-use between the query drugs, 
and small molecules known to interact with 
the gene products and their close protein 
interaction neighbors (Hansen et al. 2009). 
The pattern of predicted binding of a small 
molecule to protein off-targets can also yield 
information about the likely side effect profile 
for that molecule (Liu and Altman 2015). 
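
The target-transfer reasoning in this section can be sketched as follows. This is a simplified illustration, not the method of the cited studies: it assumes an unweighted Jaccard index over side-effect sets, whereas published approaches weight side effects by rarity and correlation, and all drug names, side effects, and target labels below are hypothetical.

```python
from itertools import combinations


def jaccard(set_a, set_b):
    """Jaccard index between two sets of side-effect terms."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def propose_targets(side_effects, known_targets, min_similarity=0.5):
    """Transfer known targets to drugs with similar side-effect profiles.

    side_effects maps drug -> set of side-effect terms.
    known_targets maps drug -> target label for drugs with a known target.
    Returns a list of (candidate_drug, proposed_target, similarity) tuples.
    """
    proposals = []
    for drug_1, drug_2 in combinations(sorted(side_effects), 2):
        similarity = jaccard(side_effects[drug_1], side_effects[drug_2])
        if similarity < min_similarity:
            continue
        # Transfer in whichever direction goes from known to unknown target.
        for source, candidate in ((drug_1, drug_2), (drug_2, drug_1)):
            if source in known_targets and candidate not in known_targets:
                proposals.append((candidate, known_targets[source], similarity))
    return proposals


# Hypothetical side-effect profiles and a single known drug target.
side_effects = {
    "drug_a": {"nausea", "dizziness", "dry mouth", "insomnia"},
    "drug_b": {"nausea", "dizziness", "dry mouth", "headache"},
    "drug_c": {"rash", "cough"},
}
known_targets = {"drug_a": "TARGET_1"}

proposals = propose_targets(side_effects, known_targets)
for candidate, target, similarity in proposals:
    print(f"{candidate} may share {target} (side-effect similarity {similarity:.2f})")
```

Each proposal is a candidate for experimental testing, in the same spirit as the drug-drug relationships that were tested for shared binding in the study described above.
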
The examples we have discussed have 
several common features: they deal with the 
basic objects of diseases, drugs, and disease 
or adverse-event phenotypes; they integrate 
at least two sources of data to establish new 
relationships between these basic objects; and 
they connect clinical entities (drugs and dis- 
eases or adverse events) to molecular entities. 
Such examples represent only a small subset 
of the types of questions that can be asked 
with these valuable datasets. The key techni- 
cal challenges are typically (1) finding ade- 
quate gold standards (> Chap. 2) to evaluate 
the success of methods before applying them 
for novel discoveries; (2) understanding the 
sources of error and bias so that predictions 
are as reliable as possible; (3) designing careful 
statistical tests to ensure that the scoring and 
estimates of significance are accurate and use- 
ful (minimizing false positives, in particular); 
and (4) identifying and engaging experimen- 
tal collaborators who can, when appropriate, 
test the predictions that are made in human or 
model systems. Recently, it has become clear
that despite their shortcomings, EHRs can
be extremely useful for initial validation of 
hypotheses about connections between drugs 
and adverse events (Tatonetti et al. 2011a). 
Gene-drug associations are typically tested in 
model systems with genes altered in order to 
reduce or eliminate their normal function, or 
by looking for covariation in human subjects. 


26.5.3 Challenges 
for Pharmacogenomics 


= Target Expansion: Molecules to Networks 
The emerging field of systems pharmacology 
is abandoning the view of “one drug, one tar- 
get” and moving instead toward a view that 
“the network is the target.” That is, the larger 
network of interacting genes is targeted by a 
drug at several points, and thus the systemic 
effects of drugs need to be evaluated in order 
to understand better the molecular under- 
pinnings of drug response. The challenges 
to systems pharmacology are similar to the 
challenges to the more general systems biol- 
ogy: defining the network topology and key 
players, creating ways to measure parameters, 
modeling nonlinear responses, and under- 
standing how variation in the basic molecular 
players impacts the resulting phenotype—in 
this case drug response phenotypes. 


= Rare Variants 

As whole genome sequencing increasingly 
provides data about rare variants, the para- 
digm of looking for common genetic varia- 
tion that explains variation in drug response 
will need to be modified. There may be cases 
when variation in drug response is explained 
by multiple rare variants rather than one or 
a few common variants. This is particularly 
challenging because there will often be insufficient
statistical power to evaluate rare variants. In
some cases, huge population-based studies 
may provide enough samples, but in other 
cases even these large cohorts will not have 
sufficient examples of any rare variant to 
allow statistical validation. In those cases, 
we will have to rely on computational tech- 
niques to assess the significance of very rare 
variations.


= Computational Methods to Leverage Stem
Cell-Based Model Systems and CRISPR
Assays

The rise in the use of stem cells will create 
opportunities for combining direct measure- 
ments of cellular response to drugs with 
systems models of response, whole genome 
variation, and epigenetic information. As we 
perfect methods for creating induced pluripo- 
tent stem cells and differentiating them into 
the target tissues, we will be in a position to 
measure the response to drugs directly on 
these cells, with identical genetic and per- 
haps epigenetic backgrounds. Computational 
methods for analyzing these responses and 
relating them to the expected response in the 
patients from whom these cells are derived 
will be a major challenge in the years ahead. 
Similarly, the increased availability and ease
of generating genome-wide CRISPR libraries
that allow genes to be knocked out alone and 
in combination promises to usher in an era of 
unprecedented data about the effects of genes 
on drug response (Kweon and Kim 2018). 


26.6 Ontologies for Translational 
Research 


In order to apply computational methods for 
biomarker discovery, one needs a consistent 
way to refer to genes, diseases, drugs, devices, 
etc. Several ontologies exist in the biomedical 
domain, many under active development, that 
provide the necessary terms for creating consis- 
tent annotations—preferably in an automated 
manner—for the various datasets that are at 
the core of conducting research in TBI. One 
primary need in TBI is to identify and refer 
unambiguously to diseases using one or more 
disease ontologies. We use the term disease 
ontology to refer to artifacts—terminologies 
and vocabularies as well as true ontologies— 
that can provide a hierarchy of parent-child 
terms for disease conditions. Disease-specific 
and other clinically-oriented ontologies are 
discussed in detail in » Chap. 7. 

The Ontology for Biomedical Investigations 
(OBI) was developed as a collaboration among 
a number of experimental communities 
around the world in order to represent common
aspects of biological and clinical investigations.
It includes broadly applicable terms
such as assay, as well as more specific terms, 
such as transcription profiling by array assay. It 
is particularly useful for annotation of experi- 
mental metadata, for example to record that 
a protein expression profiling assay was per- 
formed on a blood specimen (Brinkman et al. 
2010). 


26.6.1 Ontology-Related Resources
for Translational Scientists


The use of ontology-based analyses for TBI, 
especially disease and drug ontologies as well 
as analyses using multiple ontologies, is a 
recent development and the adoption and use 
of ontologies is likely to accelerate. Several 
resources are available for researchers who 
wish to use ontologies in making sense of 
large-scale datasets. The UMLS, or Unified 
Medical Language System (see » Chaps. 2 
and 7), is a set of files and software that brings 
together many health and biomedical vocabu- 
laries and standards to enable interoperabil- 
ity among computer systems. The UMLS has 
many uses, including search engine retrieval, 
data mining, public health statistics report- 
ing, and terminology research. In the field of 
TBI, the UMLS is a relatively underutilized 
resource, but that is changing with the increase 
in the variety of access options (Aronson 
2001; Bodenreider 2004; Aronson et al. 2008; 
Shah and Musen 2008; Aronson and Lang 
2010; Mork et al. 2010) and heightened dis- 
semination efforts by the National Library of 
Medicine. 

The National Center for Biomedical 
Ontology maintains a repository of biomedical
ontologies called BioPortal (Musen et al.
2011), which provides access through both
Web pages and Web Services to more than 
600 biomedical ontologies and controlled 
terminologies. Users go to the BioPortal Web 
site to browse biomedical ontologies and to 
search for specific ontologies relevant to their 
work. BioPortal also provides tools such as 
the Ontology Recommender (Jonquet et al. 
2010), which takes as input representative textual
data relevant to a domain of interest and
returns as output an ordered list of ontologies 
that would be most appropriate for annotating 
the corresponding text. By browsing ontolo- 
gies on BioPortal and using tools such as the 
ontology recommender, a cancer biologist 
may find, for example, that although the Gene 
Ontology offers some terms for annotating 
her experimental data related to cell division, 
there are more precise terms in the NCIt. She 
may discover that the Foundational Model 
of Anatomy ontology provides terms for 
consistently naming body parts from which 
the experimental specimens were obtained, 
or that the National Drug File — Reference 
Terminology (NDF-RT) provides the prop- 
erties of the drugs used in generating the 
experimental data. BioPortal allows users to 
navigate ontologies using a tree browser or
visualize them as a graph that offers cognitive
support for understanding the complexities
of large ontologies (Fig. 26.6).

To provide the relationships between terms 
in two different ontologies, BioPortal provides 
mappings between the terms (Ghazvinian 
et al. 2009). The mappings can inform the user 
that the term Melanoma in the NCI Thesaurus 
is related to the term Malignant Melanoma
in SNOMED-CT and to Melanoma in the 
Human Disease Ontology. These mappings 
allow users to compare the use of related 
terms in different ontologies and to analyze 
how whole ontologies compare with one 
another (Ghazvinian et al. 2011). In addition 
to curated mappings from the UMLS Metathesaurus,
BioPortal enables algorithmic and
user-generated mappings as well.


26.6.2 Enrichment Analysis 


Enrichment analysis is a statistical method 
to determine whether, for a set of items, a 
given concept or value is statistically over- 
represented compared to what one would 
expect by chance. For example, informatics- 
related terms are over-represented in this 
book compared to what one would expect to 
find in a random sampling of words from all 
textbooks.

Fig. 26.6 A portion of the National Cancer Institute’s
thesaurus. The left pane shows a standard tree
view for the term ‘Melanoma’. The right pane shows a
visualization that provides additional context by showing
the parent classes of melanoma, all the way to the
root node of ‘Disease, Disorder or Finding’. The navigation
bar just above the graphical visualization provides
access to additional information, such as mappings
which provide hooks into other disease ontologies that
contain the concept Melanoma

The canonical example of enrichment
analysis involves a list of genes differentially
expressed in some condition. To
determine the biological meaning of such a
list, the usual solution is to perform enrichment
analysis with the GO (Gene Ontology),
which provides terms for consistent naming
of the cellular component (CC) of gene
products, the molecular functions (MF) they
carry out, and the biological processes (BP) in
which they participate. Several curation projects
use GO terms to annotate gene products
from multiple organisms with terms from the
three branches (CC, MF, BP) (Camon et al.
2003). These annotations form the basis for
enrichment analysis in which we can aggregate
the annotating GO concepts for each gene in
this list, and arrive at a profile of the biologi- 
cal processes or mechanisms affected by the 
condition under study. This approach does 
have certain limitations, for example incom- 
plete annotations for a number of genes, 
lack of conditional independence between 
annotations, sensitivity to GO version, and 
lack of a systematic mechanism to compen- 
sate for differing levels of depths in different 
branches of the ontology hierarchy (Khatri 
and Draghici 2005; Rhee et al. 2008; Tomczak 
et al. 2018). Despite this, such analysis is 
widely popular in the bioinformatics commu- 
nity and has resulted in over 100 tools listed
on the GO website⁷ and over 7000 citations
to the landmark paper on the Gene Ontology 
(Ashburner et al. 2000). 
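
One common way to quantify over-representation is a one-sided hypergeometric test. The sketch below is a minimal illustration under that assumption, not the method of any particular GO tool; the function name and the gene counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes when `sample`
    genes are drawn without replacement from `population` genes, of which
    `annotated` carry the term of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

if __name__ == "__main__":
    # Hypothetical counts: 20,000 genes in the background, 400 annotated
    # with the term, 100 differentially expressed, 10 of those annotated.
    p = hypergeom_enrichment_p(population=20000, annotated=400,
                               sample=100, hits=10)
    print(f"p-value for over-representation: {p:.2e}")
```

Because many terms are tested at once, and because annotations are not independent (one of the limitations noted above), p-values from such a test are usually adjusted before interpretation.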

Disease and drug ontologies can be used 
to perform enrichment analysis in a manner 
similar to GO-based analyses for gene expres- 
sion data (Subramanian et al. 2005; LePendu 
et al. 2011a). Just as scientists can ask “Which
biological process is over-represented in my
set of interesting genes or proteins?”, we can
also ask “Which disease (or class of diseases)
is over-represented in my set of interesting
genes or proteins?” For example, by annotating
known protein mutations with disease terms, 
Mort et al. were able to identify a class of 
diseases—blood coagulation disorders—that 
were associated with a lower than expected 
rate of amino acid substitutions at O-linked 
glycosylation sites (Mort et al. 2010). 


26.7 Natural Language Processing 
for Information Extraction 


Ontologies are also useful in the context of 
extracting information from a body of text. 
In-depth methods for natural language processing
are discussed in Chap. 7. Here we
describe some applications in the context of 
translational research. 


26.7.1 Mining Electronic Health Records


Researchers have shown that it is possible to
profile patient cohorts from EHRs using a
variety of ontologies including SNOMED
CT, MedDRA, and RxNorm (LePendu et al.
2011b). For example, LePendu et al. developed
methods to annotate clinical text and
to mine the resulting annotations to compute
the risk of myocardial infarction in patients
taking Vioxx (rofecoxib) for rheumatoid
arthritis. Subsequently they demonstrated that
annotation analysis methods applied to
electronic medical records can detect drug
safety signals up to 2 years before a drug’s
recall (LePendu et al. 2013).

7 http://www.geneontology.org/GO.tools.shtml#alphabet (Accessed 12/3/2012).

Mining EHR data has also been pro- 
posed as a solution to the challenge of the 
large number of subjects that are needed for 
genome wide association studies (GWAS). 
Patients are increasingly able to consent to, or 
in some cases to opt out of, allowing excess 
biospecimens taken in the course of clinical 
care to be used in a de-identified fashion for 
genomic testing. Even for relatively strong 
genetic effects, GWAS requires thousands 
of individuals for sufficient statistical power
(Chap. 11). For weaker effects, tens of
thousands of subjects are likely to be needed.
Although the cost of genotyping continues
to decrease, recruitment and sample collection
for these large numbers are both costly and
labor-intensive. Leveraging the health care
system and EHRs for research recruitment 
offers a potential approach to circumvent this 
problem. Ritchie et al. demonstrated the feasi- 
bility of this approach by using EHR data and 
an associated biobank to replicate a number 
of previously discovered genotype-phenotype 
associations (Ritchie et al. 2010). 

One major initiative in this area is the 
eMERGE (Electronic Medical Records and 
Genomics) Network, whose initial aim was to 
demonstrate that data captured through rou- 
tine clinical care are sufficient to identify cases 
and controls accurately for GWAS (Thorisson 
et al. 2005). As of 2018, the eMERGE consor- 
tium includes eleven institutions with DNA 
repositories and associated electronic medical 
record systems. For each site, ontology-based 
data extraction and natural language process- 
ing algorithms are applied to the EHR in order 
to determine phenotypes such as dementia, 
cataracts, peripheral artery disease, type 2 
diabetes, and cardiac conduction defects. This 
analysis is performed in a high-throughput, 
scalable fashion with results compared to a 
manually curated gold standard in order to 
determine positive and negative predictive val- 
ues for cases and controls for the phenotypes 
in question (Kho et al. 2011). The consortium
also examines cross-institutional application
of algorithms; ethical, legal, and social issues
around DNA biobanks; and the potential for
future incorporation of GWAS findings into
clinical care (Liu et al. 2012; Rohrer Vitek
et al. 2017; Wan et al. 2017).
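
The comparison of an automated phenotype algorithm against a manually curated gold standard comes down to two ratios. The following sketch uses the standard definitions of positive and negative predictive value; the function name and the example labels are hypothetical and do not come from eMERGE data.

```python
def predictive_values(predicted, gold):
    """Return (PPV, NPV) for parallel lists of booleans.

    predicted[i] is the algorithm's case/control call for patient i,
    gold[i] is the manually curated label for the same patient.
    """
    tp = fp = tn = fn = 0
    for call, truth in zip(predicted, gold):
        if call and truth:
            tp += 1
        elif call and not truth:
            fp += 1
        elif not call and not truth:
            tn += 1
        else:
            fn += 1
    # PPV: fraction of algorithm-called cases that are true cases.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    # NPV: fraction of algorithm-called controls that are true controls.
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv

if __name__ == "__main__":
    # Hypothetical calls for eight chart-reviewed patients.
    algorithm = [True, True, True, False, False, False, False, True]
    curated = [True, True, False, False, False, True, False, True]
    ppv, npv = predictive_values(algorithm, curated)
    print(f"PPV={ppv:.2f} NPV={npv:.2f}")
```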

These types of EMR-associated biobank
resources enable a number of other approaches
to data mining. For example, Denny et al. used
BioVU at Vanderbilt University to perform 
what they called a “PheWAS,” or a system- 
atic, high-throughput phenome-wide asso- 
ciation scan (Denny et al. 2010). Instead of 
measuring whole genomes across thousands 
of patients in order to find a gene associated 
with a phenotype in question, they measured 
only five alleles across thousands of patients 
and performed enrichment analysis for various
diseases based on ICD-9 codes. They then
were able to reproduce known associations 
between those genes and certain diagnoses, 
and to generate new hypotheses for associa- 
tions between these genes and other diagnoses 
that were statistically enriched for a given gen- 
otype. The ability to connect, at a molecular 
level, diseases that were not previously asso- 
ciated can have implications for therapeutic 
intervention (Denny et al. 2016). 


26.7.2 Dataset Annotations 


In addition to EHRs, public repositories for 
omics-scale datasets remain a valuable but 
underutilized resource for data mining. Upon 
submission, these datasets typically con- 
tain only free-text descriptions. Addressing 
the lack of annotations, researchers demon- 
strated that translational analyses are enabled 
by automatically annotating tissue and gene 
microarray datasets with ontology terms 
(Shah et al. 2009a; Doan et al. 2014). Butte 
et al. employed a crowd-sourcing approach to 
annotate samples from the Gene Expression 
Omnibus (Hadley et al. 2017). Such auto- 
mated annotation approaches have been gen- 
eralized to create systems that process the free 
text metadata of diverse database elements 
such as gene expression data sets, descriptions 
of radiology images, clinical-trial reports, and 
PubMed article abstracts to annotate and 
index them with concepts from appropriate 
ontologies (Jonquet et al. 2011). Doing so
has enabled novel analyses from already collected
molecular and clinical data (Garber
et al. 2017; Sweeney et al. 2018a, b). Such 
annotation represents a large part of the work 
required to address the ‘F’ in FAIR data, 
making data ‘findable.’ 
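
To make the link between annotation and findability concrete, the sketch below tags free-text dataset descriptions with ontology concepts and builds an inverted index from concept to dataset. It is a deliberately naive dictionary lookup, not the pipeline of any system cited above; the lexicon, dataset identifiers, and descriptions are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon: surface phrase -> (ontology, concept label).
LEXICON = {
    "melanoma": ("NCIt", "Melanoma"),
    "liver": ("FMA", "Liver"),
    "cell division": ("GO", "cell division"),
}

# Hypothetical free-text metadata for two submitted datasets.
DATASETS = {
    "dataset_1": "Expression profiling of melanoma metastases to the liver",
    "dataset_2": "Time course of cell division in cultured liver cells",
}

def build_index(datasets, lexicon=LEXICON):
    """Map each ontology concept to the set of datasets that mention it."""
    index = defaultdict(set)
    for dataset_id, description in datasets.items():
        lowered = description.lower()
        for phrase, concept in lexicon.items():
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                index[concept].add(dataset_id)
    return dict(index)

if __name__ == "__main__":
    for concept, ids in sorted(build_index(DATASETS).items()):
        print(concept, sorted(ids))
```

Once such an index exists, a query for a concept returns every dataset annotated with it, regardless of how the submitter happened to phrase the description. Real annotators additionally handle synonyms, negation, and ontology hierarchy, which this sketch ignores.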

The utility of consistent annotation of 
research datasets is now widely accepted. As a 
result, there are several initiatives to build tools 
for consistent metadata assignment, as well
as indices of available datasets corresponding
to specific terms of interest. CEDAR, the
Center for Expanded Data Annotation and
Retrieval, was formed with the goal of developing
information technology to facilitate
authoring and adoption of metadata (Musen
et al. 2015). bioCADDIE (biomedical and
healthCAre Data Discovery Index Ecosystem)
is an international effort to promote biomedi- 
cal data discovery through the creation of a 
data discovery index called DataMed (Chen 
et al. 2018). 


26.8 Network Analysis 


Biology lends itself in various ways to mod- 
eling through networks or graphs. The term 
“graph” simply refers to a set of nodes or 
circles connected by a set of edges or lines. 
In a molecular context, a node represents a 
molecular entity, and an edge represents some 
form of relationship between those molecular 
entities. This relationship may be a physical 
interaction (e.g., binds to), an influence (e.g., 
activates), or a similarity (e.g., is co-expressed 
with), among other possibilities. One fre- 
quently sees graphical models of gene regu- 
latory networks, protein-protein interactions, 
and signaling cascades. The set of all of these 
sorts of physical interactions has been referred 
to as the interactome (Barabasi et al. 2011). 
Studying this interactome and its properties 
from a graph theory perspective enables use- 
ful insights regarding gene modules and path- 
ways, and how these are disrupted in disease. 
A number of researchers have attempted to 
develop gene association networks using gene 
expression data either alone or together with 
other sources of network data such as protein- 
protein interactions. The general idea is that
co-expressed genes are likely to interact with
each other or participate in the same pathway.
But of course, correlation does not equal causality,
and to be useful from a translational
perspective, it is important to know the directionality
of the influence between two molecules.
Consider two genes, X and Y, whose
expression is correlated (see Fig. 26.7).
One can conclude that the genes interact in
some way, whether directly or indirectly (i.e.,
through another molecule). However, without
additional knowledge, we cannot
know whether X influences Y (Fig. 26.7a),
Y influences X (Fig. 26.7b), or both are
influenced by a third causal gene, Z (Fig. 26.7c). Which
model represents the true underlying relationship
is important to know because if Y
is involved in poor outcome, then targeting X
will help to alleviate this condition in the first
model, but not in the second or third.

Fig. 26.7 Three possible causal relationships
between two co-expressed genes. a Gene X affects Y. b
Gene Y affects X. c Both X and Y are affected by a third
causal gene Z
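
A small simulation can illustrate why co-expression alone cannot separate the three models of Fig. 26.7, and why perturbing X can. The linear models, noise levels, and function names below are assumptions chosen for illustration only.

```python
import random

def simulate(model, n=2000, seed=0, do_x=None):
    """Draw n (x, y) pairs from a toy linear version of model a, b, or c.

    If do_x is given, X is clamped to that value, mimicking an
    experimental perturbation of gene X.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        noise_x, noise_y = rng.gauss(0, 0.5), rng.gauss(0, 0.5)
        if model == "a":      # X affects Y
            x = rng.gauss(0, 1) if do_x is None else do_x
            y = x + noise_y
        elif model == "b":    # Y affects X
            y = rng.gauss(0, 1)
            x = y + noise_x if do_x is None else do_x
        elif model == "c":    # a third gene Z affects both
            z = rng.gauss(0, 1)
            x = z + noise_x if do_x is None else do_x
            y = z + noise_y
        else:
            raise ValueError(f"unknown model: {model}")
        pairs.append((x, y))
    return pairs

def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    return cov / (vx * vy) ** 0.5

def mean_y(pairs):
    return sum(y for _, y in pairs) / len(pairs)

if __name__ == "__main__":
    for model in ("a", "b", "c"):
        r = pearson(simulate(model))
        shifted = mean_y(simulate(model, do_x=2.0))
        print(f"model {model}: observed r={r:.2f}, "
              f"mean Y after clamping X to 2 = {shifted:.2f}")
```

In all three models the observed correlation is strongly positive, but only in model a does clamping X move the average level of Y.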

One way to determine the actual underly- 
ing relationship, used frequently in model sys- 
tems, is to actively perturb a specific variable 
in the system. If the other molecule changes 
accordingly, then we know that the perturbed 
variable was causal. This strategy is used
frequently in systems biology
(see Chap. 9). Unfortunately, this is much
harder to do in human beings than in E. coli or
yeast. One clever approach to determination 
of causality in human biological networks is 
to integrate gene expression with genotypic 
information, in which case DNA sequence can 
be assumed to be the independent variable. If 
differential gene expression is correlated with 
differential genotype, one can conclude that 
the genotype caused the gene expression pattern
and not the other way around. This is the
basis for the approach taken by Eric Schadt
et al. to develop probabilistic causal networks 
which can then be used to identify key drivers 
of disease (Zhu et al. 2008). 

Network analysis in translational research 
need not be confined to concrete objects such 
as molecules. The Human Disease Network is 
a graphical model where nodes represent both 
known disease genes and disorders, linked by 
known associations between a given gene and 
disease (Goh et al. 2007). Figure 26.8 shows
the “diseasome” bipartite network, as well
as the Human Disease Network, which con- 
nects diseases based on common genes, and 
the Disease Gene Network, connecting genes 
based on diseases in common. Combining 
these disparate data types enables a graph 
theoretic approach to study the genetic basis 
for disease. Using this framework, one can 
analyze similarity between genes based not 
on co-expression or GO term annotation but 
based on the pathologies in which a gene is 
known to be involved. Such similarities could 
easily go undetected through gene expression 
analysis if, for example, the different diseases 
are caused by over-activation or inhibition 
of the gene respectively. A disease-gene net- 
work also enables the comparison of diseases 
not traditionally studied together, based on 
common underlying molecular mechanisms. 
Identifying disease similarities based on gene 
expression requires that one analyze expres- 
sion data from those two diseases together 
in the first place, making it more difficult 
to discover novel, previously unsuspected 
relationships. 
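
The two projections of the bipartite gene-disease network can be sketched in a few lines. The associations below are placeholders, not OMIM records, and the function name `project` is an assumption.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy gene-disease associations (placeholders, not OMIM data).
ASSOCIATIONS = [
    ("GENE_A", "disease_1"),
    ("GENE_A", "disease_2"),
    ("GENE_B", "disease_2"),
    ("GENE_B", "disease_3"),
    ("GENE_C", "disease_3"),
]

def project(pairs, onto="disease"):
    """Project the bipartite graph onto one node type.

    onto="disease": link two diseases if they share a gene
                    (a Human Disease Network).
    onto="gene":    link two genes if they share a disease
                    (a Disease Gene Network).
    """
    groups = defaultdict(set)
    for gene, disease in pairs:
        if onto == "disease":
            groups[gene].add(disease)
        else:
            groups[disease].add(gene)
    edges = set()
    for members in groups.values():
        # Every pair of nodes sharing a neighbor becomes an edge.
        for u, v in combinations(sorted(members), 2):
            edges.add((u, v))
    return edges

if __name__ == "__main__":
    print("disease network:", sorted(project(ASSOCIATIONS, "disease")))
    print("gene network:", sorted(project(ASSOCIATIONS, "gene")))
```

Note that GENE_A and GENE_B become neighbors in the gene projection purely because both are implicated in disease_2, with no reference to expression data.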

Building upon the Human Disease 
Network, Yildirim et al. created a network of 
drug-gene target interactions, thus enabling 
an additional layer of analysis regarding 
similarity between different drugs based on 
targeted genes, and between target molecules 
based on the drugs that target them (Yildirim 
et al. 2007). This type of network can be used 
as the basis for a number of different obser- 
vations, including trends in drug develop- 
ment over time. For example, analysis of the 
structure of the graph revealed significant 
clustering of drug-gene interactions, suggesting
a significant “me too” pattern to drug
development (see Fig. 26.9). Inclusion of
drugs still under investigation, i.e., not yet
FDA approved at the time of analysis, demonstrated
that the breadth of drug targets is
expanding, suggesting a trend toward target
diversity. Incorporating the cellular component
of target proteins showed that the distribution
of cellular location for target proteins,
previously nearly two-thirds membrane-associated,
is becoming more diverse, better
matching the known distribution for disease
proteins. Finally, this group incorporated
protein-protein interaction data to facilitate
the study of network properties of drug target
gene products. They looked for the shortest
path between drug target genes and known
disease genes for the disorder that the drug was
intended to treat and found that the length of
this path appears to be decreasing over time, suggesting
that drugs are moving from a palliative
approach (i.e., treating the symptoms and not
the cause) to rational drug design (Yildirim
et al. 2007). More recently, Hu et al. used a
network-based approach to look at comorbidities
and the effects of environmental
perturbations (Hu et al. 2016). Using this
approach they were able to generate hypotheses
for molecular mechanisms of comorbidity
which in turn can facilitate drug repurposing
and the development of targeted therapeutics.

Fig. 26.8 The Human Disease Network. The middle
panel shows a small subset of the bi-partite gene-disease
network based on OMIM (Online Mendelian Inheritance
in Man) gene-disease relationships. The Human
Disease Network on the left shows diseases as nodes,
with connections representing common related genes.
The Disease Gene Network on the right depicts genes as
nodes, with connections indicating that they have been
implicated in one or more of the same disorders. (From
Goh et al. (2007), ©2007 National Academy of Sciences,
U.S.A. with permission)
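
The shortest-path measurement between drug targets and disease genes can be illustrated with a breadth-first search over a protein-protein interaction graph. This is a generic sketch, not the authors' code; the protein names and the adjacency structure are hypothetical.

```python
from collections import deque

# Hypothetical undirected interaction network as an adjacency mapping.
PPI = {
    "P1": {"P2"},
    "P2": {"P1", "P3"},
    "P3": {"P2", "P4"},
    "P4": {"P3"},
    "P5": set(),  # isolated protein with no known interactions
}

def shortest_distance(graph, sources, targets):
    """Fewest interaction steps from any source to any target.

    Returns 0 when a drug target is itself a disease gene and None
    when no path exists.
    """
    targets = set(targets)
    seen = set(sources)
    queue = deque((node, 0) for node in seen)
    while queue:
        node, dist = queue.popleft()
        if node in targets:
            return dist
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

if __name__ == "__main__":
    print(shortest_distance(PPI, sources={"P1"}, targets={"P4"}))
    print(shortest_distance(PPI, sources={"P3"}, targets={"P3"}))
    print(shortest_distance(PPI, sources={"P5"}, targets={"P4"}))
```

Under this reading, a distance of 0 corresponds to a drug that acts directly on a disease gene product, while larger distances correspond to drugs acting further from the known cause.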


26.9 Basepairs to Bedside 


Although the sequenced human genome has 
not been a panacea for human disease, it has 
enabled the beginnings of a new approach 
to human health and to the practice of pre- 
cision medicine (Collins and Varmus 2015). 
As the price of genomic sequencing falls and 
as our knowledge regarding the meaning of 
genomic variation increases, genotypic data
is poised to become a standard component
of a person’s medical record. In this section
we describe the translational path of genomics,
from sequencing in the lab to clinical relevance
for individuals.

Fig. 26.9 Yildirim et al.’s Drug-Target network.
Circles represent FDA-approved drugs and rectangles
represent target proteins. Diseases are color-coded by
anatomical system and protein targets according to
their cellular location. Clusters of drugs associated with
one target reflect the pharmaceutical industry’s tendency
to develop ‘follow-on’ drugs. (Courtesy of Albert-Laszlo
Barabasi, MD, with permission)


26.9.1 Whole Genome Sequencing


Technologic Advances
The DNA-probe approach to genotyping,
described in Chap. 9, may be compared
to looking for one’s car keys under the proverbial
street lamp. That is, the technology
shines the light on a certain portion of the
genetic landscape, and that is where we look.
Which SNPs are included on a chip is determined
in large part by which SNPs have been
detected in the past, for example through the
HapMap project (Kang et al. 2006). A number
of new associations have been found in
this way, whether because the SNPs themselves
were of interest, or due to genetic linkage,
the tendency for alleles located close to
one another on a chromosome to be inherited
together. However, more recent findings have
demonstrated that the concept of “common
disease-common variant” is flawed (Zhu et al.
2011). Indeed, there has been some disappointment
in the extent to which GWAS has
been able to explain common diseases with
known genetic components (Manolio et al.
2009). Whole genome or, in some cases, whole
exome sequencing allows researchers to identify
rare variants (i.e., those with a minor allele
frequency of <1%) that account for genetic
disease. Advances in sequencing technologies
(e.g., “nextgen” sequencing, and third-generation
sequencing, see Chap. 9) and the
corresponding decrease in cost, make genome-scale
sequencing increasingly feasible in translational
research and even in clinical care.

Whole Genome Versus Exome

Even with recent advances in genome 
sequencing technology, the cost to sequence a 
full genome at a rate of coverage that enables 
the identification of novel SNPs is still significant,
on the order of one thousand dollars.
However, recall that only about 1% of
the genome actually codes for proteins and
85% of known disease-causing mutations
with large effects occur in protein-coding regions (Choi
et al. 2009). One way to further decrease the 
time and cost of sequencing is to look at only 
those stretches that code for actual proteins. 
This can be justified because most variants 
known to underlie Mendelian disorders dis- 
rupt protein-coding sequences. Of course, 
this approach will miss causal variations if 
they exist in the other 99% of the genome. 
Moreover, a recent cluster of publications 
from the ENCODE (Encyclopedia of DNA 
Elements) Consortium asserts assignment of 
biochemical function for 80% of the genome 
(Dunham et al. 2012). Some additional com- 
ponents, such as regulatory regions or splice 
acceptor and donor sites may be included as 
well to increase sensitivity without incurring 
significant additional cost. Exome sequencing 
in a small number of individuals has been used 
to identify the causal variant for rare diseases 
such as Miller syndrome, a multiple malformation
disorder (Ng et al. 2010), and Proteus
syndrome, a disorder causing the overgrowth
of tissues and organs, thought to have afflicted
the nineteenth-century Englishman known as
The Elephant Man (Lindhurst et al. 2011). 


Genomic Data Sharing

In 2008, researchers demonstrated that the 
presence of a single genome within a complex 
mixture of DNA samples could be ascertained 
(Homer et al. 2008). This caused both NIH
and the Wellcome Trust⁸ to limit access not
only to individual genomes, but to aggregate
genomic information as well. (Note that the
ability to determine the presence or absence
of an individual’s DNA in a heterogeneous
sample presupposes the availability of detailed
genomic information about the individual in
question.) These actions prompted responses
that ranged from “too little, too late” to “a
heavy-handed bureaucratic response to a
practically minimal risk that will unnecessarily
inhibit scientific research” (Church et al.
2009). Current NIH policy allows investigators
to submit a data access request to be
reviewed by an NIH Data Access Committee.
Access to data is granted once a Data Use
Certification is co-signed by both the investigator
and the appropriate official[s] at the
investigator’s affiliated institution.⁹

8 http://www.wellcome.ac.uk/ (Accessed 10/28/2018).

The Global Alliance for Genomics &
Health (GA4GH) is a policy- and standards-oriented
organization aimed at enabling responsible
data sharing (Terry 2014). It was founded
in 2013 when 50 colleagues from 8 countries
met to discuss challenges and opportunities
in genomic research and medicine, and
now comprises more than 500 organizational
members from 71 different countries. A number
of Work Streams and Driver Projects
guide development efforts and serve to pilot
the organization’s tools. In a similar vein, the
RDA (Research Data Alliance) is an international
organization meant to promote
data sharing and data-driven research, and
to develop and promote the technical infrastructure
to that end. The RDA’s scope goes
beyond genomic and biomedical data, but its
mission is highly aligned with that of GA4GH. One
of the RDA’s many Working Groups is
called “FAIRSharing Registry WG: connecting
(meta) data standards, repositories, and
policies.”¹⁰

Additional online genomic resources 
include TCGA (The Cancer Genome Atlas) 
and WTCCC (Wellcome Trust Case Control 
Consortium). TCGA is a joint effort of the 
National Cancer Institute (NCI) and the 
National Human Genome Research Institute 
(NHGRI) to accelerate understanding of the 
molecular basis for cancer through application
of genomic technologies, including
genome sequencing.¹¹ The WTCCC, established
in 2005, comprises 50 research groups
across the UK who have performed a series
of genome-wide association studies and made
the data available through application to a
Consortium Data Access Committee.¹²

9 http://grants.nih.gov/grants/guide/notice-files/NOT-OD-07-088.html (Accessed 10/29/2018).

10 https://rd-alliance.org/group/fairsharing-registry-connecting-data-policies-standards-databases.html (Accessed 10/29/2018).


26.9.2 Here Are Some Human Beings


The Personal Genome Project (PGP)
Promising though genomic medicine may
be, much remains to be worked out technically,
scientifically, and from the ELSI (ethical,
legal, and social implications) perspective
(see Chap. 12). Some notable pilot projects
have been embarked upon in order to catalyze
progress in all of these areas. In 2007, Craig
Venter became the first person to have his
complete genome published. Since then, a number of
human genomes have been sequenced, and 
some of those have been made available in the 
public domain. The question becomes: now 
what? What can any given individual learn 
from his or her complete genomic sequence? 
What does an individual want to learn, or not 
want to learn, as the case may be? The only 
reliable way to answer these questions is with 
empirical input. 

George Church is a pioneer in genomic 
sequencing, inventor of the Polonator 
sequencer, and founder of personal genome 
sequencing company Knome. In 2005, 
Church started the Personal Genome Project 
(PGP), ultimately aiming to sequence 100,000 
individuals in order to advance understand- 
ing of how genes contribute, along with 
environment, to human traits. The project
“hopes to make personal genome sequencing
more affordable, accessible, and useful
for humankind.”¹³ A vanguard of ten volunteers,
the PGP-10, was selected to have
their genomes sequenced. This endeavor differs
from other projects in one crucial way: in
addition to making the sequence data publicly
available, complete phenotypic data, including
personal and health information, family
history, and even name and photographs
would be shared as well. This was a departure
from the type of projects the NIH typically
funds and supports. Generally, informed consent
includes information on how the research
team plans to secure privacy and confidentiality
for the subject. In this case, sharing of
personal data was part of the protocol itself.
The first set of integrated data from this group
was made available in October 2008.

11 http://cancergenome.nih.gov/ (Accessed 10/29/2018).

12 https://www.wtccc.org.uk/index.shtml (Accessed 10/29/2018).

13 http://www.personalgenomes.org/ (Accessed 10/29/2018).

Making this type of data both publicly 
available and personally identifiable was 
stepping out into socio-scientific terra incog- 
nita, generating some worry that it could 
affect health care, employment, insurance, 
and more. In 2008, the Genetic Information 
Nondiscrimination Act (GINA) was signed 
into law, but its scope is limited to employ- 
ment and health care insurance. It does not 
address life, disability, or long term care insur- 
ance (Hudson et al. 2008; Tenenbaum and 
Goodman 2017). Though rare, there are a few 
notorious examples of lawsuits where employ- 
ers performed genetic and health-related 
testing on employees without their consent 
(Angrist 2010), and though unlikely, the PGP 
warns prospective participants that their DNA 
could be artificially synthesized and planted at 
a crime scene (Lunshof et al. 2010). 

As much as the PGP has pushed the 
boundaries and helped to advance the tech- 
nology, data management, and clinical issues 
involved with personal genomes, and will con- 
tinue to do so, it also serves as a weather bal- 
loon from the ELSI perspective, generating 
empirical data on sociological atmosphere, 
ethical pressures, and legal winds of change 
(see Chap. 12 for additional discussion on
these points). Misha Angrist, a bioethicist at 
Duke University, is PGP Participant #4. As 
documented in his book Here Is a Human 
Being: At the Dawn of Personal Genomics, the 
early sequencing was slow going, the technol- 
ogy took time to work out the kinks, and the 
preliminary results were underwhelming even
to the individuals who had been sequenced
(Angrist 2010). The infrastructure is not yet in
place to empower someone with a complete
genomic profile to do much with that information.
Angrist describes his own attempts
to make use of tools for genomic interpretation
(SNPedia,¹⁴ Sequence Variant Analyzer
(Ge et al. 2011), and the Church lab’s open
source Trait-o-Matic¹⁵), which he compared
to the dial-up days of the internet. Out of all
of the variants carried by the PGP-10, only
one was deemed serious. Steven Pinker carried
a mutation in MYL2, which had been shown
in some cases to cause hypertrophic cardiomyopathy
(Angrist 2010).

The resource created by the PGP
enabled the Critical Assessment of Genome
Interpretation (CAGI) to create a community
challenge in which researchers
were asked to predict whether an individual
had a particular trait or profile based on that
individual’s whole genome. Overall, findings showed that
predicting individual traits is difficult and that 
matching genomes to trait profiles depends 
strongly on a small number of common traits 
like ancestry, blood type, and eye color (Cai 
et al. 2017). Equally important, however, the 
project has created a publicly available, inte- 
grated resource for genomic, environmental, 
and trait (GET) data (Lunshof et al. 2010) 
and an empirical test bed for tackling the ELSI 
issues brought to bear by such a resource. 


A Personal Genome for Clinical Assessment
As another proof of concept, collabora- 
tors at Stanford and Harvard did a complete 
sequencing, analysis, and genetic counseling 
for a 40-year-old male with family history of 
sudden death from cardiac arrest (Ashley et al. 
2010). The goal was to determine how whole 
genome sequencing would translate to clini- 
cal application. The patient was found to have 
increased risk for myocardial infarction, type 2 
diabetes, and some cancers. While most of the
findings were not actionable, the patient had
both increased risk for cardiovascular disease
and a genetic predisposition to benefit from the use
of statins and aspirin. Despite this, just over a
year after publication, the patient maintained
that he had “not been convinced that statins
or aspirin would have enough beneficial effect
relative to their risks,” and had therefore not
changed his pharmaceutical behavior (Quake,
S, 2011, personal communication).

14 http://www.snpedia.com/index.php/SNPedia (Accessed 12/6/2012).

15 https://github.com/xwu/trait-o-matic/wiki (Accessed 12/19/2012).


Just over a year after the Quake profile, the 
same group published their findings from per- 
forming whole exome sequencing on the first 
healthy nuclear family (Dewey et al. 2011). 
They generated an ethnically concordant 
reference sequence (i.e. a reference sequence 
based on a European population, reflecting 
the European background of the family in 
question), which enabled increased accuracy 
for rare mutations. Findings included high-resolution
inference of sites of recombination
(i.e., where the parents’ chromosomes “cross
over” during meiosis), and a novel approach
to HLA (Human Leukocyte Antigen) typ- 
ing—important for risk in a number of dis- 
eases, particularly autoimmune disorders. 
For the family in question, they were able to 
determine that the father had passed down to 
his daughter a mutation for Factor V Leiden 
that poses increased risk for blood clotting. 
This is actionable information for women 
as the Factor V mutation is a contraindica- 
tion for estrogen-based birth control pills 
(Singer 2011), and inherited thrombophilia
is a known risk factor for adverse pregnancy
outcomes (Tenenbaum et al. 2012). Note that
Factor V mutations are also included in chip- 
based genotyping services, so whole genome 
sequencing was not the key enabling technol- 
ogy in this case. 

One key item reported in the paper by 
Ashley et al. was the fact that, in the absence 
of a centrally curated resource of all rare 
and disease-associated variants, the authors 
spent hundreds of hours reviewing databases. 
Moreover, the work was a collaborative effort 
among a number of highly trained experts in 
clinical genetics, genetic counseling, bioinfor- 
matics, internal medicine, pharmacogenom- 
ics, etc. (Ormond et al. 2010). Clearly new 
tools, automation, and infrastructure, as well
as a whole new paradigm in genetic counsel- 
ing, are required to make incorporation of 
genomic data into health care feasible for the 
population at large. 


= Ethical, Legal, and Social Issues (ELSI) 
Pursuit of genomic medicine raises a number 
of ethical, legal, and social issues (see also 
> Chap. 12). Some worry that people are ill- 
equipped to process the results of these tests. 
But it is not clear that a paternalistic approach 
is a better alternative; there was a time when 
it was considered acceptable for a doctor not 
to disclose a cancer diagnosis to the patient 
himself (Novack et al. 1979). In addition, new
discoveries are being made all the time: what
are the obligations to follow up if something
new (and dire? and actionable?) is discovered
about a given subject? Other questions include
whether enough is known for the results to be 
of any practical use, whether the service should 
be provided outside of the context of a rela- 
tionship with a clinical caregiver, and whether 
results could have detrimental effects on a person’s
ability to secure health insurance. Some
states have banned the services; others have
made stipulations requiring clinician involvement
and CLIA certification for the labs that
handle the samples and process the results (see footnote 16).


In a companion article to the Quake pro- 
file, it was asserted that consent for a process 
in which the risks of knowledge gained are 
not wholly understood is more complex than 
for simple genetic testing. People have trouble 
interpreting probabilities. Patients must be 
advised that they may find out things they did 
not want to know about. The eminent scientist 
James Watson made a point of requesting that 
his ApoE status be redacted from the release 
of his full genome because he did not want to 
know if he was at risk. His grandmother had 
died of Alzheimer’s at 83, and he did not want 
to worry that every subsequent memory lapse 
marked the onset of dementia (Angrist 2010). 


16 » http://www.genomeweb.com/dxpgx/will-other-states-follow-ny-calif-taking-dtc-genetic-testing-firms (Accessed 12/6/2012).


Statistics predict that any given patient will 
find out he is a carrier for some lethal autoso- 
mal recessive disease. Illness aside, the average 
global non-paternity rate has been estimated 
to be as high as 10% (Olson 2007), though it 
is likely closer to 1% (Larmuseau et al. 2016). 
Genetic information could also have impli- 
cations for the patient’s children, present or 
future, and for other family members. Patients, 
this group concluded, must have access to 
trained professionals to provide answers to 
their questions, where answers exist. This will 
be difficult, lengthy, and expensive, but not to 
do it would undermine the consent process 
(Ormond et al. 2010). 

Although knowing the “parts list” for the 
human genome is an important step, much 
remains to be understood about how genes 
factor into human health and disease. For 
most diseases, the environment plays as much, 
if not more, of a role as a person’s DNA. Aside 
from some notable, deterministic exceptions 
such as Huntington’s disease, most known 
risk alleles confer fairly low odds ratios unto
themselves (see > Chap. 3), making an individual,
for example, approximately 1.1 times
as likely as the average individual to develop a 
given condition. Even when ratios are as high 
as, say, twofold, it is of dubious actual util- 
ity to know that based on one’s genotype, the 
odds of being diagnosed with Crohn’s disease 
went from 0.5 in 100 to 1 in 100. 
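
The arithmetic behind this example can be made explicit. The sketch below is an illustration rather than part of any cited study: it assumes the twofold figure is an odds ratio and converts a baseline risk into the absolute risk implied for a carrier. The function name `risk_given_odds_ratio` is hypothetical.

```python
def risk_given_odds_ratio(baseline_risk: float, odds_ratio: float) -> float:
    """Convert a baseline absolute risk and an odds ratio into the
    absolute risk implied for carriers of the risk allele."""
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must be strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    # Risk -> odds, scale the odds by the ratio, then odds -> risk.
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    carrier_odds = baseline_odds * odds_ratio
    return carrier_odds / (1.0 + carrier_odds)


if __name__ == "__main__":
    baseline = 0.005  # 0.5 in 100, as in the Crohn's disease example
    for ratio in (1.1, 2.0):
        risk = risk_given_odds_ratio(baseline, ratio)
        print(f"odds ratio {ratio}: {100 * risk:.2f} in 100")
```

For a condition this rare, the odds ratio and the relative risk nearly coincide, so doubling the odds moves the risk from 0.5 in 100 to roughly 1 in 100, and an odds ratio of 1.1 barely moves it at all.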

For certain disease markers, such as those for
Alzheimer’s or BRCA1 and BRCA2, it was,
and largely still is, unknown what impact
unfavorable results might have on a customer’s
mental and emotional well-being. Some studies
have shown that while a person experiences 
negative emotions immediately in the wake 
of learning the bad news, over a time period 
of months there is no significant difference 
in anxiety, depression, or test-related distress 
(Green et al. 2009). In any case, DTC genetics 
companies’ websites must provide the ability 
to view sensitive results while protecting the 
customer from stumbling on these findings 
unintentionally. 23andMe, as an example, has 
spent considerable resources on the design 
of a user-friendly interface through which 
to present an individual’s “health reports,” 
or their individual genotype for markers that 
have been characterized through reliable,
established research methods. Along with a
text explanation, these health reports give a
graphical depiction of a person’s relative risk.

Fig. 26.10 23andMe’s graphical representation of
relative risk from their website before the FDA stepped
in to regulate DTC genetic testing. a Colored figures
represent the number of people on average out of 100
who are likely to develop venous thromboembolism
over the course of a lifetime. Green figures represent the
individual’s personal reported risk; blue figures represent
the average risk for females of European descent.
(Accompanying text has been shortened for clarity.) b
The individual’s relative risk for each of three reported
markers: factor 5, factor 2, and ABO. (Specific values
were displayed on the website when the user hovered the
mouse over the colored bars.) In a later version of the
website, risk for hereditary thrombophilia was based
only on factor 2 and factor 5; ABO was no longer
included. (© 23andMe, Inc. 2007-2012. All rights
reserved; distributed pursuant to a Limited License
from 23andMe)

[Figure text, panel a: for an age range of 0-79 and assuming European ethnicity, 38.7 out of 100 women who share Person X’s genotype will develop venous thromboembolism, compared with an average of 9.7 out of 100. Panel b scale: 2-fold decreased risk, average risk, 2-fold increased risk.]


Figure 26.10 shows such a graphic for an
individual’s risk of venous thromboembolism
from a circa 2012 version of the website, before
the FDA began to regulate DTC genetic testing.
The graphical representation has been
modified in more recent versions of the site
to be less specific, more accurately reflecting
the underlying uncertainty. For sensitive
results such as BRCA1 and BRCA2, and markers
for Alzheimer’s and Parkinson’s disease, the
information is initially “locked.” Users must
explicitly click through an additional screen
to confirm that they truly want to know genotype
and relative risk for that trait.


= Rulings and Regulations 
From a regulatory perspective, it was not ini- 
tially clear whether these services qualify as 
medical devices as defined by the FDA, and 
are therefore subject to regulation by the 
Agency. In fall of 2013, the FDA sent a letter 
to 23andMe ordering it to “immediately dis- 
continue marketing the PGS [Saliva Collection 
Kit and Personal Genome Service] until such 
time as it receives FDA marketing authoriza- 
tion for the device.” (Annas and Elias 2014) A 
few weeks later, 23andMe announced that it 
was complying with the FDA’s demands and 
suspending their health-related genetic tests. 
23andMe continued to offer (and market) their 
ancestry-related testing, and less than 2 years 
later announced FDA approval for a carrier 
screening test for Bloom syndrome, with word- 
ing that left the option open for additional car- 
rier screening tests without premarket review 
(Annas and Elias 2014, 2015). In October 2017,
the company announced that it would
offer genetic risk reports for ten medical conditions
including Parkinson’s disease and late-onset
Alzheimer’s (Check Hayden 2017). The DTC
testing landscape is still rapidly evolving, and
will continue to do so for the foreseeable future.

Logistically, a prospective customer typically
registers on the DTC company’s website
and a sample collection kit is sent in the
mail, though 23andMe’s kit is also available
for purchase through Amazon.com and
even on the shelves of local brick-and-mortar
pharmacies. In May 2010, Pathway Genomics
and Walgreens announced a plan to sell these
kits in Walgreens drugstores, but the FDA
sent a letter to Pathway Genomics indicating
their belief that the company’s genomic report
qualified as a medical device (Bradley et al.
2011) and as such required FDA approval.


Plans to sell the saliva collection container 
in brick-and-mortar stores were put on hold 
until the regulatory issues could be resolved, 
or at least addressed. 

Another high-profile legal issue is the case
of Assoc. for Molecular Pathology v. Myriad
Genetics, Inc., et al., regarding Myriad’s patent
on the BRCA1 and BRCA2 genes, which
were included in 23andMe’s offerings (see footnote 17), and
more generally whether genes should be patentable
at all. In 2011, a federal appeals court
overturned a lower court’s ruling in the case and
found that genes can, in fact, be patented
(Pollack 2011). This ruling was upheld in a
court of appeals in 2012; however, in 2013,
the Supreme Court partially overturned that
ruling and found that isolated genomic DNA
(gDNA) is not patent-eligible, but cDNA is.
Disappointingly, this ruling did not do much
to reduce ambiguity around these issues.


26.10 Challenges and Future Directions


TBI as a discipline continues to evolve in an 
exciting and dynamic phase. Though chal- 
lenges remain, the field is poised to become 
an increasingly crucial element of biomedi- 
cal research and clinical practice in the era of 
precision medicine (Tenenbaum et al. 2016). 
We conclude this chapter with a discussion of 
future directions and key challenges for this 
burgeoning discipline. 


26.10.1 Expansion of Data Types 


Genomic data are already being used to guide
clinical care. Genomic data themselves are
relatively straightforward in that an individual’s
genome is relatively static, and through
the intrinsic physical properties of nucleic
acids and the transcriptional process,
DNA and RNA are relatively easy to capture,
observe, and quantify. Proteins and metabolites
are more challenging in this regard. Proteomic 


17 » https://www.23andme.com/health/BRCA-Cancer/ (Accessed 12/19/12).

and metabolomic methodologies have primarily
centered around isotopic labeling, but more
recent approaches enable unbiased label-free 
identification and even quantification (Du et al. 
2008; Wishart 2011). Identification of metabo- 
lites associated with disease has already enabled 
enzymatic drug targeting in diabetes, obesity, 
cardiovascular disease, and cancer, among 
other conditions (Chan and Ginsburg 2011). 
We expect that as proteomics and metabolo- 
mics standards and technologies continue to 
mature, they will play an increasingly signifi- 
cant role in translational research and practice. 

The role of epigenetics needs to be under- 
stood more fully. It is clear that the environ- 
ment can induce changes in the packaging 
and labeling of DNA. These environmental 
cues can include lifetime exposures to toxins, 
viruses, bacteria and nutritional compounds 
as well as drug exposures. Understanding 
the ways in which these epigenetic modifica- 
tions affect phenotype is in its infancy, and 
so we must understand how to measure these 
effects, and then compute with them. The 
human microbiome is also an active area for 
translational research. Though one might 
expect associations between the gut microbi- 
ome and various gastro-intestinal conditions, 
surprising correlations and even causal rela- 
tionships have been discovered with cancer, 
neurological, and even psychiatric disorders 
(Mayer et al. 2014; Zitvogel et al. 2015; Zheng 
et al. 2016; Clapp et al. 2017). 

Finally, as standards are developed and 
clinicians and researchers see the value 
to be gained from structured data collec- 
tion through studies such as The National 
Children’s Study (Landrigan et al. 2006), 
structured environmental data is likely to be 
increasingly available to complete the picture 
for gene-environment interactions (Schwartz 
and Collins 2007). 


26.10.2 Changes for Medical Training, Practice, and Support


Clinicians will need enhanced training in 
genetics and other areas described above. 
Curricular components relating to genetics,
pharmacogenomics, statistics, and data
standards will be increasingly important. 
Expertise in these fields will also need to be 
supplemented by an expanded workforce of 
genetic counselors. Increasingly, therapies will 
require accompanying diagnostic tests. As the 
opportunities for use of genomic data in clini- 
cal care continue to advance, it will become 
increasingly important to incorporate this 
information into both the electronic health 
record and into machine readable clinical care 
guidelines for clinical decision support. This 
in turn will require new standards to capture 
genomic findings, and new decision support 
tools to enable clinicians to incorporate this 
ever-increasing amount of information into 
their therapeutic decision making processes 
(Hoffman 2007). A number of standards 
exist in this space; the key will be in educat- 
ing prospective users and enforcing adoption. 
This applies to the full translational spectrum,
from annotation of experimentally generated
datasets to a common format for the exchange
of clinically relevant omic information
between EHR systems. Most doctors have
only a basic level of training in genetics, and 
are ill-equipped to answer in-depth questions 
from patients who bring to an appointment 
printouts of their results from these services 
(Frueh and Gurwitz 2004). More knowledge 
is required, in addition to training and tools, 
before family care providers, internists, and 
even specialists, are prepared to incorporate 
genomic information into their clinical prac- 
tice (Ormond et al. 2010; Chan and Ginsburg 
2011). 
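
One way to picture the decision support described above is a structured genomic finding checked against a machine-readable rule. The following is a minimal sketch, not an existing standard or EHR interface: the `GenomicFinding` and `Rule` classes, their field names, and the drug class label are all hypothetical, and the single rule simply encodes the Factor V Leiden example given earlier in this chapter.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GenomicFinding:
    """A structured genomic observation (hypothetical schema)."""
    gene: str
    variant: str


@dataclass(frozen=True)
class Rule:
    """A machine-readable guideline entry (hypothetical schema)."""
    gene: str
    variant: str
    drug_class: str
    message: str


# Illustrative rule only, taken from the Factor V Leiden example in the text.
RULES = [
    Rule(
        gene="F5",
        variant="Factor V Leiden",
        drug_class="estrogen-based contraceptive",
        message="Factor V Leiden is a contraindication for "
                "estrogen-based birth control pills.",
    ),
]


def check_order(findings: List[GenomicFinding], drug_class: str) -> List[str]:
    """Return alert messages for a proposed order, given stored findings."""
    alerts = []
    for rule in RULES:
        if rule.drug_class != drug_class:
            continue  # rule does not apply to this kind of order
        for finding in findings:
            if (finding.gene, finding.variant) == (rule.gene, rule.variant):
                alerts.append(rule.message)
    return alerts


patient_findings = [GenomicFinding(gene="F5", variant="Factor V Leiden")]
print(check_order(patient_findings, "estrogen-based contraceptive"))
```

Keeping the observation (`GenomicFinding`) separate from its interpretation (`Rule`) is a deliberate choice in this sketch: rules can then be revised as knowledge changes without touching the stored genomic data.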


26.11 Conclusions 


As the cost of data generation and storage 
continues to decrease, and the methods for 
data analysis and interpretation continue to 
advance, TBI is poised to be a key enabler of 
precision medicine (Tenenbaum et al. 2016). 
One can imagine a day when every newborn 
has his or her genome sequenced and this 
information becomes a part of the medical 
record, much as blood type is recorded today. 
The biggest challenges to achieving this vision 
are likely not to be technical ones, but rather 
ethical, legal, and economic in nature (Schadt
2012). Society must strike a balance between 
privacy protection and facilitating progress 
in biomedical research. Legal issues will need 
to be worked out around direct-to-consumer 
genetic testing, gene patenting, preventing 
genetic discrimination, and many other such 
issues. Return on investment will need to be 
established through economic analysis com- 
bined with comparative effectiveness research 
(see > Chaps. 11 and 26). Ultimately, some- 
one will have to pay for these accompanying 
diagnostic tests. Major change is unlikely until
an organization like the Centers for Medicare
& Medicaid Services (CMS) changes its
policies. For example, CMS coverage for the
genetic test to guide warfarin dosing is cur- 
rently conditional upon it being ordered as 
part of a research protocol (Meckley and 
Neumann 2010). TBI will continue to play a 
key role in transforming these types of scien- 
tific discoveries into improvements in human 
health. 
personal genome interpretation. Briefings in
Bioinformatics, 13(4), 495-512. The authors 
of this review summarize key databases and 
bioinformatics tools that have been developed 
in recent years to aid in the interpretation of 
genomic variance. Resources covered include 
databases of variants, genotype/phenotype 
annotation databases, tools for gene prioritization
and tools for interpretation of single
nucleotide variants.

Davies, K. (2010). The $1000 genome: The
revolution in DNA sequencing and the new
era of personalized medicine. New York:
Free Press. This text, written by the editor
of BioIT World magazine, documents the
characters, events, and issues in the race to
achieve the $1000 Genome.

Hastie, T., Tibshirani, R., & Friedman, J. H. 
(2009). The elements of statistical learning : 
Data mining, inference, and prediction. 
New York: Springer. A useful primer on the 
statistical concepts underlying machine learn- 
ing approaches to biomarker discovery. 

Kann, M. G, & Lewitter, F. (Eds.). (2012). 
Translational bioinformatics. PLOS 
Computational Biology Collections eBook. 
This eBook represents both the first “text- 
book” devoted entirely to TBI, and the first 
online, open access textbook from PLOS. In 
addition to many of the topics covered in this 
chapter, the collection includes chapters on 
related topics such as cancer genome analysis, 
microbiome analysis, structural variation, and
protein interactions in disease. 

Masys, D. R., Jarvik, G. P., Abernethy, N. F., 
Anderson, N. R., Papanicolaou, G. J., Paltoo, 
D. N., Hoffman, M. A., Kohane, I. S., & Levy, 
H. P. (2012). Technical desiderata for the inte- 
gration of genomic data into electronic health 
records. Journal of Biomedical Informatics, 
45(3), 419-422. The authors describe the char- 
acteristics of biomolecular data that differen- 
tiate it from other EHR data, enumerate a set 
of technical desiderata for management of 
biomolecular data in clinical settings (e.g., 
separation of molecular data observations 
from clinical interpretation, lossless data com- 
pression, support for readability by both 
humans and machines), and propose a techni- 
cal approach to its representation. 

Sarkar, I. N., & Payne, P. R. O. (2011, December). 
The joint summits on translational science: 
crossing the translational chasm. Journal of 
Biomedical Informatics, 44(Suppl 1), S1-S2. 
This editorial discusses the spectrum of bio- 
medical informatics, from biology to medicine, 
in the context of the NIH Roadmap and the 
Clinical and Translational Science Award pro- 
gram. It gives the history of the AMIA Joint 
Summits on Translational Science, and explains
the emergence of TBI and CRI as disciplines
unto themselves, intended to address the same
issues that motivated those initiatives, namely
translating scientific discoveries into meaningful
changes in health care delivery.

Sarkar, I. N., Butte, A. J., Lussier, Y. A., Tarczy- 
Hornoch, P., & Ohno-Machado, L. (2011). 
Translational bioinformatics: Linking knowl- 
edge across biological and clinical realms. 
Journal of the American Medical Informatics 
Association, 18, 354-357. The authors pres- 
ent the field of TBI in the context of suc- 
cesses from bioinformatics and health 
informatics. 


Questions for Discussion

1. Should DTC genetic testing for 
health-related traits be regulated by 
the FDA? 

2. Should genes be patentable? 

3. Are there sufficient legal protections in 
place to prevent discrimination based on 
genomic information? If not, what regu- 
lations are needed? 

4. Are we headed toward full disclosure of 
genomic information? 

5. What are some reasons a researcher 
might not want to share research data? 
Should they be required to share? If so, 
under what circumstances (e.g., 
6 months after first publication)? 

6. For novel analyses applied to complex, 
high-dimensional datasets, should there 
be new guidelines in place to prevent 
reporting erroneous results through user 
error or data fraud? Why or why not? 

7. What are the major barriers to 
incorporating the benefits of 
personalized medicine fully into 
standard practice? 
Learning Objectives

After reading this chapter, you should know
the answers to these questions:

= What is clinical research and what fac- 
tors influence the design of clinical 
studies? 

= What are the types of information 
needs inherent to clinical research and 
how can those information needs be 
stratified by research project phase or 
activity? 

= What types of information systems can 
be used to address or satisfy the infor- 
mation needs of clinical research teams? 

= How can multi-purpose platforms, such
as electronic health record (EHR) systems
(see > Chap. 14), be leveraged to
enable clinical research?

= What is the role of a clinical trial or 
research management system (CTMS/ 
CRMS) for supporting and enabling 
clinical research, and what types of 
functionality are common to such sys- 
tems? 

= What is the role of standards in sup- 
porting interoperability across and 
between actors and entities involved in 
clinical research? 

= What are current and future clinical 
research informatics (CRI) research 
and development questions and how 
will they optimize or otherwise alter the 
conduct of clinical research? 


27.1 Introduction 


The conduct of clinical research is fundamen- 
tal to the generation of evidence that can in 
turn facilitate improvements in human health. 
However, the design, execution, and analysis 
of clinical research is an inherently complex 
information- and resource-intensive endeavor, 
involving a broad variety of stakeholders, 
workflows, processes, data types, and compu- 
tational resources. At the intersection point 
between biomedical informatics and clinical 
research, a robust and growing sub-discipline 
of informatics has emerged, which for the
remainder of this chapter we will refer to as
Clinical Research Informatics (CRI) (P. J.
Embi and Payne 2009). Numerous reports
have shown that innovations and best prac- 
tices generated by the CRI community have 
contributed to improvements in the quality, 
efficiency, and expediency of clinical research 
(P. Embi 2013; P. J. Embi and Payne 2009, 
2013; Johnson et al. 2016; Payne et al. 2005; 
Weng and Kahn 2016). Such benefits can 
be situated in a full spectrum of contexts 
that extends from the activities of individual 
clinical investigators to the operations of 
multi-center research consortia that involve 
geographically and temporally distributed 
participants. 

Given the recognition of CRI as a distinct 
and increasingly important sub-discipline of 
biomedical informatics, it is imperative that a 
common basis for defining and understand- 
ing CRI science and practice be established. 
Such a foundation must, by necessity, include 
explicit linkages to the major challenges and 
opportunities associated with the planning,
conduct, and evaluation of clinical research 
programs. To provide a common frame of ref- 
erence for the remainder of this chapter, we 
will use the National Cancer Institute’s (NCI) 
definition of clinical research, as follows: 


» Research in which people, or data or samples 
of tissue from people, are studied to under- 
stand health and disease. Clinical research 
helps find new and better ways to detect, 
diagnose, treat, and prevent disease. Types of 
clinical research include clinical trials, which 
test new treatments for a disease, and natural 
history studies, which collect health information
to understand how a disease develops and
progresses over time. (See footnote 1.)


1 > https://www.cancer.gov/publications/dictionaries/cancer-terms/def/clinical-research (Accessed January 1, 2019).

A lack of sufficient information technology
(IT) and biomedical informatics tools and
platforms, as well as of relevant expertise and
methodological frameworks, accounts for
significant impediments to the rapid, effective,
and resource-efficient conduct of clinical
research projects (Payne et al. 2010; Payne
et al. 2005; Payne et al. 2013). Compounding
these challenges is the rapid pace of advance- 
ment in biomedical research and the resulting 
need for advances in diagnostics and thera- 
peutics that can be validated and dissemi- 
nated quickly and cost effectively (Brightling 
2017; Saad et al. 2017; Tenenbaum et al. 
2016; Weng and Kahn 2016). The conflu- 
ence of these factors has led to a number of 
major challenges and opportunities related 
to current and future CRI research and practice.
For example, making clinical phenotype
data available for secondary use in support
of clinical research has become a competitive
requirement for research enterprises of all
sizes. Similarly, the increasing complexity of
clinical research programs and the difficulty
of recruiting sufficiently large patient cohorts,
when combined with the regulatory overhead
of conducting studies in large academic
institutions, have led to an increase in the
conduct of clinical studies in community
practice settings. Such
community-based research paradigms intro- 
duce new levels of complexity to the technical 
and policy aspects of data capture, manage- 
ment, and sharing plans. This rapid evolution 
and the realities of an increasingly expansive 
clinical research landscape have led investiga- 
tors and other decision makers in the health 
care and life sciences communities to call for 
increased investments in and delivery of inno- 
vative solutions to such information needs 
(P. J. Embi and Payne 2014; Pencina and Peter- 
son 2016; R. Richesson et al. 2014; Saad et al. 
2017; Tenenbaum et al. 2016). At the highest 
level, clinical research is a domain that has 
substantial information management needs, 
representing both a challenge and opportu- 
nity for biomedical informatics researchers 
and practitioners. Simultaneously, clinical 
research is an area of scientific endeavor that 
is at the forefront of attention for the govern- 
mental, academic, and private sectors, all of 
which have significant scientific and financial 
interests in the conduct and outcomes of such 
efforts. Collectively, many have
called, and continue to call, for the development
and validation of innovative biomedical
informatics methods and tools specifically
designed to address clinical research information
needs (P. J. Embi and Payne 2014; Pencina
and Peterson 2016; R. Richesson et al. 2014;
Saad et al. 2017; Tenenbaum et al. 2016). It
is this overall context that has motivated an 
increasing focus on both basic and applied 
Clinical Research Informatics (CRI), which 
can be defined broadly as follows (P. J. Embi
and Payne 2009):

» Clinical Research Informatics (CRI) is the
sub-domain of biomedical informatics concerned
with the development, evaluation and
application of informatics theory, methods and
systems to improve the design and conduct of
clinical research and to disseminate the knowledge
gained.

Examples of focus areas in which CRI
researchers and practitioners apply biomedical
informatics theories and methods can
include the following:
= Evaluating and modeling of clinical research workflow
= Social and behavioral studies involving clinical research professionals and participants
= Designing optimal human-computer interaction models for clinical research applications
= Improving information capture and data flow in clinical research
= Leveraging data collected in EHRs
= Optimizing site selection, investigator and patient recruitment
= Improving reporting to regulatory agencies
= Enhancing clinical and research data mining, integration, and analysis
= Phenomic characterization of patients for cohort discovery and analytical purposes
= Integrating research findings into individual and population level health care
= Defining and promoting ethical standards in CRI practice
= Educating researchers, informaticians, and organizational leaders about CRI
= Driving public policy around clinical and translational research informatics


Building upon the preceding definitions and 
state of knowledge and practice relevant to 
CRI, in the remainder of this chapter we will 
provide an overview of the types of activities 
commonly undertaken as part of a variety 
of representative clinical research use cases,
introduce the role of major classes and types
of information system that enable or facilitate
such activities, and conclude with a set of
analyses regarding the future directions of the
field. The overall objective of this chapter is to 
provide the reader with the ability to evaluate 
critically the current and anticipated roles of 
biomedical informatics knowledge and prac- 
tice as applied to clinical research. 


27.2 A Primer on Clinical Research 


In the following section, we will briefly intro- 
duce the characteristics of the modern clinical 
research environment, including the design 
and execution of an exemplary class of clini- 
cal studies that are known as randomized 
controlled trials (RCTs). This primer on clini- 
cal research will serve as the context for the 
remainder of the chapter, in which we will 
introduce major information needs and their 
relationships to a variety of basic and applied 
biomedical informatics practice areas and IT 
tools/platforms. 


27.2.1 The Modern Clinical Research Environment


Clinical research comes in many forms and 
may include a variety of specific activities. 
All forms, however, share a common set of 
requirements related to the comprehensive 
management of study data — specifically, the 
collection of data on human research sub- 
jects — and analysis of those data. As clinical 
research designs span the spectrum from pas- 
sive or observational studies to interventional 
trials, the acuity of activities and associated 
data-management needs increase commensu- 
rately. For example, in a retrospective study 
subjects are selected based on the presence 
or absence of a particular condition and ret- 
rospective or pre-existing data are obtained 
from historical records (such as EHRs, disease 
registries, and research-specific databases), 
whereas in natural history studies, subjects are 
recruited and followed in a prospective manner,
with additional collection of data performed
solely for the purposes of research, rather 
than the normal process of patient care. 

Further along the spectrum are clinical tri- 
als, in which research subjects participate in 
some additional activity, or intervention, that 
is intended either to induce a change in the 
subject or to prevent the occurrence of some 
change that would otherwise be expected. The 
intervention might be as simple as administering
a substance already found in the human
body (such as a vitamin) and measuring a
change in that substance (such as the amount
of the vitamin found in the blood or urine).
More complex studies involve interventions 
that have an impact on human disease, such 
as the administration of a preventive vaccine, 
the administration of a curative drug, or a 
surgical procedure to remove, insert, repair or 
replace a structure or device in the subject’s 
body. As with passive studies, data collec- 
tion is critical to the proper performance of 
research and may become intense, with the 
collection of clinical information occurring 
more frequently and involving data describing 
the intervention materials (such as the purity 
of a drug or the performance of a device) in 
addition to data related to the human subject 
and their response to the intervention under 
study. 

Although not an intrinsic requirement of 
clinical research, the inclusion of comparison 
groups is usually considered an important 
part of rigorous and reproducible clinical 
research method. In some cases, historical 
controls can be used for comparison with a 
group of subjects under study. For example, if 
a disease is known to have a particular fatal- 
ity rate, subjects could be given a potentially 
life-saving treatment, and their fatality rate 
can be measured and compared to past expe- 
rience. In quasi-experiments, comparison sub- 
ject groups can also be selected based on some 
known characteristic that distinguishes the 
two groups, such as gender or race, or their 
willingness to undergo a particular interven- 
tion. 

A more rigorous method of establishing 
comparison groups is through randomization, 
in which prospective subjects are assigned to 
different groups (often referred to as study
arms) and undergo different interventions as 
a result of the arm to which they are assigned. 
Typically, randomization might account for 
observable characteristics (such as gender, 
ethnicity, and race) to create balanced groups, 
especially where the characteristics are known 
to have some influence on the effect of the 
intended intervention. Randomization also 
serves to distribute subjects based on unob- 
served characteristics, for example, unknown 
genetic traits, in order to reduce differences in 
the groups that might bias the results of the 
study. In a randomized controlled trial (RCT),
one subject group will often receive a control 
intervention (for example, the usual treatment 
or treatments for a condition, or even no treat- 
ment) while one or more other groups receive 
an experimental intervention. 
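
The balancing idea described above can be illustrated with a short sketch. It assumes one common technique, stratified permuted-block randomization, which the chapter does not name specifically; the class name, arm labels, block size, and strata below are all hypothetical choices made for illustration. Within each stratum (for example, a gender category), arms are drawn from shuffled blocks so that group sizes stay balanced as enrollment proceeds.

```python
import random
from collections import defaultdict


def make_block(arms, block_size, rng):
    """Return one shuffled block in which every arm appears equally often."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    block = list(arms) * (block_size // len(arms))
    rng.shuffle(block)
    return block


class StratifiedRandomizer:
    """Assign participants to arms using permuted blocks within each stratum.

    Illustrative only: a production system would also need allocation
    concealment, audit trails, and persistent storage.
    """

    def __init__(self, arms=("control", "experimental"), block_size=4, seed=None):
        self.arms = tuple(arms)
        self.block_size = block_size
        self.rng = random.Random(seed)
        self.pending = defaultdict(list)  # stratum -> assignments not yet used

    def assign(self, stratum):
        """Return the next arm for a participant belonging to `stratum`."""
        if not self.pending[stratum]:
            # Start a new block once the previous one is exhausted.
            self.pending[stratum] = make_block(self.arms, self.block_size, self.rng)
        return self.pending[stratum].pop()


# Example use with hypothetical strata.
randomizer = StratifiedRandomizer(seed=2024)
assignments = [(s, randomizer.assign(s)) for s in ["female", "male", "female", "female"]]
```

Because each block contains every arm equally often, any stratum that has completed a whole number of blocks is exactly balanced across arms, while the order within a block remains unpredictable.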

Although intended to reduce bias, the ran- 
domization process itself must be carefully 
executed such that it does not introduce new 
sources of bias. For example, randomization 
can include blinding, in which the subject, 
the investigator, or both (as in double-blinded 
studies), are kept unaware of group assign- 
ment until after all assessments have been 
made. This might include the use of a placebo
for a group receiving no treatment, in order to
avoid the possibility that subjective improvement
in a prior condition, or the occurrence
of random events (such as normally occurring
illnesses), is incorrectly ascribed to the
intervention. This also may prevent subjects from
deciding not to participate after randomiza- 
tion in a way that might unbalance the study 
groups (for example, if subjects prefer not to 
participate if they know they are not getting 
the experimental intervention) or even bias the 
assignments (for example, people less prone to 
take care of themselves might drop out if they 
find they are assigned to an intervention that 
requires a great deal of effort on their part). 

The gold standard of clinical studies is 
the double-blinded, randomized, placebo- 
controlled trial (Hulley et al. 2013). However, 
such studies may not always be practical. 
For example, the use of a placebo when an 
effective therapy is known may be unethical, 
the blinding of a surgical repair may not be 
practical, or the condition under study may
be so rare that only historical controls are 
available. 

While different study designs have unique 
and differentiated data, information, and 
knowledge management needs, they usually 
involve some form of systematic data man- 
agement, as noted previously. Such data man- 
agement activities usually include initial data 
collection, aggregation, analysis, and results 
dissemination, to name a few of many such 
tasks. As shown in Fig. 27.1, different study
methods introduce new issues as successively 
more complex interventions and study design 
patterns are employed. For the remainder of 
this chapter, we will focus our discussion on 
RCTs as our prototypical study design, as they 
tend to involve most if not all of the informat- 
ics issues and information needs encountered 
in other study designs. Further information on 
the design characteristics, data management 
needs, and associated best practices related to 
various types of clinical trials can be found in
a number of excellent works concerning this
subject (Bhatt and Mehta 2016; Hulley et al.
2013; Prokscha 2011), and further discussion
is beyond the scope of this chapter.

Fig. 27.1 Overview of clinical study phases and
associated information and data management needs.
Underlying such design patterns is a common thread
of systematic data management, leveraging resources
such as health records, research-specific laboratory data,
as well as broader knowledge collections such as the
published biomedical literature


Phased Randomized Controlled Trials

Most clinical studies begin with the identifica- 
tion of a set of driving or motivating hypothe- 
ses. The research questions that serve to define 
such hypotheses might be raised through an 
analysis of gaps in knowledge as found in the 
published biomedical literature or be informed 
by the results of a previous study. It is impor- 
tant to note that clinical research endeavors 
exist on a spectrum of scientific activity that is 
often referred to as clinical and translational 
research. A particular type of translational 
research, often referred to as T1-type transla- 
tion, is a process by which basic science dis- 
coveries are used to design novel therapies 
(Sung et al. 2003). Such discoveries are then
evaluated during clinical research studies,
first in pre-clinical and subsequently in clinical
trial phases (Payne et al. 2005). A second type of
translational research, often referred to as T2
translation, involves methods such as those
borrowed from implementation science and
clinical informatics, and focuses on translating
the findings of such clinical research studies
into common practice (Sung et al. 2003).
A common colloquialism for this process of 
translating a novel basic science discovery 
through clinical research and into clinical 
practice is “bench to bedside” science. 
Individual and distinct RCTs are often 
conducted for different purposes, most often 
motivated by the need to fill fundamental 
knowledge gaps about a particular interven- 
tion under study. By combining such knowl- 
edge gaps with the underlying biomedical 
mechanisms of physiology and disease, a moti- 
vating hypothesis or collections of hypotheses 
are established as to why a given intervention 
might lead to a given result or finding. Such 
hypotheses result in a natural sequence of 
research questions that can be asked relative 
to a novel intervention. Usually, an individ- 
ual research study is designed to address one 
specific research question and hypothesis. In 
the case of the development and evaluation 
of a new therapeutic intervention, like a new
drug, an individual research study is designed
to address each phase in a line of research
inquiry that will determine the efficacy and
effectiveness of such a therapy (Spilker 1984).
In most cases, this adheres to the following
model:

= Phase I: Investigators evaluate the novel 
therapy in a small group of participants in 
order to assess overall safety. This safety 
assessment includes dosing levels in the 
case of non-interventional therapeutic tri- 
als, and potential side effects or adverse 
effects of the therapy. Often, Phase I trials 
of non-interventional therapies involve the 
use of normal volunteers who do not have 
the disease state targeted by the novel 
therapy. 

= Phase II: Investigators evaluate the novel 
therapy in a larger group of participants in 
order to assess the efficacy of the treat- 
ment in the targeted disease state. During 
this phase, assessment of overall safety is 
continued. 

= Phase III: Investigators evaluate the novel 
therapy in an even larger group of partici- 
pants and compare its performance to a 
reference standard which is usually the 
current standard of care for the targeted 
disease state. This phase typically employs 
an RCT design, and often a multi-center 
RCT given the numbers and variation of
subjects that must be recruited to test the 
hypothesis. In general, this is the final 
study phase to be performed before seek- 
ing regulatory approval for the novel ther- 
apy and broader use in standard-of-care 
environments. 

= Phase IV: Investigators study the perfor- 
mance and safety of the novel therapy 
after it has been approved and marketed. 
This type of study is performed in order to 
detect long-term outcomes and effects of 
the therapy. It is often called “post-market 
surveillance” and is, in fact, not an RCT at 
all, but a less formal, observational study. 


The phase of an RCT has implications for 
the kinds of questions being asked and the 
kinds of processes carried out to answer them. 
From an informatics perspective, however, the 
tasks are usually very similar. At a high level,
the conduct of a Phase I, II or III clinical trial 
can be thought of in an operational sense as 
consisting of three major stages: preparatory, 
active, and dissemination. During these three 
stages, a specific temporal series of processes is 
executed. First, during the preparatory phase, 
a protocol document is generated as part of 
the project development process. The proto- 
col document usually contains background 
information, scientific goals, aims, hypotheses 
and research questions to be addressed by 
the trial. In addition, the protocol describes 
policies, procedures, and data collection or 
analysis requirements. A critical aspect of the 
protocol document is the definition of a pro- 
tocol schema, which defines at a highly granu- 
lar level the temporal sequence of tasks and 
events required to both deliver the interven- 
tion under study and to ensure that data are 
collected and managed in a systematic man- 
ner commensurate with the study hypotheses 
and aims. 

Once a protocol is deemed ready for
execution, the feasibility of the study design
(e.g., addressing questions such as “are there
enough participants available in the targeted
population to satisfy the study design defined
in the protocol document?”) is assessed
quantitatively (e.g., using historical data),
heuristically, or both. Throughout the preparatory
phase, a concurrent process of seeking regula- 
tory approval from local and national bodies 
(e.g., local Institutional Review Boards, the 
Food and Drug Administration) occurs. Once 
a protocol plan is complete, deemed feasible, 
and regulatory approval has been received, 
potential participants are recruited and 
screened to determine if they meet the inclu- 
sion and exclusion criteria for the study (e.g., 
specific demographic and/or clinical param- 
eters required for subjects to be eligible for the 
study). Once a potential participant has been 
deemed eligible for the study, they are pro- 
vided with an informed consent document, 
which must be signed prior to proceeding 
with the enrollment process. Enrollment in the 
context of clinical trials means officially regis- 
tering as a study subject, and the subsequent 
assignment of a study-specific identifier. Once 
a person agrees to become a participant, they 
are enrolled, and in the case of studies with
multiple study groups or arms, randomized 
into one of those arms. 
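
The screening step described above, together with the quantitative form of feasibility assessment, can be sketched as follows. This is a minimal illustration, not a description of any particular system: the `Candidate` record, the criteria, and the clinical terms are hypothetical, and real eligibility criteria are typically far more complex and drawn from coded EHR or registry data.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Hypothetical, simplified view of a potential participant."""
    candidate_id: str
    age: int
    diagnoses: set = field(default_factory=set)
    medications: set = field(default_factory=set)


# Hypothetical criteria for an illustrative study: (label, predicate) pairs.
INCLUSION = [
    ("age 18-75", lambda c: 18 <= c.age <= 75),
    ("has type 2 diabetes", lambda c: "type 2 diabetes" in c.diagnoses),
]
EXCLUSION = [
    ("on insulin", lambda c: "insulin" in c.medications),
]


def screen(candidate):
    """Return (eligible, reasons); reasons explain any failed criteria."""
    reasons = [f"fails inclusion: {label}"
               for label, test in INCLUSION if not test(candidate)]
    reasons += [f"meets exclusion: {label}"
                for label, test in EXCLUSION if test(candidate)]
    return (not reasons, reasons)


def feasibility_count(candidates):
    """Count how many candidates in a historical data set would be eligible."""
    return sum(1 for c in candidates if screen(c)[0])
```

Applying `feasibility_count` to a de-identified historical cohort gives a rough estimate of how many eligible participants a site might expect, while `screen` applied to an individual returns the specific criteria that prevent eligibility.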

The preceding activities lead to the initia- 
tion of the next step in the research process, 
which we refer to as the active phase. During 
the active phase, the participant receives the 
therapeutic intervention indicated by their 
study arm and is actively monitored to enable 
the collection of study-specific data. This ther- 
apeutic intervention and active monitoring 
process is often iterative, involving multiple 
cycles of interventions and active monitoring. 
Follow-up activities begin once a participant 
has completed the interventional stage of a 
study. During this stage, subjects are con- 
tacted on a specified temporal basis in order 
to collect additional data of interest, such as 
long-term treatment effects, disease status or 
survival status. 

Finally, during the dissemination phase, 
the results of the study are evaluated and for- 
malized in publications or other knowledge 
dissemination media, for translation into the 
next phase of an RCT or into clinical practice.
In some cases, such as in adaptive study
designs (Bhatt and Mehta 2016), this dissemi- 
nation phase feeds back into the planning and 
active phases. Such feedback cycles enable 
rapid revision of a study design, iterative par- 
ticipant enrollment, and dynamic data collec- 
tion in support of such revised hypotheses and 
designs. Of note, these types of adaptive trial 
designs are particularly helpful when conduct- 
ing studies of conditions where large numbers 
of patients may not be present for recruitment 
purposes, so that patients are assigned to an 
intervention and/or monitored in a manner 
that maximizes data collection that is most 
likely to demonstrate the safety, efficacy, and 
comparative effectiveness of the diagnostic or 
therapeutic approach being evaluated (Bhatt 
and Mehta 2016). 

The quality of data produced by a clini- 
cal trial is assessed using multi-dimensional 
metrics that account for the design, execu- 
tion, analysis and dissemination of the study 
results. The quality of a clinical trial is also 
judged with respect to the significance or rel- 
evance of the reported study results within a 
clinical context (Hulley et al. 2013; Prokscha 
2011; Spilker 1984). One key metric used to
assess clinical trial quality is validity, which
can be defined both internally and externally. 
Internal validity is defined as the minimiza- 
tion of potential biases during the design and 
execution of the trial, while external validity 
is the ability to generalize study results into 
clinical care. It is important to note in a dis- 
cussion of the role of biomedical informatics 
relative to clinical research that such plat- 
forms, interventions, and methods can play 
a major role in reducing or mitigating such 
sources of bias, thus enhancing the validity 
and generalizability of study results. 


27.2.2 Information Needs and Systems in the Clinical Research Environment


As can be inferred by the preceding intro- 
duction to the definitional aspects of clini- 
cal research, such activities regularly involve 
a variety of data, information, and knowledge
sources, as well as a complicated set of
complementary and overlapping workflows. 
At the highest level, these characteristics 
of the clinical research environment can be 
related to a number of critical information 
needs, as summarized in Table 27.1. This
representation of the information needs 
inherent to clinical research is presented using 
the specific context of a prototypical RCT, but 
the basic types of needs and example solu- 
tions provided can be extended to apply to 
the broader spectrum of research designs and 
patterns introduced earlier. 

Building upon this broad definition of 
the information needs inherent to clinical 
research, in the following sub-sections we: (1) 
review the types of information systems that 
can support the phases that comprise a clinical 
study, (2) explore the functional components 
that make up a clinical trials management sys- 
tem, (3) identify current consortia that share 
clinical and research data, and (4) discuss the 
role of standards in enabling interoperability 
between such information systems. 


27.2.2.1 Information Systems Supporting Clinical Research Programs

It is helpful to conceptualize the conduct of 
clinical research studies as a multiple-stage 
sequential model, as was introduced previ- 
ously and is expanded upon in this section 
(Payne et al. 2005). At each stage in such a 
model, a combination of research-specific 
and general technologies can be employed to 
support or address related information needs 
(Fig. 27.2).

There are numerous examples of general- 
purpose and clinical systems that are able to 
support the conduct of clinical research: 
= Publication (or bibliographic) databases
and information retrieval (IR) tools such
as PubMed, Google Scholar, and OVID
can be used to assist in conducting the
background research necessary for the
preparation of protocol documents.

= Electronic health records (EHRs) can be 
used to collect clinical data on research 
participants in a structured form that can 
reduce redundant data entry. 

= Data warehouses and associated data or 
text mining tools can be used in multiple 
capacities, including: (1) determining if 
participant cohorts who meet the study 
inclusion or exclusion criteria can be 
practically recruited given historical 
trends, and (2) identifying specific partici- 
pants and related data within existing 
databases. 

= Clinical decision-support systems (CDSS) 
can be used to alert providers at the point- 
of-care that an individual may be eligible 
for a clinical trial.


In addition to the preceding general technologies,
a number of research-specific technologies
have been developed:

= Feasibility analysis applications and data 
simulation and visualization tools can 
streamline the pre-clinical research process 
(e.g., disease models) and assist in the 
analysis of complex data sets in order to 
assess the feasibility of a given study 
design. 
Table 27.1 [...] using the design of RCTs

Columns: Information needs | Major sub-components | Description

Information need: Support for research planning and conduct
Major sub-components: Collaborative document and knowledge management;
Data sources and tools for feasibility analyses; Regulatory approval
workflows
Description: Study teams often involve geographically and temporally
distributed participants, who need to engage in iterative protocol
development and approval processes. Such activities by necessity
incorporate document versioning, annotation, and associated metadata
management tasks. Once a protocol has been developed, access to data sets
for the purposes of assessing the feasibility of a given study design is
critical, and often involves the use of de-identified data sets drawn from
a data warehouse or research registry. Finally, the submission, tracking,
and documentation of regulatory approvals often necessitate the
coordination and management of complex, document-oriented workflows and
record keeping tasks.

Information need: Facilitation of data management, access, and integration
Major sub-components: Secondary-use of EHR-derived data for research
purposes; Research project specific data capture, management, and
reporting; Distributed data management (spanning traditional
organizational boundaries); Syntactic and semantic interoperability
Description: The ability to use primary clinical data from EHR or
equivalent platforms to support secondary use in a research program has
the potential to reduce redundancy and potential errors while increasing
data quality. However, using such data in a secondary capacity also
requires that appropriately structured data be captured and codified in
clinical systems, and then be made available to research teams and
research data management systems in a timely and resource efficient
manner. In addition to such secondary use of clinical data, most clinical
studies require the regular capture and management of study-specific data
elements, a task that is usually accomplished via the use of Electronic
Data Capture (EDC) or Clinical Trial Management Systems (CTMS). Finally,
given the propensity to conduct studies that span traditional
organizational boundaries in order to realize economies of scale and/or
access sufficiently large patient populations, it is often necessary to
query, integrate, and manage distributed data sets, and ensure their
syntactic and semantic interoperability. Such a need is usually addressed
through the use of Service Oriented Architectures, Cloud Computing, Data
Warehousing, and Metadata Management technologies.

Information need: Workforce training and support
Major sub-components: Dissemination of study, methodological, and
technical training materials; Support for team collaboration and
knowledge sharing
Description: A central need when conducting clinical studies is the
ability to ensure that individuals involved in the execution of a protocol
share common methods, data management practices, and workflows (thus
reducing potential sources of study bias). Ensuring such shared knowledge
and practices, particularly in distributed or multi-site settings,
requires the use of distance education and team-science tools and
platforms to enable knowledge sharing and distance learning paradigms.

Information need: Management information capture and reporting
Major sub-components: Support for research billing; Operational
instrumentation and reporting; Regulatory monitoring; Data quality
assurance
Description: The business and management aspects of the conduct of
clinical studies are complex, often requiring the disambiguation of
standard-of-care and research specific charges as part of billing
operations, as well as the tracking of key performance and data quality
metrics that may be required to satisfy contractual commitments to the
entities funding such studies. Furthermore, the monitoring of study data
for critical or sentinel events that should or must be reported for
regulatory purposes is both necessary and of extreme importance. All of
the aforementioned activities require the application of a variety of
management information system, business intelligence, and reporting tools,
leveraging a broad variety of enterprise, administrative, and
study-specific data sources.

Information need: Participant recruitment tools and methods
Major sub-components: Cohort discovery; Eligibility determination and
alerting; Participant registration, consent, and enrollment execution and
tracking
Description: The identification of participant cohorts that satisfy key
study design criteria, such as inclusion and exclusion criteria, is
frequently a major barrier to the timely and efficient execution of
clinical studies. A variety of requirements, ranging from the
identification and engagement of such cohorts, to point-of-care alerting
regarding potential study eligibility, to the management of registration,
consent, and enrollment records, are inherent to this information need.
Such requirements are usually satisfied through a multi-modal approach,
leveraging both clinical and research-specific information systems.

Information need: Data standards
Major sub-components: Standards for interoperability between research
systems; Standards for interoperability between research, enterprise
(e.g. EHR), and administrative systems
Description: As has been noted relative to several of the preceding
information needs, there is a frequent and reoccurring requirement for
both syntactic and semantic interoperability between research-specific
information systems, as well as between research-specific and clinical or
administrative systems. Such a need necessitates the design, selection,
and application of a variety of data standards, as well as the ability to
map and harmonize between shared information models to support
interactions between systems using a variety of standards.

Information need: Workflow support
Major sub-components: Integration of tools for combined standard-of-care
and research visits; Data, information, and knowledge transfer between
stakeholders, project phases, activities, and associated information
systems or tools
Description: Much as was the case related to data standards, a closely
aligned information need exists relative to the ability to support complex
workflows between information systems and actors involved in the conduct
of clinical research. Such workflow support requires both computational
and application-level workflow orchestration, as well as the ability to
define and apply reusable data analytic “pipelines.”

Information need: Data, information and knowledge dissemination
Major sub-components: Knowledge management for clinical evidence generated
during trials; Guidelines and CDSS delivery mechanisms; Publication
mechanisms; Data registries
Description: The ultimate objective of clinical research is to generate
and apply new evidence in support of improvements in clinical care and
human health. In order to do so, it is necessary to disseminate the
findings generated during such studies in a variety of formats, including
reusable/actionable knowledge resources, clinical guidelines, decision
support rules, and/or publications and reports. In addition, increasing
emphasis is being placed on the transparency and reproducibility of study
designs, which is often accomplished through the creation of public
registries via which study data sets can be shared and made available to
the broader biomedical community.

Fig. 27.2 Overview of study activities, and related research-specific and general information technologies, as
well as targeted products or outputs associated with the sequential clinical research workflow paradigm
[Figure columns: Research Technologies; General Technologies; Study Activity; Output. Study activities and their outputs, in sequence: Protocol Development (Protocol Document); Participant Recruitment (Participant Cohort); Intervention and/or Data Collection (Raw Data); Monitoring and/or Quality Assurance (Monitored Data); Results Analysis (Information); Reporting (Knowledge).]


= Protocol authoring tools can allow geographically
distributed authors to collaborate
on complex protocol documents.
= Automated screening tools and targeted
alerts can assist in the identification and
registration of research participants.
= Electronic data capture (EDC) and Clinical
Trial and/or Research Management Systems
(CTMS/CRMS) can be used to collect
research-specific data in a structured
form and reduce the need for redundant
and potentially error-prone paper-based
data collection techniques. More detail on
these types of systems is provided in the
following section of this chapter.

= Research-specific decision support systems 
such as participant tracking or calendaring 
tools provide protocol-specific guidelines 
and alerts to researchers, for example 
tracking the status of participants to 
ensure protocol compliance. 


27.2.2.2 Clinical Research Management Systems

One of the most widely used technology plat- 
forms in the clinical research domain is the 
clinical trial or research management system 
(CTMS/CRMS). Such platforms were histor- 
ically referred to as clinical trials management 
systems (CTMS), but the term CRMS is gain- 
ing popularity as such systems are increas- 
ingly used to manage the conduct of studies 
including but not limited to trials. CRMS 
platforms are usually architected as compos- 
ite systems that incorporate a number of task 
and role-specific modules intended to address 
core research-related information needs (P. J. 
Embi and Payne 2009; Johnson et al. 2016; 
Payne et al. 2005). Exemplary instances of 
such modules include the following: 
= Protocol Management components that 
support document management function- 
ality to enable the submission, version 
control, and dissemination of protocol 
related artifacts and associated metadata 
annotations. 
= Participant Screening and Registration 
tools that allow for the application of elec- 
tronic eligibility “check lists” to individual 
patients or cohorts in order to assess study 
eligibility, and when appropriate, record the 
registration and associated “baseline” data 
that are required per the study protocol. 
= Participant Calendaring functionality
allows for the instantiation of general protocol
schemas (e.g., a definition of a protocol’s
temporal series of tasks, events, and
associated data collection tasks) in a
participant-specific manner, accounting for
complex reasoning tasks including the
dynamic recalculation of temporal intervals
between events based on actual completion
dates/times, as well as the
“windowing” of events in which a given
task or event is allowed to fall within a
range of dates rather than a specific, atomic
temporal specification.

Electronic Data Capture (EDC) compo- 
nents allow for the definition, instantia- 
tion, and use of electronic case report 
forms (e.g., forms that define study and 
task/event specific data elements to be col- 
lected in support of a given trial or research 
program). Such electronic case report 
forms (eCRFs) are the basic instrument by 
which the majority of study-specific data 
are collected and are usually populated via 
a combination of: (1) manual data entry 
(including abstraction from source docu- 
mentation such as medical records); (2) the 
importation of secondary use data from 
clinical systems; or (3) a hybrid of the two 
preceding approaches. 

Monitoring tools enable the application of 
logical rules and conditions (e.g., range- 
checking, enforcement of data comple- 
tion, etc.) using a rules engine or equivalent 
technology, in order to ensure the com- 
pleteness and quality of research related 
data. Such tools may also be used to mon- 
itor patient compliance with study sche- 
mas, as reflected in the previously described 
patient calendar functionality. 

• Query and Reporting Tools support the
planned and ad hoc extraction and aggregation
of data sets from multiple eCRFs
or equivalent data capture instruments as
used with the CTMS. These types of tools
are often used by biostatisticians and other
quantitative scientists to perform interim
and final analyses of study results and
outcomes, and to enable higher-order safety
analyses. In addition, such tools may be
employed to comply with a broad variety
of data submission and reporting standards
set by both public- and private-sector
entities.

• Security and Auditing functionality
enables site-, role-, and study-specific access
controls and end-user authentication/authorization
relative to all of the preceding
functionality, as well as the ability to
track and report upon end-user interaction
with and modifications to data contained
in the CRMS. Such functionality is
critical to enabling compliance with a 
broad variety of regulatory and privacy/ 
confidentiality frameworks that apply to 
the use of protected health information 
(PHI) for research purposes. 
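
The participant calendaring behavior described in the list above can be illustrated with a minimal sketch. It assumes a simplified protocol schema in which each event is scheduled a fixed number of days after the previous one and carries a symmetric window; the names `ProtocolEvent`, `ScheduledEvent`, and `instantiate_calendar` are hypothetical and are not taken from any particular CRMS product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional


@dataclass
class ProtocolEvent:
    """One step in a general protocol schema (hypothetical structure)."""
    name: str
    days_after_previous: int  # nominal interval from the prior event
    window_days: int          # allowed deviation (+/-) around the target


@dataclass
class ScheduledEvent:
    """A participant-specific instance of a protocol event."""
    name: str
    target: date
    earliest: date
    latest: date


def instantiate_calendar(
    schema: List[ProtocolEvent],
    start: date,
    actual: Optional[Dict[str, date]] = None,
) -> List[ScheduledEvent]:
    """Build a participant-specific calendar from a protocol schema.

    When an event has an actual completion date, the targets of later
    events are recalculated from that date rather than the planned one.
    """
    actual = actual or {}
    calendar = []
    anchor = start
    for event in schema:
        target = anchor + timedelta(days=event.days_after_previous)
        window = timedelta(days=event.window_days)
        calendar.append(
            ScheduledEvent(event.name, target, target - window, target + window)
        )
        # Anchor the next interval on the actual date if one was recorded.
        anchor = actual.get(event.name, target)
    return calendar


schema = [
    ProtocolEvent("baseline", 0, 0),
    ProtocolEvent("week-4 visit", 28, 3),
    ProtocolEvent("week-8 visit", 28, 3),
]
enrolled = date(2019, 1, 7)
planned = instantiate_calendar(schema, enrolled)
# The week-4 visit took place two days late, still inside its window.
revised = instantiate_calendar(schema, enrolled, {"week-4 visit": date(2019, 2, 6)})
```

In this sketch the week-8 target moves from March 4 to March 6 once the late week-4 visit is recorded, which is the kind of dynamic recalculation the text refers to. A production system would also need to handle events anchored to something other than the immediately preceding event.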


In most CRMS platforms, the aforemen- 
tioned functional modules share one or more 
common research databases or, in the case of
service-oriented architectures (SOA), common
data services. In more advanced platforms,
these common data structures are
populated with research-specific and/or clini- 
cal data from enterprise systems and sources 
(such as EHRs, personal health records, and 
data warehousing platforms) via either API- 
level integration (e.g., data service publication 
and consumption) or an extract, transform, 
and load (ETL) based approach (Raths 2013). 


27.3 Data Sharing Resources 
and Networks for Clinical 
Research 


In the following section, we provide an over- 
view of the various data sharing resources and 
networks that are commonly encountered in 
the clinical research domain. These environments
are used for the design and execution
of observational or pragmatic studies, as
well as for the conduct of retrospective analyses
or de novo re-analysis of data collected via
clinical trials. Further, they also serve as a
basis for disseminating the results and ensur- 
ing the reproducibility and rigor of a broad 
spectrum of clinical studies. 


27.3.1 Publicly Deposited Clinical
Research Metadata and Data
Resources


For those seeking to share data, and to
avail themselves of data shared by others,
the National Center for Biotechnology
Information (NCBI) at the NIH’s National
Library of Medicine has created a public
repository of individual-level data, including
exposure history, signs, symptoms, diagnostic
test results, and genetic data. Called the
Database of Genotypes and Phenotypes (dbGaP),
this project provides stable data sets that
allow multiple researchers to reference the
same samples in their publications of secondary
analyses of the data (Mailman et al.
2007). Additional data from clinical trials,
currently limited to summary results, are also
being made available by the NLM through
the ClinicalTrials.gov resource, which is a
repository of descriptive metadata related to
historical and actively recruiting clinical trials
(Tse et al. 2009).


27.3.2 Clinical and Translational 
Science Award (CTSA) 
Network 


The National Center for Advancing
Translational Sciences (NCATS) has funded,
and continues to fund, a national consortium
of academic health centers (AHCs) that are
engaged in clinical and translational research
under the auspices of the Clinical and
Translational Science Award (CTSA) program
(Zhang and Patel 2006). Each member of this
network, known as a “hub”, is responsible for
creating a professional home that supports
and enables the conduct of clinical and
translational research. Such support includes the
provision of the Biomedical Informatics
infrastructure and expertise needed to facilitate
the capture, storage, management, and analysis
of data resulting from such research efforts.
As such, the CTSA Network provides an
important basis for the conduct of large-scale
programs that involve the sharing of such
data across and between “hubs.” To coordinate
and harmonize data sharing across this
CTSA Network, NCATS has created a number
of centers and sub-networks, including
the Accrual to Clinical Trials Network (ACT,
described in more detail below),2 the Trial
Innovation Network (TIN),3 the Recruitment
Innovation Center (RIC),4 and the Center for
Data 2 Health (CD2H),5 which is charged with
the coordination of network-wide informatics
activities. These centers and sub-networks
within the broader CTSA network engage in a
range of activities, such as: (1) leveraging local
data repositories and federated data query
tools to enable the rapid assessment of feasibility
when designing multi-site clinical trials
(ACT); (2) providing expertise and shared
best practices as they relate to study designs
and regulatory frameworks for such multi-site
clinical trials (TIN); (3) delivering novel tools
and methods to accelerate the recruitment
of participants into those trials (RIC); and
(4) helping the Biomedical Informatics components
at CTSA hubs to collaborate, share
knowledge and tools, and harmonize data
assets (CD2H).


2 http://www.actnetwork.us (Accessed January 27, 2019).


27.3.3 i2b2 and SHRINE


One popular warehouse software platform 
that is frequently used in the context of 
clinical research activities is Informatics for 
Integrating Biology and the Bedside (i2b2), 
developed under a National Center for 
Biomedical Computing grant from NIH to 
Partners HealthCare System and Harvard 
University. i2b2 provides an information 
system framework to allow clinical research- 
ers to use existing clinical data for discovery 
research (Murphy et al. 2010). Many of the 
over 60 institutions receiving CTSA grants 
have adopted i2b2 technologies to support 
research and collaboration. 

A companion to i2b2 is the Shared Health
Research Information Network (SHRINE),
which is a version of i2b2 that can pass data
queries entered by a local user off to other
i2b2 instances to provide patient counts and
demographic information that is summarized
for all remote sites (Weber et al. 2009).
SHRINE networks have been established by
many research institutions to support, among
other functions, the estimation of available
cohort sizes for multi-institutional clinical
trials.


3 https://trialinnovationnetwork.org/ (Accessed January 27, 2019).

4 https://trialinnovationnetwork.org/recruitment-innovation-center/ (Accessed January 27, 2019).

5 https://ctsa.ncats.nih.gov/cd2h/ (Accessed January 27, 2019).


27.3.4 Accrual to Clinical Trials 
(ACT) Network 


Recently, NCATS has funded the Accrual
to Clinical Trials (ACT) network to bring
together CTSA sites into a single SHRINE network.
Its intent is to allow clinical researchers
to query the network in real time and to obtain
aggregate counts of patients who meet clinical
trial inclusion and exclusion criteria from
sites across the United States (Visweswaran
et al. 2018). Currently, 31 CTSA sites are
fully operational in the ACT network, with
an additional 15 being “staged” for integration.
Although the ACT SHRINE network
currently returns only summary statistics for
each site, future plans include the ability to
obtain detailed patient data sets and, working
with researchers at member sites, the ability
to contact individual patients for potential 
recruitment into clinical studies. 
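
The federated count query that SHRINE-style networks such as ACT perform can be sketched in a few lines. This is only an illustration under simplifying assumptions: patient records are plain dictionaries, criteria are Python predicates, and the names `count_matches` and `run_federated_count` are hypothetical. It does not model the actual i2b2/SHRINE query messages, authentication, or any count obfuscation a real network may apply.

```python
from typing import Any, Callable, Dict, List

Patient = Dict[str, Any]
Criterion = Callable[[Patient], bool]


def count_matches(
    patients: List[Patient],
    include: List[Criterion],
    exclude: List[Criterion],
) -> int:
    """Count local patients meeting every inclusion and no exclusion criterion."""
    return sum(
        1
        for p in patients
        if all(c(p) for c in include) and not any(c(p) for c in exclude)
    )


def run_federated_count(
    sites: Dict[str, List[Patient]],
    include: List[Criterion],
    exclude: List[Criterion],
) -> Dict[str, Any]:
    """Apply the same criteria at each site; only aggregate counts are returned."""
    per_site = {
        name: count_matches(patients, include, exclude)
        for name, patients in sites.items()
    }
    return {"per_site": per_site, "total": sum(per_site.values())}


# Toy data standing in for two sites' local repositories.
sites = {
    "site_a": [
        {"age": 54, "dx": {"E11"}, "pregnant": False},
        {"age": 17, "dx": {"E11"}, "pregnant": False},
    ],
    "site_b": [
        {"age": 61, "dx": {"E11", "I10"}, "pregnant": False},
        {"age": 33, "dx": {"E11"}, "pregnant": True},
        {"age": 45, "dx": {"I10"}, "pregnant": False},
    ],
}
include = [lambda p: p["age"] >= 18, lambda p: "E11" in p["dx"]]
exclude = [lambda p: p["pregnant"]]
result = run_federated_count(sites, include, exclude)
```

The design point is that patient-level rows never leave a site; the coordinating node sees only one integer per site, which matches the summary-statistics behavior described above.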


27.3.5 PCORNet 


Another clinical research data sharing consortium
is PCORNet, which is funded by
the Patient-Centered Outcomes Research
Institute (PCORI), a non-profit corporation
established by the Patient Protection
and Affordable Care Act (see Chap. 12) to
support a national clinical research agenda
(Fleurence et al. 2014). PCORNet is made up of
clinical data research networks (CDRNs) that
each have access to EHR data on a million or
more patients, patient-powered research networks
(PPRNs) that are led by patients and
patient advocates with an interest in a particular
disease (common or rare), and Health Plan
Research Networks (HPRNs) that link EHR
data with insurance claims data (Fleurence
et al. 2014). There are currently 13 CDRNs,
20 PPRNs, and 2 HPRNs nationwide.7 The
intent of these networks is to provide data
that can be used directly to answer clinical
research questions based on the health data
of large patient cohorts. Clinical researchers
can also use the networks to help identify and
contact patients who might be suitable for
clinical studies.


6 http://www.actnetwork.us/national/get-to-know-46EU-1128WI.html (Accessed January 1, 2019).


27.3.6 Observational Health Data 
Sciences and Informatics 
(OHDSI) 


The Observational Health Data Sciences
and Informatics (or OHDSI, pronounced
“Odyssey”) program is an international network
of researchers and health databases (primarily
EHRs) that work together to develop
and share data query and analytic tools that
can operate on members’ databases. OHDSI
grew out of the Observational Medical
Outcomes Partnership (OMOP), which was
a public-private partnership established in the
US to study how patient care databases could be
used to study the effects (both beneficial and
adverse) of medical products (Hripcsak et al.
2015). OHDSI currently involves 106 collaborators
from 23 countries on six continents.8
Current tools include a browser-based
data visualization tool called ACHILLES
(Automated Characterization of Health
Information at Large-scale Longitudinal
Evidence Systems), a vocabulary browsing
tool called HERMES (Health Entity
Relationship and Metadata Exploration
System), a predictive analytics tool called
PLATO (Patient-Level Assessment of
Treatment Outcomes), a cohort development
tool called HERACLES (Health Enterprise
Resource and Care Learning Exploration
System) that includes analytic tools performing
clinical quality metrics, and HOMER
(Health Outcomes and Medical Effectiveness
Research), a tool for risk identification and
comparative effectiveness studies.


7 https://pcornet.org/participating-networks (Accessed January 1, 2019).

8 https://www.ohdsi.org/who-we-are/collaborators (Accessed January 1, 2019).

The OHDSI Research Network allows 
researchers at member sites to query data 
repositories to obtain high-quality observa- 
tional data that can be used for study design, 
execution, and data analysis. An OHDSI
researcher defines a “project”, which is actualized
as a query for specific patient data. Data
owners are invited to participate in projects by 
querying their own databases and conducting 
data analysis locally, with results sent back to 
the initial research team for compilation and 
analysis. 


27.3.7 Commercial and Health Care
Information Technology 
Vendor Networks 


In addition to the public and non-profit net- 
works, commercial entities have begun to 
appear that bring private investment resources 
to bear on some of the challenges of access 
and integration of data. For example, the 
TriNetX network (Topaloglu and Palchuk 
2018) includes over 20 academic and private 
health systems that agree to share data over 
the network in order to enable data queries 
that can be used to identify cohorts for studies
led either by institutional investigators or by
biopharmaceutical companies.

Another entity is Flatiron Health, a company
recently acquired by Roche (Petrone
2018), which focuses on data from cancer
patients. Rather than creating distributed
queries against disparate data sets, Flatiron
obtains data from major academic medical
centers (currently 7),9 and processes the data
centrally, enhancing its utility through natural
language processing and machine learning to
obtain a better understanding of the course
of patients’ conditions and their response to
therapy.


9 https://flatiron.com/about-us (Accessed January 1, 2019).

Electronic Health Record (EHR) vendors
are also creating networks to support or enable
research data sharing among their clients. For
example, Epic Systems Corporation began working
with a group of academic health center customers
in 2013 on such a data sharing network
and created an advisory group of CRI experts
from such sites to inform governance, infrastructure,
and research data sharing processes.
Similarly, Cerner Corporation has created a
centralized data sharing platform that enables
its clients to combine and access large volumes
of de-identified patient-level data for
trial design and population health management
purposes.

Further, major computing and cloud 
computing vendors, such as Amazon, Apple, 
Facebook, Google, and Microsoft, have 
launched initiatives to provide patient-centered 
data sharing capabilities, which empower 
individuals to aggregate and share their own 
health care data from a variety of sources 
(“Large technology companies continue to 
ramp up healthcare forays,” 2018). These capa- 
bilities introduce new data sharing scenarios 
in which individual patients will have the abil- 
ity to “donate their data” to research projects 
without engaging their health care providers 
as intermediaries in such transactions. The 
impact of all of these activities remains to be 
understood, given their relative immaturity at 
the time this chapter is being written. 


27.3.8 All of Us 


The All of Us Research Program is the principal
component of the US government’s
Precision Medicine Initiative (PMI), established
in 2016 to build a cohort of one
million Americans who provide health status
data, blood and urine specimens for clinical
and genetic analysis, and access to their complete
EHR data (Collins and Varmus 2015).
Recruitment is currently being carried out
primarily by 11 consortia across the country,
involving 51 health care organizations.10

Unlike other consortia described above, 
data and specimens are being consolidated 
centrally at a Data and Research Center 
(DRC) and a biobank, respectively. Plans 
include the collection of data from all of the 
participants’ EHRs, not just those at recruitment
centers, and the sequencing of participants’
full genomes.

The All of Us Research Program also 
differs from other consortia in its patient- 
centered focus. All of Us asks participants 
for full access to their fully identified data for 
use by researchers without prior approval of 
specific studies. In return, patients are repaid 
through full transparency and partnership. 
Participants have access to their information 
and play a role in helping to identify impor- 
tant research priorities for the program. 


27.4 Data Standards in Clinical 
Research 


The use of standards to represent clinical
research information provides the same challenges
and benefits found in other informatics
application areas. Data may be captured
with standard terminologies or translated into
standards to support data reporting and sharing
which, in turn, require agreed-upon standard
frameworks to support such exchanges.
Standards are even being developed for the
representation of clinical trial protocols themselves.
Figure 27.3 depicts how the various
kinds of standards fit into the overall schema
of clinical research, ranging from data models
that define how data are to be represented,
through standards for terminologies to actually
represent the data and structures for
exchanging them, out to standards for reporting
and sharing. The standards described here
are some of the current and most prevalent
ones, but they continue to evolve and new
standards relevant to the CRI domain are
constantly emerging.


10 https://allofus.nih.gov/about/program-partners/health-care-provider-organizations (Accessed January 1, 2019).


[Figure 27.3: diagram with the following layers
Reporting and Dissemination
Data Exchange (Syntax and Semantics): HL7 Messaging, HL7 FHIR, CDISC
Data Representation: CR-Focused Terminologies and Ontologies; General Terminologies and Ontologies
Data Modeling (Logical): HL7 RIM(s), OMOP, PCORNet]

Fig. 27.3 Relationships among various general purpose
and CRI-specific standards that are relevant to the
design, conduct, and dissemination of clinical research
studies. Data modeling determines how terms from
terminologies and ontologies will be recorded in clinical
research databases. Exchange standards determine how
data will map from the model to the messages used for
interchanging the data. The use of messages is determined
by the requirements of regulatory agencies and
collaborating research groups. See text for an explanation
of acronyms used in this figure


27.4.1 Data Modeling Standards
to Support Clinical Research


Formats for data sharing typically include a
data model for the information to be shared,
leaving to individual contributors the later
task of mapping local data into the exchange
model. For example, i2b2 uses a standard
data model internally that is based on a
generic entity-attribute-value model (EAV)
that essentially allows any data to be stored
using any desired controlled terminology
(Klann et al. 2016). This allows queries to
be conducted across multiple i2b2 databases,
as with SHRINE, but consolidation of the 
data may still require mapping of data to a 
common terminology (although the ACT 
network requires using a shared ontology for 
representing data). Note that the data model 
used for storing the data in a database (i2b2, 
for example) may not be acceptable for data 
exchange. 
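
To make the entity-attribute-value idea concrete, the following is a minimal sketch using an in-memory SQLite table. The table and column names are hypothetical and far simpler than the schema i2b2 actually uses; the point is only that every observation, whatever its type or terminology, is stored as one row of (entity, attribute, value).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE observation (
        patient_id  INTEGER,  -- entity
        concept     TEXT,     -- attribute, written as 'TERMINOLOGY:code'
        value_num   REAL,     -- numeric value, if any
        value_text  TEXT      -- textual value, if any
    )
    """
)

# Diagnoses, a laboratory result, and a locally defined concept share one table.
rows = [
    (1, "ICD10CM:E11.9", None, None),
    (1, "LOINC:4548-4", 7.9, None),
    (2, "ICD10CM:E11.9", None, None),
    (2, "LOINC:4548-4", 6.1, None),
    (3, "LOCAL:smoker", None, "current"),
]
conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?)", rows)


def count_patients(concept, min_value=None):
    """Count distinct patients with a concept, optionally above a numeric threshold."""
    sql = "SELECT COUNT(DISTINCT patient_id) FROM observation WHERE concept = ?"
    params = [concept]
    if min_value is not None:
        sql += " AND value_num >= ?"
        params.append(min_value)
    return conn.execute(sql, params).fetchone()[0]


diabetics = count_patients("ICD10CM:E11.9")
high_a1c = count_patients("LOINC:4548-4", min_value=7.0)
```

Adding a new kind of data requires no schema change, only new concept codes. The `LOCAL:smoker` row also shows why cross-site consolidation can still require terminology mapping: a second site might record the same fact under a different code.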

An alternative approach is the model- 
driven architecture, in which an underly- 
ing data model is created for the express 
purpose of representing all aspects of an 
information design, including data represen- 
tation. Previously, the models used for clinical 
research management systems have been those 
required to support system functionality. New 
efforts are underway to create standards for 
modeling the actual research protocols, to 
enable a logical representation that includes 
the semantic aspects of the protocol (for 
example, the relationships between specific 
interventions and observations intended to 
measure their effects). While use of such models
may make the research process somewhat
more complicated, the mapping to standards
used for exchanging data becomes greatly
simplified.

For example, Health Level Seven (HL7; see
Chap. 7) is an open standards development
organization that develops consensus
standards for all manner of clinical and
administrative data, and is also working on
clinical research-specific standards, such as
the Regulated Clinical Research Information
Management (RCRIM) model in order to
define messages, document structures, terminology,
and semantics related to the collection,
storage, distribution, integration and analysis
of research information (Richesson and
Krischer 2007). The main focus of the work is
on data related to studies involving US Food 
and Drug Administration (FDA) regulated 
products (drugs and devices). 

The Biomedical Research Integrated
Domain Group (BRIDG) Model11 is designed
to harmonize models from the HL7 RCRIM and
the Clinical Data Interchange Standards
Consortium (CDISC), a standards group
motivated by the needs of the pharmaceutical
and biotechnology industry entities that
sponsor or otherwise support many clinical
studies. CDISC provides a standard for submitting
regulatory information to the FDA
(Fridsma et al. 2008).

In a similar manner, HL7 has created a
standard Clinical Document Architecture
(CDA; see Chap. 7) that specifies the structure
and semantics of “clinical documents”
for the purpose of data exchange. A CDA document can
contain any type of clinical content, including
clinical notes typically found in EHRs, and may
also include case report forms from research
studies. HL7 Fast Healthcare Interoperability
Resources (FHIR; see Chap. 8) supports
the exchange of CDA documents (Bender and
Sartipi 2013).

The various consortia that share clinical data
have each developed their
own data models to cover the common elements
found in EHRs: patient demographics,
encounters, diagnoses, laboratory results,
procedures, and medications. PCORI uses the
PCORNet Common Data Model (Belenkaya
et al. 2015) based on the data model used in
the FDA’s Sentinel Initiative,12 while OHDSI
uses the OMOP Common Data Model (Ryan
et al. 2009). The All of Us program has
adopted the OMOP model as well.


11 https://www.cdisc.org/standards/domain-information-module/bridg (Accessed January 1, 2019).


27.4.2 Terminology Standards 
to Support Clinical Research 


As described previously, the design of clinical 
protocols includes rigorous attention to the 
types of data to be collected and the format 
of those data. This often involves the use of 
controlled terminologies to capture categori- 
cal data. The terminology may be as small as 
“yes/no” or a ten-point pain scale for captur- 
ing subjects’ symptoms, or it may be as vast as 
a list of all possible drugs or diseases in a sub- 
ject’s medical history. In many cases, research- 
ers will simply compose sets of terms that 
meet their immediate needs and then require 
all investigators participating in the study to 
apply them consistently. 

Because the terms used in clinical research
are often identical to those used in clinical
care, standard multi-use terminologies
(such as those described in Chap. 7) are
often appropriate for use in capturing clinical
research data. However, there are some
aspects of clinical research that are not well
represented in mainstream terminologies; and
in these cases, terminologies and their richer
forms, ontologies, that are more focused on
clinical research, are required. In particular,
clinical research data and workflow models
require controlled terminologies and ontologies
that define domain-specific concepts and
standard common data elements (CDEs).
Collections of standard terms for CDEs can
be found in the NIH’s Common Data Element
server (Rubinstein and McInnes 2015).13 In a
similar manner, the Ontology for Biomedical
Investigations (OBI) (Bandrowski et al. 2016)
has been developed by a consortium of representatives
from across the spectrum of
biomedical research, and includes terms to
represent the design of protocols and data collection
methods, as well as the types of data
obtained and the analyses performed on them.


12 https://www.sentinelinitiative.org (Accessed January 1, 2019).

13 https://www.nlm.nih.gov/cde (Accessed January 1, 2019).

There are several reasons for considering 
the use of standard controlled terminologies in 
the capture of clinical research data. One rea- 
son is to take advantage of clinical data that 
are already being collected on research sub- 
jects for other purposes. A common example 
is the use of data on morbidity and mortal- 
ity that are collected using one of the various 
versions and derivatives of the International
Classification of Diseases (ICD; see Chap.
7). In the US, for example, patient diagnoses
are reported for billing purposes using the
Clinical Modification of the tenth edition of
ICD (ICD-10-CM). While such coded information
is readily available, researchers repeatedly
find that ICD-10-CM codes assigned
to patient records have an undesired level of
reliability or granularity, especially when compared
with the actual content of the records
(Topaz et al. 2013). Thus, the convenience of
using such standard codes may be outweighed 
by the imprecision, which can adversely affect 
study design and analytical results. 

A second reason for adopting a standard 
controlled terminology is simply to avoid 
“reinventing the wheel.” As is described in 
Chap. 7, a great deal of effort has been
expended in the creation of domain-spe- 
cific terminologies that are comprehensive, 
unambiguous, and maintained over time. 
Designating such terminologies for use in a 
protocol design can relieve researchers of hav- 
ing to worry about the quality of the termi- 
nology. For example, a researcher is unlikely 
to encounter novel concepts when recording 
subjects’ demographic data, such as gender, 
marital status, religion, and race. Specifying, 
for example, that ISO standards should be 
used for these data elements greatly simplifies 
the protocol-design process. 

A third reason for choosing standard terminologies
relates to the ability to compare
data collected in one study with those collected
in others. For example, the use of a standard
scale for recording a subject’s pain will allow
comparison of results from a study of one
treatment with those from a second study of
another treatment. The selection of an appropriate
standard for a particular purpose is not
straightforward (for example, the NIH Pain
Consortium lists six different scales14). The
choice may be determined simply based on
the emerging popularity of one terminology
over another in a wide community of those
investigating similar problems. PCORNet,
OHDSI, and All of Us each specify the use of
terminologies such as ICD, SNOMED, CPT,
and LOINC (see Chap. 7).

A fourth use of standard terminologies 
relates to reporting requirements. Government 
agencies sometimes require the reporting 
of clinical research data and, when they do, 
often require certain data to be reported using 
a particular standard. For example, the FDA 
requires the use of the Medical Dictionary for 
Regulatory Activities (MedDRA) for report- 
ing all adverse events occurring in drug trials 
(Brown et al. 1999), while the Cancer Therapy 
Evaluation Program (CTEP) at the National 
Cancer Institute (NCI) requires the use of 
Common Terminology Criteria for Adverse 
Events (CTCAE) (Colevas and Setser 2004). 
In an analogous manner, at the international 
level, the World Health Organization requires 
the use of the Adverse Reactions Terminology 
(WHO-ART).15 Faced with such reporting
requirements, researchers sometimes choose 
to record data in these terminologies as they 
are being captured. In those cases where the 
clinical questions being answered require 
more detailed data, however, researchers must 
resort to recording data with some other standard
(such as SNOMED; see Chap. 7), or a
controlled terminology of their own creation, 
and then translating them to the terminol- 
ogy or terminologies required for reporting 
purposes. 
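
The capture-then-translate approach described above can be sketched as a simple many-to-one lookup. The codes below are placeholders invented for illustration (they are not real SNOMED, MedDRA, or CTCAE codes), and `translate_for_reporting` is a hypothetical helper name. Real mappings are usually maintained as curated tables and often need human review.

```python
from typing import Dict, List, Tuple

# Hypothetical map from detailed capture terms to coarser reporting terms.
# All codes are made-up placeholders, not entries from a real terminology.
CAPTURE_TO_REPORT: Dict[str, str] = {
    "DETAIL:nausea-after-dose": "REPORT:nausea",
    "DETAIL:nausea-persistent": "REPORT:nausea",
    "DETAIL:rash-maculopapular": "REPORT:rash",
}


def translate_for_reporting(
    events: List[str], mapping: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Translate captured terms to reporting terms; collect any that lack a mapping."""
    translated, unmapped = [], []
    for event in events:
        if event in mapping:
            translated.append(mapping[event])
        else:
            unmapped.append(event)
    return translated, unmapped


captured = [
    "DETAIL:nausea-after-dose",
    "DETAIL:rash-maculopapular",
    "DETAIL:tinnitus",
]
reported, needs_review = translate_for_reporting(captured, CAPTURE_TO_REPORT)
```

Because the mapping is many-to-one, detail is lost in translation, which is why the finer-grained terms are kept as the primary record. Terms with no mapping are surfaced for review rather than silently dropped.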


14 http://nationalpainreport.com/how-to-measure-chronic-pain-8812496.html (Accessed January 1, 2019).

15 https://www.who-umc.org/vigibase/services/learn-more-about-who-art (Accessed January 1, 2019).


27.4.3 Clinical Research Reporting
Requirements


Requirements for reporting research data, 
particularly those related to outcomes and 
adverse events, are generally accompanied by 
specifications for the format of the data being 
reported. For example, the FDA’s Center 
for Drug Evaluation and Research (CDER) 
accepts reports using the HL7 Individual 
Case Safety Report, while the NCI’s CTEP 
allows submission of adverse event infor- 
mation to its Cancer Therapy Evaluation 
Program Adverse Event Reporting System 
(CTEP-AERS)16 either manually, using a
Web-based application, or electronically via 
a web-services API. As mentioned earlier in 
this section, these agencies require that data 
be coded with standard terminologies, such as 
MedDRA and CTCAE, respectively. 

Several reporting requirements have
emerged for the purpose of making clinical
trial results publicly available, both to support
reuse of the data by researchers and as information
sources for patients and their families.
In 2000, the US National Library of Medicine
launched ClinicalTrials.gov to provide a
mechanism for researchers to voluntarily register
their trials so that those interested in participating
as research subjects can identify, via
the World Wide Web, studies relevant to their
condition. ClinicalTrials.gov currently
includes information from over 300,000 trials
from over 207 countries. In 2004, the European
Union initiated a similar effort, called the
European Union Drug Regulating Authorities
Clinical Trials (EudraCT). ClinicalTrials.gov
and EudraCT also support the reporting
of clinical trial results. While the
submissions are nominally voluntary, federal
agencies often mandate the reporting as a
requirement for obtaining research funds or
to obtain approval for regulated drugs and
devices. In the US, for example, the Food
and Drug Administration Amendments Act
of 2007 (FDAAA) strongly reinforced these
requirements. In addition, the more than 4829
peer-reviewed biomedical journals that participate
in the International Committee of Medical
Journal Editors (ICMJE) now require public,
prospective registration in ClinicalTrials.gov
or similar databases of clinical trials of
all interventions (including devices) in order
for resultant manuscripts to be considered for
publication.17


16 https://ctep.cancer.gov/protocolDevelopment/electronic_applications/adverse_events.htm (Accessed January 1, 2019).

Each repository has defined its own 
mechanisms for transmitting protocol data. 
ClinicalTrials.gov, for example, allows
investigators to enter their data through an
interactive Web site or to upload data in a
defined XML (eXtensible Markup Language)
format (see Chap. 7). Clinical research data
management systems that can export their
study data in this format can save the researcher
much manual effort and assure accurate data 
entry (Zarin et al. 2011). 
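
As a rough illustration of such an export, the sketch below serializes a study description to XML with Python's standard library. The element and attribute names (`study_record`, `title`, `phase`, `arms`, `arm`, `type`) are invented for this example and are not the actual ClinicalTrials.gov upload schema; a real export would have to follow the format the registry itself defines.

```python
import xml.etree.ElementTree as ET


def study_to_xml(study: dict) -> str:
    """Serialize a study description to XML (hypothetical element names)."""
    root = ET.Element("study_record")
    ET.SubElement(root, "title").text = study["title"]
    ET.SubElement(root, "phase").text = study["phase"]
    arms = ET.SubElement(root, "arms")
    for arm in study["arms"]:
        # One <arm> element per study arm, with its type as an attribute.
        ET.SubElement(arms, "arm", attrib={"type": arm["type"]}).text = arm["label"]
    return ET.tostring(root, encoding="unicode")


study = {
    "title": "Example Study of Drug X",
    "phase": "Phase 2",
    "arms": [
        {"type": "experimental", "label": "Drug X"},
        {"type": "placebo", "label": "Matching placebo"},
    ],
}
xml_text = study_to_xml(study)

# Round-trip check: the generated text parses back to the same values.
parsed = ET.fromstring(xml_text)
```

Generating the file directly from the research database, rather than retyping values into a Web form, is what removes the transcription step and the errors that come with it.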


27.5 CRI and the COVID-19 
Pandemic 


The emergence in 2020 of the COVID-19 pan- 
demic has raised many biomedical and health 
issues on which informatics can have a major 
impact. Novel challenges include alteration of 
data collection functions of EHRs (Chap.
14) and telemedicine (Chap. 20) to support
the needs of patient care and research func- 
tions. New data to be captured with these 
technologies will need to be met with advance- 
ments in data sharing and research analytics. 
EHRs will need to be easily modifiable to 
capture new kinds of data when needed for 
patient care and for research. COVID-19 and 
other recent epidemics have demonstrated that 
patient travel and contact information need to 
be incorporated into the record for risk assess- 
ment (e.g., intensity of social contact; Meinert 
et al. 2020). Such data, in turn, can be used 
to study epidemiologic patterns across patient 
populations (e.g., developing predictive risk 
scores for disease exposure; Liu et al. 2020). 


17 http://www.icmje.org/journals-following-the-icmje-recommendations (Accessed January 1, 2019).

The value of telemedicine for bringing
medical expertise to patients located in an 
area where expertise is lacking is well demon- 
strated. In a pandemic, such as the COVID- 
19 outbreak, the patients are by definition 
located almost everywhere, while expertise 
in this new condition is limited to a relatively 
small number of academic and government 
institutions. Perhaps as never before, telemed- 
icine is also needed to protect the clinicians 
and other caregivers who must minimize con- 
tact with contagious patients. The technolo- 
gies of telemedicine are also being exploited 
for “tele-research” where, again, potential 
research subjects are widely dispersed and 
hard to reach and at the same time pose a 
potential danger to the researchers. COVID- 
19 researchers are drawing lessons on research 
at a distance, including effective recruitment 
and consent, from previous work in HIV 
(Mgbako et al. 2020) and geriatric research 
(Nicol et al. 2020). 

The rapid expansion of data collection in response to new domains and
data types often outstrips the ability of standards development
organizations to keep up. This poses challenges for the sharing,
aggregation, and analysis of data for surveillance, understanding the
natural history of the disease, and patient recruitment. COVID-19 has
been no exception. As a result, consortia such as the ACT network (see
Sect. 29.3.4) and the CD2H consortium in the CTSA network (see Sect.
29.3.2), both sponsored by NCATS, as well as the NIH-sponsored All of Us
network (see Sect. 29.3.8), have had to develop their own criteria for
inferring which patients in EHR databases have COVID-19.18 For ACT, the
criteria support cross-institutional queries to obtain summary results
that can be used for patient enrollment.19 All of Us is ramping up data
submissions from healthcare organizations to obtain more timely data on
their participants who are afflicted with COVID-19. CD2H has established
the National COVID Cohort Collaborative (N3C) to create a centrally
sponsored “data enclave,” composed of de-identified COVID-19 patient
records from CTSA-sponsored institutions, that can be used directly for
data analysis (see Fig. 27.4).

18 https://allofus.nih.gov/news-events-and-media/announcements/coronavirus-update-all-us
(Last accessed June 2, 2020).

19 https://ncats.nih.gov/pubs/features/ctsa-act (Last accessed June 2,
2020).
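To make the idea of locally defined criteria and summary-only queries
concrete, the sketch below flags probable COVID-19 patients and returns
per-site counts. It is a minimal illustration under stated assumptions,
not the criteria used by ACT, CD2H, or All of Us: the record layout, the
field names, and the rule itself (a diagnosis code or a positive viral
test) are hypothetical. U07.1 is the ICD-10 code for COVID-19.

```python
from collections import Counter

# U07.1 is the ICD-10 code for COVID-19; the other field names are hypothetical.
COVID_DX_CODES = {"U07.1"}


def is_probable_covid_case(patient: dict) -> bool:
    """Flag a patient with a COVID-19 diagnosis code or a positive viral test."""
    has_dx = bool(COVID_DX_CODES & set(patient.get("dx_codes", [])))
    has_positive_test = "positive" in patient.get("sars_cov2_test_results", [])
    return has_dx or has_positive_test


def summary_counts(patients: list) -> dict:
    """Return per-site counts of flagged patients, with no row-level data."""
    counts = Counter(p["site"] for p in patients if is_probable_covid_case(p))
    return dict(counts)


# Hypothetical records from two sites.
patients = [
    {"site": "A", "dx_codes": ["U07.1"], "sars_cov2_test_results": []},
    {"site": "A", "dx_codes": ["J18.9"], "sars_cov2_test_results": ["negative"]},
    {"site": "B", "dx_codes": [], "sars_cov2_test_results": ["negative", "positive"]},
    {"site": "B", "dx_codes": ["U07.1"], "sars_cov2_test_results": ["positive"]},
]
counts = summary_counts(patients)
print(counts)
```

Returning only aggregate counts mirrors the summary-result style of query
described for ACT, whereas a data enclave such as N3C works with the
de-identified records themselves.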

COVID-19 research has placed new demands on analytic support. In part,
this is due to the perceived urgency to find answers to epidemiologic,
preventive, diagnostic, prognostic, and therapeutic questions. Another
issue is the diversity of data sharing and harmonization efforts that
exist at the regional and national levels (described above). Researchers
are overwhelmed with trying to understand how to navigate the various
resources, their contents, and associated regulatory constraints in
order to find the best “fit” for a driving problem or hypothesis (see
Fig. 27.5).


27.6 Future Directions for CRI 


As the preceding sections illustrate, significant 
progress has been made to advance the state 
of the CRI domain, and such advances have 
already begun to enable significant improve- 
ments in the quality and efficiency of clini- 
cal research. These advances can be viewed 
as having been achieved at the individual 
investigator level (e.g., improvements in pro- 
tocol development, study design, participant 
recruitment, etc.), through approaches and 
resources developed and implemented at the 
institutional level (e.g., development of meth- 
ods and resources in data warehousing that 
enable storage and retrieval of clinical data 
for research, development of novel clinical tri- 
als management systems, etc.), and through 
mechanisms that have enabled and facilitated 
the endeavors of multi-center research consortia
to drive team science (e.g., innovations that 
enable data management and interchange for 
multi-center studies) (Bourne et al. 2015; P. J. 
Embi et al. 2019; Payne et al. 2018; Sanchez- 
Pinto et al. 2017; Smoyer et al. 2016). 


20 https://ncats.nih.gov/n3c (Last accessed June 2, 2020).
Fig. 27.4 The National COVID Cohort Collaborative (N3C) is a project at
the National Center for Advancing Translational Sciences (NCATS) which
is pooling electronic data on COVID-19 patients in order to provide
researchers with access to data on patient sets that are larger than
those available in a single institution. Efforts include data
acquisition and harmonization, creation of a centralized data enclave to
support collaborative analytics, and development of a synthetic data
set, based on actual patient data, that can be downloaded for analysis
but poses no risk of reidentification. (See https://covid.cd2h.org.
Courtesy of the U.S. National Library of Medicine)
Fig. 27.5 The emergence of the COVID-19 pandemic has engendered many
efforts to share clinical and epidemiologic data in order to rapidly
learn patterns of natural history, predictions of risk for exposure and
severe disease, and therapeutic outcomes. Efforts include local sharing
in patient care networks as well as national coalitions established to
support both patient care and research


Looking towards the future of the field of CRI, a number of national-
and international-level trends are introducing new or evolved challenges
and opportunities, including:

• Nearly ubiquitous adoption of EHRs at the national level, as well as
  the availability of lightweight and scalable data-level and
  platform-independent interoperability standards, is making it possible
  to build and leverage learning health care systems at scale, in which
  every patient encounter is an opportunity to learn and improve that
  patient’s care as well as the care delivered to broader populations
  (Friedman et al. 2010). However, while such systems do enable the
  pragmatic collection and integration of increasingly large volumes of
  clinical phenotype data, these “real world” data also exhibit new
  types of challenges in terms of scope, completeness, quality, and
  domain coverage. As such, new methods are needed to enable the
  characterization and analysis of such data in a rigorous and
  reproducible manner.

• Mobile, wearable, and other sensor technologies, in conjunction with
  new mechanisms for collecting patient reported outcome (PRO) data, are
  allowing for the conduct of clinical research studies that incorporate
  data from beyond the clinic and hospital settings. However, the
  temporality, granularity, and reliability of such data are vastly
  different from those of data collected via the more “traditional”
  mechanisms introduced in this chapter. As a result, important
  questions are now being raised concerning how such emergent data
  sources can or should be integrated with those “traditional” data
  types and, further, which methods are appropriate when seeking to
  identify meaningful “signals” within the ensuing multi-scale and
  complex data sets.

• The ready availability of artificial intelligence (AI) platforms and
  methods (e.g., knowledge-based systems, cognitive computing,
  high-throughput machine learning, deep learning, etc.) is making it
  possible to perform analyses across and between data scales, for
  example, identifying meaningful patterns that incorporate genomic,
  clinical, demographic, social, and environmental measures. Such
  multi-scale reasoning is essential to the generation of evidence in
  support of precision medicine and/or health paradigms, but it also
  introduces critical questions concerning how to design and execute
  such data-intensive studies, again emphasizing reproducibility and
  rigor as was the case in the two preceding trends.

Fig. 27.6 Aspirational integrated and high performing healthcare
research and delivery system model, enabled via emergent CRI frameworks,
methods, and technologies


As CRI capabilities and systems continue to 
evolve, an ability to assess the maturity of 
such systems and environments will become 
increasingly important. As with maturity 
assessments of EHR deployments and their 
use across health systems, so too will maturity 
and deployment models emerge for measur- 
ing the deployment and use of CRI systems 
to enable robust research enterprises (Knosp 
et al. 2017; Pettit 2013). 

One key aspect of system maturity will 
be a mature workforce of CRI professionals 
and leaders. Such a workforce will become 
increasingly important to the successful 
deployment, management, and optimal use 
of CRI systems. Leadership roles such as
the Chief Research Information Officer
(CRIO), first established in 2010, will
become increasingly common and as essential to
the functioning of mature clinical and translational
research enterprises as are their health
IT (e.g., CIO) and clinical informatics (e.g., 
CMIO) leader counterparts (Sanchez-Pinto 
et al. 2017). 

When viewed collectively, these emergent challenges and opportunities
help to define a rapidly evolving CRI environment, in which we have the
ability to create integrated and high performing health care research
and delivery systems (Fig. 27.6), driven by rapid translation between
and across:

• Functional learning health care systems in which we instrument the
  clinical environment to generate large-scale and pragmatic data,
  generate hypotheses to be tested in view of such data, and then
  evaluate the impact of the ensuing data-driven interventions in such
  “real world” settings.

• Precision health frameworks wherein the translation between
  research-generated evidence and practice is both rapid and cyclical,
  acting upon multi-scale data, and using contemporary analysis and
  understanding methods, such as those being made available through
  advances in artificial intelligence (AI).

• Big data resources, generated via the preceding areas and
  incorporating new or emergent data types, leveraging systems-level
  thinking (e.g., across scales) and employing data science
  methodologies to identify important signals or motifs in such high
  volume, velocity, and variability data.


27.7 Conclusion 


This chapter has sought to introduce the 
following major themes: (1) design charac- 
teristics that serve to define contemporary 
clinical studies; (2) foundational information 
needs inherent to clinical research programs
and the types of information systems that can be
used to address or satisfy such requirements;
(3) the role of multi-purpose platforms, such 
as Electronic Health Record (EHR) sys- 
tems, that can be leveraged to enable clinical 
research programs; (4) the role of standards 
in supporting interoperability across and 
between actors and entities involved in clinical 
research activities; and (5) future directions 
for the CRI domain and how such endeavors 
may alter or optimize the conduct of clinical 
research. As we have explained, the clinical 
research environment is data, information, 
and knowledge intensive, thus calling for the 
application of biomedical informatics theo- 
ries and methods. This set of features explains 
and justifies CRI’s emergence as a distinct and 
highly valued sub-discipline of the broader 
field of biomedical informatics. Part of the 
evolution of CRI can be attributed to the 
extraordinary increase in the scope and pace 
of clinical and translational science research 
and development that has been catalyzed by 
a variety of funding and policy initiatives that 
seek to re-engineer the way in which govern- 
mental, public, and private entities advance 
basic science discoveries into practical thera- 
pies. Such evolution is further bolstered by 
technical and environmental changes, such 
as the advent of learning health care systems, 
the availability of new and novel data cap- 
ture/generation mechanisms, and advances 
in our ability to analyze and understand “big 
data.” CRI has accordingly become
a dynamic and relevant sub-domain of biomedical
informatics knowledge and practice,
providing a broad spectrum of research and
development opportunities in the context of both
basic and applied informatics science.


Suggested Readings

Embi, P. J., & Payne, P. R. (2009). Clinical research 
informatics: Challenges, opportunities and 
definition for an emerging domain. Journal of 
the American Medical Informatics 
Association, 16(3), 316-327. This report 
defines the field of biomedical informatics 
knowledge and practice that applies to the 
design and conduct of clinical studies. Further, 
it presents a framework for the alignment of 
general purpose and research-specific technology.

Harris, P. A., Taylor, R., Minor, B. L., Elliott, V., 
Fernandez, M., O’Neal, L., et al. (2019). The 
REDCap consortium: Building an interna- 
tional community of software platform part- 
ners. Journal of Biomedical Informatics, 95, 
103208. REDCap is one of the most common 
and widely adopted electronic data capture 
platforms used in the clinical research domain. 
This report describes the structure and func- 
tion of the REDCap consortium, which has 
led the development and dissemination of this 
ubiquitous clinical research data management 
tool. 

Hersh, W. R., Weiner, M. G., Embi, P. J., Logan, 
J. R., Payne, P. R., Bernstam, E. V., et al. 
(2013). Caveats for the use of operational electronic
health record data in comparative effectiveness
research. Medical Care, 51(8 Suppl 3),
S30. This report introduces practical issues to
consider when utilizing data from electronic 
health records to support and enable clinical 
research. It also presents a series of critical 
questions to be asked and answered when 
designing and executing such studies. 

Hripcsak, G., Shang, N., Peissig, P. L., Rasmussen, 
L. V., Liu, C., Benoit, B., Carroll, R. J., Carrell, 
D. S., Denny, J. C., Dikilitas, O., & Gainer, 
V. S. (2019). Facilitating phenotype transfer 
using a common data model. Journal of 
Biomedical Informatics, 96, 103253. This 
report outlines the role of common data mod- 
els in facilitating the systematic and reproduc- 
ible phenotyping of individual patients as well 
as populations. In addition, the report pro- 
vides a comparative assessment of existing 
data models and their utility for computa- 
tional phenotyping and resultant data analy- 
ses. 

Payne, P. R., Johnson, S. B., Starren, J. B., Tilson, 
H. H., & Dowdy, D. (2005). Breaking the 
translational barriers: The value of integrat- 
ing biomedical informatics and translational 
research. Journal of Investigative Medicine, 
53(4), 192-201. This report describes the criti- 
cal role of biomedical informatics theories 
and methods in overcoming the T1 and T2 
clinical and translational barriers. It also pro- 
vides a conceptual model for the alignment of 
such capabilities with the spectrum of activities
that make up the clinical and translational
research “lifecycle.”

Tenenbaum, J. D., Avillach, P., Benham-Hutchins, 
M., Breitenstein, M. K., Crowgey, E. L., 
Hoffman, M. A., et al. (2016). An informatics 
research agenda to support precision medi- 
cine: Seven key areas. Journal of the American 
Medical Informatics Association, 23(4), 791- 
795. This perspective introduces an agenda for 
informatics research and practice in the con- 
text of precision medicine. The authors 
describe multiple axes of how supporting the- 
oretical frameworks and applied methods can 
generate and deliver evidence in support of 
precision risk management, diagnosis, and 
treatment planning. 

Weng, C., Shah, N., & Hripcsak, G. (2020). Deep 
phenotyping: Embracing complexity and tem- 
porality— Towards scalability, portability, and 
interoperability. Journal of Biomedical 
Informatics. As clinical and translational 
research becomes increasingly multi- 
institutional and involves the sharing of deep 
phenotypes across and between traditional 
organizational boundaries, there is a need for 
common tools and methods to derive and rep- 
resent such constructs. This report outlines 
the current state-of-the-art in terms of com- 
putational phenotyping methods and inter- 
change standards. 


Questions for Discussion

1. How do the foundational information 
needs of clinical research differ 
depending on the type and phase of 
study being undertaken? Do study 
phases have an impact on the primacy 
of such information needs? 

2. What is the role of biomedical informat- 
ics with regard to decreasing bias in 
RCTs and thus enhancing the internal 
validity, external validity, and generaliz- 
ability of study results? 

3. How can clinical or general-purpose 
information systems and research- 
specific tools be employed synergisti- 
cally to address clinical research-specific 
information needs, such as participant 
recruitment or the population of study- 
specific data capture instruments? 


4. How do the core functional components 
of common clinical trial management 
systems (CTMS) overlap with or other- 
wise replicate the functionality of elec- 
tronic health record (EHR) systems? To 
what extent does this similarity or differ- 
ence inform the need for syntactic and/ 
or semantic interoperability among such 
systems? 

5. In what situations is the use of clinical 
research-specific terminologies or ontol- 
ogies appropriate? In such situations, 
what challenges exist relative to the
selection, use, and maintenance of 
appropriate standards? 

6. What is the role of data standards in 
enabling the dissemination and reuse of 
study-generated data sets? How can the 
use of such standards enable the cross- 
linkage or integrative analysis of data 
sets derived from multiple but indepen- 
dent studies? 

7. Compare and contrast the future 
directions of CRI with those of other 
BMI sub-disciplines and focus areas 
described in this book. To what extent 
are they similar and different, and 
what are the implications of such 
findings relative to the role of common 
informatics theories and methods and 
their applicability to the clinical 
research domain? 
Learning Objectives
After reading this chapter, you should know
the answers to these questions:
• What is precision medicine, and how
  does it differ from traditional medical
  practice?
• How can Electronic Health Records
  (EHRs) be used to advance precision
  medicine discovery?
• How can phenotype algorithms be identified
  and evaluated using EHRs?
• How are EHRs aiding in the implementation
  of precision medicine?
• How are genomic data being used today
  in research, clinical care, and consumer
  health?
• What is Mendelian randomization?
• What are some of the large cohorts
  being used to advance precision medicine,
  and what is the importance of
  diversity in these cohorts to accelerate
  discovery?


28.1 What Is Precision Medicine? 


Precision medicine is a field focused on understanding the role that
molecular, environmental, and phenomic variation play in healthcare,
with the goals of more rational therapeutics, improved understanding of
prognosis, and more optimal healthcare delivery. Fundamentally,
precision medicine focuses on data-driven optimization of health care.
This includes a more precise understanding of disease through the
amassing of both large research studies and amalgamations of truly
massive amounts of routinely collected healthcare information from
electronic health records. In addition, precision medicine is leveraging
new technologies such as sensor-based measurements and omic
technologies. The most commonly used technology is genomic assays, such
as genotyping (which uses probes to assay large numbers of specific
variants) or genomic sequencing (which assesses each base pair present
within a region or across the genome). However, epigenomics,
transcriptomics, proteomics, microbiomes, and metabolomics are also
being used and hold promise for further research and clinical care in
precision medicine. The rapid growth and availability of other
technologies such as sensors and imaging data will also contribute. The
ultimate goals are to use all of these data to better guide diagnosis,
improve understanding of prognosis, optimize existing treatments,
develop new therapies, and design novel prevention schemes (Fig. 28.1).
Precision medicine as a field is closely related to personalized
medicine, P4 medicine, individualized medicine, genomic medicine, and
other similar recent fields, all of which share as a goal the
improvement of health for both individual patients and populations
through the use of data. Indeed, in practice, most of these terms are
used fairly interchangeably among institutions focusing on discovery and
implementation of precision medicine approaches. A key differentiator is
that precision medicine pursues its goals primarily through enhanced
understanding and new taxonomies that lead to redefinition of the
diseases themselves. Precision medicine shares with these other terms
the centuries-old goal of individualizing care for individual patients,
following Sir William Osler’s maxim: “The good physician treats the
disease; the great physician treats the patient who has the disease.”
The primary difference between now and then is the deluge of new
modalities and quantities of information, which truly allows us to
leverage big data to solve problems of individual health in new ways.
Data sets of such scale are needed, for instance, to untangle the
contribution of rare genomic variation to the clinical impact on
individual patients. For example, out of 6.4 billion base pairs in the
(diploid) human genome, an average patient genome may have 4.1 million
to 5 million genetic variants (1000 Genomes Project Consortium et al.,
2015). Distilling the impact of these variants in such a way that
clinicians can make use of this information is a substantial challenge
for both rare and common disorders.

In this chapter, we will focus on the informatics resources for, and
implications of, precision medicine. We will also provide a basic
overview of precision medicine. Informatics is needed for the capture of
data, transformation of data into information and knowledge, and,
perhaps most importantly, implementation of precision medicine: acting
on that new knowledge to improve the health of real people. Electronic
health records (EHRs) are an important part of precision medicine, both
as a real-world data source and as a key modality for its
implementation. Other chapters relevant to this one include Chaps. 9
and 26.


28.2 Using EHRs for Genomic 
Discovery 


Electronic health records (EHRs) have been an important part of clinical
care for decades, but over the last decade they have become an
increasingly important part of discovery to advance precision medicine.
While retrospective epidemiological research has been performed using
claims data for decades, the more recent use of EHRs for molecular
research, especially genomics, has been transformational. EHRs contain a
wealth of dense patient data that are valuable for discovery and that
would cost significant sums to reproduce in a research cohort (Fig.
28.2). As such, they have become a valuable source of information for
retrospective research (Robinson, Wei, Roden, & Denny, 2018).

The first successful use of electronic health records for genetic
discovery was in 2010. Four papers published that year used EHR data as
the sole phenotypic information to replicate known genetic effects
(Denny, Ritchie, Basford, et al., 2010; Denny, Ritchie, Crawford, et
al., 2010; Kullo, Ding, Jouni, Smith, & Chute, 2010; Ritchie et al.,
2010). Ritchie et al. tested for associations between 5 diseases and 21
SNPs that were known to be associated from prior literature and
replicated all associations for which the study was adequately powered
(Ritchie et al., 2010). The other studies added replications with
endophenotypes of cardiac conduction and red blood cell traits. These
studies included the first EHR-based genome-wide association studies
(GWAS; Fig. 28.3a), studies which test for association between a
phenotype and hundreds of thousands to millions of single nucleotide
polymorphisms (SNPs). GWAS is discussed more in Sect. 28.4.1.
Importantly, these studies suggested a process for developing phenotype
algorithms that identified both cases and controls as two separate
groups. Each algorithm used a variety of types of EHR data to identify
its population with high positive predictive values (analytics for
phenotype algorithms are discussed in more detail in Sect. 28.3 below).
Another important early study validated known SNPs associated with
rheumatoid arthritis using a validated EHR algorithm (Kurreeman et al.,
2011). This study demonstrated that the effect sizes from the EHR
algorithms and from prior research studies were similar.

Fig. 28.2 Density of phenotypic data in an EHR-linked biobank. All rows
represent data from patients who enrolled in BioVU, the Vanderbilt DNA
biobank. The first patient enrolled in 2007. This figure demonstrates
that individuals can have decades of extant EHR data prior to their
enrollment, allowing both cross-sectional and in silico longitudinal
studies. Each point represents real data transformed by square root
divided by 20 of the actual count at that time period. (From Robinson
et al., 2018)

Another advance first highlighted in early EHR studies was the
phenome-wide association study (PheWAS; Fig. 28.3b), which is discussed
in Sect. 28.4.3 below. The first systematic PheWAS was performed in EHRs
because they contain a broad collection of phenotypes gathered
essentially independently of any a priori research hypothesis (Denny,
Ritchie, Basford, et al., 2010). PheWAS approaches have been applied in
observational cohorts as well (Millard et al., 2015; Pendergrass et al.,
2013).

An important development in the use of EHRs for precision medicine was
the National Human Genome Research Institute-funded Electronic Medical
Records and Genomics (eMERGE) network, which began in 2007. eMERGE had
the explicit goal of exploring the capabilities of EHRs for genomic
discovery. The initial eMERGE network had five sites; it has been
renewed twice and is now in its third iteration, with 10 sites. In
subsequent iterations, eMERGE has grown from a primary focus on
discovery to also include implementation of genomic medicine
(Rasmussen-Torvik et al., 2014). The first novel discovery out of eMERGE
came in 2011 with the identification of variants in FOXE1 associated
with autoimmune hypothyroidism (some of the results from this study are
shown in Fig. 28.3) (Denny et al., 2011). Since then, eMERGE sites have
investigated nearly 100 different phenotypes, many with novel
discoveries (Crawford et al., 2014).
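The association testing that underlies a GWAS can be sketched for a
single SNP as follows. This is a simplified illustration, not the method
used in the studies cited above: it applies a basic allelic chi-square
test to a 2x2 table of allele counts, with no adjustment for covariates
or ancestry. The allele counts are hypothetical, and the 5e-8
significance threshold is a widely used convention that is assumed here
rather than taken from the text.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # conventional threshold; an assumption, not from the text


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (chi2, p) for a 2x2 allele-count table, 1 degree of freedom."""
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # a degenerate table carries no evidence
    chi2 = n * (a * d - b * c) ** 2 / denom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival function, 1 df
    return chi2, p_value


# Hypothetical allele counts for one SNP in cases and controls.
chi2, p_value = allelic_chi_square(300, 700, 200, 800)
significant = p_value < GENOME_WIDE_ALPHA
print(f"chi2={chi2:.2f}, p={p_value:.2e}, genome-wide significant={significant}")
```

A GWAS repeats a test of this kind for each of the hundreds of thousands
to millions of SNPs, which is why the significance threshold is set so
stringently; a PheWAS inverts the design, holding one variant fixed and
repeating the test across many phenotypes.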


28.3 Finding Research-Grade 
Phenotypes in EHRs 


A characteristic of EHR phenotypes is the combination of multiple
modalities of EHR information to identify high-quality, research-grade
phenotypes. Most frequently, they include billing codes, medications,
laboratory results, and some sort of text processing or natural language
processing. Example phenotypes for autoimmune hypothyroidism and type 2
diabetes are shown in Fig. 28.4. Both of these were validated at more
than one site and demonstrated successful genomic discovery or
replication (Denny et al., 2011; Kho et al., 2012). Clearly, sufficient
sample size (and consequently statistical power) is needed to identify
associations. Thus, the eMERGE network has found it necessary to run
phenotype algorithms across different sites to increase sample size. In
doing so, eMERGE found it helpful to collaboratively develop phenotypes
across different sites given local variability in EHR systems, billing
practices, and institutional practices. Algorithms from eMERGE and other
networks have been shared for use on PheKB.org (Kirby et al., 2016).
Currently, PheKB houses more than 150 EHR-based algorithms.

Fig. 28.3 Example GWAS and PheWAS using EHR data. a Manhattan plot from
a GWAS of autoimmune hypothyroidism performed in the eMERGE Network.
This GWAS identified variants in FOXE1 as a risk factor for autoimmune
hypothyroidism, a novel finding at the time. b PheWAS of the variant
identified in panel a. This PheWAS identified autoimmune thyroid disease
associated with this variant and highlighted other conditions, like
atrial flutter, that are inversely associated with hypothyroidism. (With
permission from Denny et al. 2011 © Elsevier)

Fig. 28.4 EHR phenotype algorithms for autoimmune hypothyroidism and
type 2 diabetes. Details for each of these algorithms are available on
http://PheKB.org. Each algorithm was validated by manual chart review at
multiple institutions. (Figures adapted from: a Conway et al., (2011)
and b Kho et al., (2012))

There are many algorithmic approaches to creating high-quality phenotype
algorithms. Perhaps the most common approach is through combinations of
different elements via Boolean logic, such as the algorithms depicted in
Fig. 28.4. Other researchers have trained machine learning algorithms
using a set of reviewed cases and controls. Among the first
demonstrations of this approach was the use of the Partners Biobank to
find rheumatoid arthritis patients (Kurreeman et al., 2011; Liao et al.,
2010). While algorithms using machine learning can often be overfit to a
particular data set, research has shown that at least some machine
learning approaches in EHR data can be portable across different EHR
systems (Carroll et al., 2012). This machine learning approach has also
been applied to Veterans Affairs EHRs from the Million Veteran Program
to find cases of acute ischemic stroke (Imran et al., 2018).

A common theme among both rule-based
and machine learning approaches is incorporation
of different types of data (e.g., billing
codes, laboratory data, note content, and
medication data) from the EHR. Table 28.1
reviews the most common features used by
algorithms posted in PheKB. In addition,
rules requiring more than one instance of a
given data type often improve algorithm
performance. A study of 10 different
diseases across ICD codes, clinical notes,
and disease-specific medications demonstrated
that use of multiple modalities
improves algorithm performance
more than counting rules within a single data
type (Wei et al., 2016). In this study, the average
PPV of a single ICD code instance was
only 0.37, but this increased to 0.84 when 2 or
more instances of ICD codes were required.
Requiring at least 2 different elements
improved PPV to 0.91. Overall, notes were the
most sensitive data type.
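
The counting and multi-modality rules described above can be sketched in a few lines of Python. This is only an illustration: the record layout (`PatientEvidence`), its field names, and the default thresholds are hypothetical and do not reproduce any published PheKB algorithm.

```python
from dataclasses import dataclass


@dataclass
class PatientEvidence:
    """Counts of disease-specific evidence per data type (hypothetical layout)."""
    icd_code_count: int = 0      # instances of relevant billing codes
    medication_count: int = 0    # disease-specific medication records
    note_mention_count: int = 0  # NLP-derived mentions in clinical notes


def modalities_present(p: PatientEvidence) -> int:
    """Number of distinct data types with at least one piece of evidence."""
    counts = (p.icd_code_count, p.medication_count, p.note_mention_count)
    return sum(c > 0 for c in counts)


def is_case(p: PatientEvidence, min_icd: int = 2, min_modalities: int = 2) -> bool:
    """Boolean rule: repeated billing codes OR evidence from several data types."""
    return p.icd_code_count >= min_icd or modalities_present(p) >= min_modalities
```

Under these assumed thresholds, a patient with a single ICD code and nothing else is not a case, while a patient with one ICD code plus one disease-specific medication is. Real algorithms typically add exclusion criteria and control definitions as well.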

Table 28.1 Data modalities used in
phenotyping algorithms available on PheKB

Data modality             Public (n=44)   Nonpublic (n=110)   Percent of total
ICD-9/-10 codes           39              73                  73%
Medications               31              51                  53%
CPT codes                 23              44                  44%
NLP                       28              36                  42%
Laboratory test results   21              37                  38%
Vital signs               5               14                  12%

Nonpublic algorithms include algorithms in
development and those whose performance has
not yet been validated. Data accessed Oct. 15,
2017

Abbreviations: CPT Current Procedural Terminology,
ICD-9/-10 International Classification of
Diseases, Ninth Revision/Tenth Revision, NLP
natural language processing, PheKB Phenotype
Knowledgebase

From Robinson et al. (2018)

The overall process of defining and evaluating
a phenotype algorithm is shown in
Fig. 28.5. In evaluating a phenotype algorithm,
it has become common to compare the
algorithm to manual chart review as the “gold
standard”. Two typical approaches have been
undertaken for this analysis (Newton et al.,
2013). One is to use clinically trained professionals
to evaluate the patient records to determine
whether or not the individual matches the
given clinical condition being assessed. The
other approach is to develop a formal chart
abstraction instrument that trained chart
abstractors use to review the chart and identify
the elements that determine whether the patient matches the
case (or control) definition. This approach
often proposes a set of rules for reviewers to
validate in the patient record that determine
if an individual meets a research-defined case
definition. Regardless of the approach for
chart review, the process is usually iterative: an
investigator proposes an algorithm, executes
it on a population, and then evaluates a set of
charts. Especially if the chart review uses a
trained professional instead of a chart abstraction
instrument, it is best practice to include
both individuals who match the algorithm
and those who do not match the algorithm in
the chart review to avoid anchoring bias. In
this review, the order of the charts should be
randomized and the reviewers blinded to the
algorithm’s determination of case (or control)
status. After review, the investigators can calculate
the positive predictive value (PPV) as
the number of true positives divided by the
sum of the true positives and the
false positives. Additionally, researchers
may want to calculate sensitivity (or recall).
Calculating recall may
be relatively straightforward for common diseases but is very
challenging for rare diseases. Thus, many
researchers make simplifying assumptions
about the requirements to be a case. For example,
many researchers assume that a
case must have at least one relevant ICD code
(Carroll et al., 2012; Liao et al.,
2010). This approach provides a reasonable
estimate of sensitivity.

Fig. 28.5 Overview of the general approach to finding a phenotype in the EHR

PPV is typically viewed as the most
important performance metric for a case-control
study, since it is usually more important
to be sure to have high-quality cases and
controls for a given analysis. For instance, a
review of phenotype algorithm implementations
on PheKB.org demonstrated that
out of 145 posted implementations (as of
December 2018), 79 (54%) reported PPV and
33 (23%) reported recall. Sensitivity
becomes more important when
the variable is being used as a covariate in an
analysis.
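
As a small worked illustration of the metrics discussed above, the following Python sketch tallies a blinded chart review and computes PPV and sensitivity. The function names and the 20-chart example are hypothetical. The sensitivity computed here is relative to the reviewed charts only, which is an assumption that mirrors the simplifications described in the text.

```python
from collections import Counter


def confusion_counts(reviews):
    """Tally chart-review results.

    reviews: iterable of (algorithm_says_case, reviewer_says_case) booleans,
    one pair per reviewed chart.
    """
    counts = Counter(tp=0, fp=0, fn=0, tn=0)
    for algorithm_case, reviewer_case in reviews:
        if algorithm_case:
            counts["tp" if reviewer_case else "fp"] += 1
        else:
            counts["fn" if reviewer_case else "tn"] += 1
    return counts


def ppv(counts):
    """True positives divided by all charts the algorithm flagged as cases."""
    flagged = counts["tp"] + counts["fp"]
    return counts["tp"] / flagged if flagged else float("nan")


def sensitivity(counts):
    """True positives divided by all charts the reviewer judged to be cases."""
    true_cases = counts["tp"] + counts["fn"]
    return counts["tp"] / true_cases if true_cases else float("nan")


# Hypothetical review of 20 charts: PPV = 9/10, sensitivity = 9/12.
reviews = ([(True, True)] * 9 + [(True, False)]
           + [(False, True)] * 3 + [(False, False)] * 7)
counts = confusion_counts(reviews)
print(ppv(counts), sensitivity(counts))
```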

To facilitate EHR-based discovery, huge data
sets are needed. While the growth through the
early 2000s and this decade has often been
focused on single EHR systems, there is an
increasing need to combine data across sites
for discovery. This is important both for
amassing the necessary sample
size and for representing diversity of geography,
demographics, environmental exposures,
and practice habits, which can vary
between institutions and regions. In short,
this effort has been facilitated through use of
common data representations such as the Fast
Healthcare Interoperability Resources (FHIR)
and common data models (CDMs). The most frequently
used common data models have been
the Informatics for Integrating Biology and
the Bedside (i2b2) data model, PCORnet,
and the OMOP data model. These are discussed
in more detail in Chap. 25.

28.4 Omic Discovery Approaches


Over the last three decades, the explosion
of efficient and relatively inexpensive dense
molecular measures has led to the growth of
more data-driven molecular investigations
of traits and diseases. Currently, the most
common of these are genomic
investigations. Some of these are also finding
translation into clinical practice (discussed
in Sect. 28.7). The research observation
of genetic pleiotropy (the condition in which
one gene or genetic variant impacts multiple
phenotypes) combined with denser
phenotypic assessments has led to similar
hypothesis-free tests of association across the
phenome. Specific technologies are discussed
further in Chaps. 9 and 26.


28.4.1 Genome-Wide Association
Studies (GWAS)


A GWAS systematically surveys polymorphisms
across the genome to find variants
associated with a trait or disease. Variants for
a GWAS are typically weighted toward more
common SNPs that can represent a broad
range of genomic variation based on linkage
disequilibrium, that is, the nonrandom clustering
of variation in the genome based on inheritance
patterns. Thus, relatively small numbers
of the >3 billion base pairs in the (haploid)
human genome can represent a large fraction
of inherited variation in the genome. Most
GWAS assay >500,000 mostly common
SNPs. More recent GWAS have incorporated
rare variants, such as functional genomic variants
known to be associated with disease and
pharmacogenomic variants.

The first GWASs (Fig. 28.3a) were conducted
in 2005 and 2006 and discovered
genetic variants associated with age-related
macular degeneration using ~100k SNPs
(Dewan et al., 2006; Klein et al., 2005). The
modern era of the array-based GWAS approach,
with large case-control populations identifying
common variants influencing common
disease, was arguably introduced on a large
scale in 2007 by the Wellcome Trust Case
Control Consortium, which successfully identified
SNPs associated with 7 common diseases
(Wellcome Trust Case Control
Consortium, 2007). As mentioned above, the
first GWAS using EHR information to define
cases and controls was performed in 2010
(Denny, Ritchie, Crawford, et al., 2010; Kullo
et al., 2010). Since then, the number of EHR-based
GWAS and large consortia using EHR
information has risen dramatically (Wei &
Denny, 2015). The current inclusion of routine
healthcare data in very large cohorts (see
Sect. 28.6 below) has made reliance on
healthcare data such as those from EHRs or
administrative claims data now commonplace.

Since 2009, published GWAS have been
curated and made available via the online
GWAS Catalog, begun first by NHGRI and
now hosted by EMBL-EBI
(https://www.ebi.ac.uk/gwas/) (MacArthur et al., 2017). By
2010, more than 500 GWASs had been performed
on a wide variety of traits in common
disease, and by the end of 2018, this had
grown to 3675 publications reporting one or
more GWASs, noting associations between
87,081 SNPs and phenotypes (Green, Guyer,
& National Human Genome Research
Institute, 2011; MacArthur et al., 2017).


28.4.2 Genomic Sequencing 


The rapidly falling cost of genomic sequencing
(to several hundred dollars for a research whole exome
and less than $1000 for a whole genome as of
the end of 2018) is leading to a dramatic growth
in use of genetic sequencing. The primary
added benefit of genomic sequencing to precision
medicine at the current time is a better and
more detailed assessment of rare and very rare
variants through more comprehensive coverage
of the genome. Sequencing approaches
have enabled the discovery of novel variants for
common disease and have been especially
impactful for uncovering variants in rare
disease. Sequencing is now routinely used clinically
to aid cancer care or diagnose rare genetic
diseases. In research, sequencing is rapidly
expanding our ability to discover associations
with rare conditions. The NIH’s Undiagnosed
Diseases Network (UDN), for instance, routinely
employs whole exome sequencing (WES) or
whole genome sequencing (WGS) to diagnose
individuals. As a notable win for sequencing,
the UDN has been able to diagnose 35% of
individuals referred into its network, and 74% of
these diagnoses were made with the addition of genomic
sequencing to comprehensive clinical phenotyping
(Splinter et al., 2018). In addition, the UDN
has defined 31 new syndromes through its
comprehensive clinical and molecular assessments
of undiagnosed patients.


28.4.3 Phenome-Wide Association
Studies (PheWAS)


Growth of EHR-based cohorts provided rich
and diverse phenotype data to complement
biologic data. Whereas GWASs provided a
way to assess genomic associations
in a hypothesis-free manner starting around
2005, a GWAS usually assesses only one phenotype
at a time. However, the growth of
GWAS quickly highlighted the occurrence of
genetic pleiotropy (the condition in which
one gene influences multiple independent
phenotypes). Thus, the rich collection of diverse
phenotype information in EHRs and other
growing cohorts provided the ability to simultaneously
assess phenotype associations in
the same scanning, hypothesis-free manner as
GWAS. The first EHR-based PheWAS aggregated
billing codes into 744 PheWAS “cases”
(Denny, Ritchie, Basford, et al., 2010). Each
case was linked to a control group. After identification
of case and control groups, a PheWAS
(Fig. 28.3b) is essentially a pairwise test of
all phenotypes against an independent variable,
such as a genetic variant or laboratory
value. For a genetic variant, PheWAS is analogous
to genetic association tests performed in
a GWAS, with a typical approach employing
a logistic regression adjusted for demographic
and genetic variables, such as genetic ancestry.
The first PheWAS tested seven known
SNP-disease associations, replicating four
and suggesting a couple of new associations.
Newer approaches to PheWAS have leveraged
increased density of phenotypes from
the EHR, with current methods mapping
ICD9 and ICD10 codes into >1800 phenotype
case groups. A 2013 study showed that this
approach was able to replicate 66% of
adequately powered SNP-phenotype pairs, and
also identified several new associations that
were replicated (Denny et al., 2013). A catalog
of some of the PheWAS associations found
to date is available at http://phewascatalog.org.
A PheWAS of all phenotypes available
in the UK Biobank has also been performed
(http://www.nealelab.is/uk-biobank).

PheWAS can essentially be performed on
any broad collection of phenotypes.
Researchers have used raw unaggregated ICD
codes, other aggregation systems of ICD
codes, or phenotypes collected from observational
cohorts (Hebbring et al., 2013; Pathak,
Kiefer, Bielinski, & Chute, 2012;
Pendergrass et al., 2011). The disadvantage of
using more granular ICD codes is the
increased number of hypotheses being tested,
which reduces the statistical power to detect a
result. Lack of ICD code aggregation can also
allow variability in coding practices to
decrease the sample size for a given phenotype,
because many specific diagnostic
codes are available to represent common conditions
and their complications, such as diabetes
mellitus subtypes (e.g., with specific codes for
controlled or uncontrolled glucose status and
its resulting cardiovascular, renal, or neurological
complications) or gout (e.g., chronic or
acute, with or without tophi, etc.).

PheWAS can quickly highlight potential
pleiotropy of a given genetic variant or other
independent variable. By analyzing for associations
with multiple phenotypes within a single
population, one can test the independence of
the potential pleiotropic findings with subsequent
conditioned analyses. Other advantages
of PheWAS are that they are quick to perform
and easily implemented through existing R
packages (https://github.com/PheWAS/PheWAS)
or iteration through common statistical
packages. A disadvantage of PheWAS
is that its phenotypes can be coarse and can
have both lower sensitivity and PPV than custom
phenotype algorithms as discussed in
Sect. 28.3. Fortunately, these types of bias
typically bias results toward the null. Associations
found via PheWAS can require refinement
and subsequent validation.
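
To make the "pairwise test of all phenotypes against an independent variable" concrete, here is a minimal Python sketch of a PheWAS-style scan. It makes several simplifying assumptions that differ from the typical approach described above: it uses an unadjusted Pearson chi-square test on a 2x2 table instead of a covariate-adjusted logistic regression, it treats every non-case as a control, and it applies a Bonferroni correction. All function and variable names are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0, 1.0  # degenerate table: no test possible
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-square survival function, 1 df
    return stat, p_value


def phewas(carrier, phenotypes, alpha=0.05):
    """Scan every phenotype for association with carrier status.

    carrier: dict mapping person id -> True if the person carries the variant.
    phenotypes: dict mapping phenotype name -> set of person ids who are cases.
    Returns (name, p_value, significant) tuples sorted by p-value.
    """
    threshold = alpha / len(phenotypes)  # Bonferroni-corrected threshold
    results = []
    for name, cases in phenotypes.items():
        a = sum(1 for p, carries in carrier.items() if carries and p in cases)
        b = sum(1 for p, carries in carrier.items() if carries and p not in cases)
        c = sum(1 for p, carries in carrier.items() if not carries and p in cases)
        d = sum(1 for p, carries in carrier.items() if not carries and p not in cases)
        _, p_value = chi_square_2x2(a, b, c, d)
        results.append((name, p_value, p_value < threshold))
    return sorted(results, key=lambda r: r[1])
```

Because the number of tests equals the number of phenotypes, the corrected threshold tightens as phenotypes become more granular, which is the power trade-off noted for unaggregated ICD codes.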


28.4.4 Other Omic Investigations 


In addition to genomics, the growth of a number
of other omic approaches is providing
greater insight into an individual’s environment,
endophenotypes, and molecular measures.
Some of these include the microbiome,
proteome, metabolome, and other bioassays.
Additional dense phenotypic and environmental
assessments include detailed measures of the
environment and personal sensor-based technologies,
such as consumer activity monitors.
Publicly available datasets providing detailed
measures of pollution, the built environment,
weather patterns, availability of quality food
or greenspace, and sociodemographic factors
are available for linkage via geolocation, for example
via smartphones and other devices that continuously
track geolocation. These devices can
also measure activity and heart rate to provide
greater insight into a person’s habits and physiological
factors. Today, the clinical impact
of many of these measures is not yet known.
However, their growing ubiquity through both
research and commercial interests is enabling
deeper investigation into their clinical impact.
They are also being included in large research
cohorts (see Sect. 28.6).


28.5 Approaches to Using Dense 
Genomic and Phenomic Data 
for Discovery 


28.5.1 Combining Genotypes
and Phenotypes as Risk
Scores


Most genetic variants discovered via GWAS
have had relatively modest effect sizes for their
phenotype of interest. However, the size of
modern GWAS, now involving hundreds of
thousands of individuals for more common
traits, has allowed identification of many
independent genetic loci, sometimes reaching
into the hundreds of distinct loci (Locke et al.,
2015; Okada et al., 2014; Wood et al., 2014).
Collectively, these genetic variants can explain
a much larger percentage of the variance in
disease risk than the individual risk variants,
even when the effect sizes of many of the individual
variants may be rather small (e.g., having
odds ratios of ~1.01). As a tool, researchers
have aggregated genetic risk variants into a
calculated score (called a “genetic risk score”,
GRS, or “polygenic risk score”, PRS), typically
as a sum of the presence of each variant
multiplied by a weight, often taken from a
regression analysis. These risk scores need to
account for linkage disequilibrium to find independent
loci and may also produce a weighted
model using penalized regression. A simple
approach can be given as:


$$\mathrm{GRS} = \sum_{i=1}^{k} w_i N_i$$

where $w_i$ is the weight for variant $i$ of the $k$ variants (e.g., the
log odds ratio from a logistic regression) and
$N_i$ is the number of risk alleles for that variant
(typically 0, 1, or 2). The clinical advantage of
a GRS is that it provides a way to evaluate the
aggregate risk of an individual having a given
disease that takes into account many typically
small risk factors.
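
The weighted sum above can be sketched directly in Python. The variant identifiers and odds ratios below are made up for illustration, and treating a missing genotype as zero risk alleles is an assumption of this sketch rather than a recommendation.

```python
import math


def genetic_risk_score(weights, risk_allele_counts):
    """Sum of weight * risk-allele count over all scored variants.

    weights: dict mapping variant id -> weight (e.g., log odds ratio).
    risk_allele_counts: dict mapping variant id -> 0, 1, or 2.
    Variants missing from risk_allele_counts are treated as 0 (an assumption).
    """
    score = 0.0
    for variant, weight in weights.items():
        n = risk_allele_counts.get(variant, 0)
        if n not in (0, 1, 2):
            raise ValueError(f"unexpected allele count {n} for {variant}")
        score += weight * n
    return score


# Hypothetical odds ratios for three independent variants.
odds_ratios = {"variant_a": 1.20, "variant_b": 1.05, "variant_c": 1.01}
weights = {v: math.log(odds) for v, odds in odds_ratios.items()}

# One individual's risk-allele counts.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
score = genetic_risk_score(weights, person)
print(round(score, 4))
```

Here the score equals 2 ln(1.20) + ln(1.05), so two copies of the stronger variant dominate the total even though each individual effect is small.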

For instance, consider breast cancer
genetic testing. It has long been recognized
that variants in BRCA1 and BRCA2 confer
significantly increased risk of breast cancer to
carriers of these mutations. While pathogenic
variants in these genes do confer a large risk
of breast cancer (lifetime risk of 45-65%), the
vast majority of breast cancer is not related to
these variants, since they are present in <1%
of the general population (Torkamani,
Wineinger, & Topol, 2018). However, common
SNPs from a large breast cancer GWAS
published in 2017 represent about 41% of the
familial risk of breast cancer (Michailidou
et al., 2017). Studies of cardiovascular diseases
have found similar results and potential
clinical utility for PRS. A study of 5
prospectively followed cardiovascular cohorts
with genetic testing found that polygenic risk
scores and lifestyle factors were independently
associated with incident cardiovascular events
(Khera et al., 2016). Moreover, the study
identified that patients with high genetic risk
of cardiovascular disease but healthy lifestyles
were at similar risk to those with unhealthy
lifestyles but low genetic risk. Importantly,
healthy lifestyles decreased cardiovascular
risk at any genetic risk threshold, suggesting
the importance of potential preventative lifestyle
modifications and therapies in those
individuals at high genetic risk.

A more recently introduced approach is to
apply a similar process to phenotypes in a
phenotype risk score (PheRS) (Bastarache et al.,
2018). In the initial demonstration of PheRS,
ICD codes were mapped to phecodes and
summed, with each phecode weighted by the log of the
inverse of its frequency in the EHR:


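$$\mathrm{PheRS}_i = \sum_{p=1}^{k} w_p \, x_{i,p}$$

where $x_{i,p} = 1$ if individual $i$ has phenotype $p$, or 0
otherwise, and

$$w_p = \log\frac{N}{n_p}$$

where $N$ is the total number of individuals in the EHR
population and $n_p$ is the number of individuals with
phenotype $p$. By aggregating phenotypes in a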
similar way as genotypes, a combined score 
can increase the sensitivity to detect the phe- 
notypic impact of a genetic variant. For 
example, the PheRS for cystic fibrosis includes 
component phenotypes such as bronchiecta- 
sis, pneumonia, infertility, and asthma. The 
disease code itself (“cystic fibrosis”) is not 
part of the PheRS. Based on EHR weighting, 
bronchiectasis has a much higher weight than 
asthma since asthma is much more common. 
This approach was used in an initial demon- 
stration exercise looking at ~1200 Mendelian 
diseases that could be tested in an EHR popu- 
lation. In this study, PheRS was able to iden- 
tify diagnosed genetic diseases in the EHR 
using the component phenotypes of disease 
and was also able to be used to identify novel 
pathogenic variants for undiagnosed condi- 
tions. 
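
The PheRS weighting can be sketched in Python as follows. This is an illustration under stated assumptions: each phenotype is weighted by the log of the inverse of its prevalence, the natural logarithm is used (the base is not specified above), and the phenotype counts and population size are invented for the example.

```python
import math


def phecode_weights(phecode_counts, n_individuals):
    """Weight each phenotype by log(N / n_p), the log of its inverse prevalence."""
    return {
        phecode: math.log(n_individuals / n_p)
        for phecode, n_p in phecode_counts.items()
        if n_p > 0
    }


def phenotype_risk_score(individual_phecodes, component_phecodes, weights):
    """Sum the weights of the component phenotypes this individual has."""
    return sum(
        weights[p]
        for p in component_phecodes
        if p in individual_phecodes and p in weights
    )


# Hypothetical EHR population of 100,000 with made-up phenotype counts.
counts = {"bronchiectasis": 100, "pneumonia": 5000, "infertility": 1000, "asthma": 10000}
weights = phecode_weights(counts, n_individuals=100_000)

# Component phenotypes for a disease of interest (the disease code itself is excluded).
components = ["bronchiectasis", "pneumonia", "infertility", "asthma"]
patient = {"bronchiectasis", "asthma", "hypertension"}
score = phenotype_risk_score(patient, components, weights)
print(round(score, 3))
```

With these counts the rarer phenotype (bronchiectasis) receives a larger weight than the more common one (asthma), matching the behavior described in the text.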
28.5.2 Mendelian Randomization


Mendelian randomization (MR) is a technique
used to provide evidence for the causality of a
biomarker on a disease state in conditions in
which randomized controlled trials are difficult
or too expensive to pursue. For example,
low-density lipoprotein (LDL) and high-density
lipoprotein (HDL) levels have long
been associated with myocardial infarctions in
observational cohorts, but it is unclear
whether they are markers or causal: perhaps
these levels are a marker of diet, activity level,
or other unknown factors. It is essentially
impossible (and probably unethical) to perform
a randomized controlled trial that alters
someone’s LDL or HDL levels in isolation.
However, a number of genetic variants have
been found that alter LDL and HDL levels.
Since alleles are randomly distributed to ova
or sperm during meiosis, studying the impact
of biomarker-influencing alleles provides a
naturally occurring randomization of the risk
factor. Genetic variants are generally not
associated with behavioral, social, and some
physiological factors, which reduces confounding.
Thus, by studying the impact on the clinical
outcome of the variants associated with the
biomarker, one can infer whether the biomarker
is causal for the outcome. MR
has proven a powerful tool in recent years
(Fig. 28.6). MR studies have demonstrated
clear associations between LDL and triglyceride
levels and cardiovascular disease while
casting doubt on the role of HDL in protecting
against cardiovascular disease (Holmes
et al., 2015; Voight et al., 2012). The latter is
particularly interesting as cholesteryl ester
transfer protein (CETP) inhibitors, medications
targeted to raise HDL, have so far not
been successful at reducing mortality
(Mohammadpour & Akhlaghi, 2013).
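
One simple way to turn this reasoning into a numeric estimate, offered here as a hedged illustration and not as the method used in the studies cited above, is the ratio (Wald) estimator for a single variant. The symbols are introduced for this illustration only: $\hat{\beta}_{GX}$ is the estimated effect of the variant on the biomarker and $\hat{\beta}_{GY}$ is its estimated effect on the outcome.

$$\hat{\beta}_{XY} = \frac{\hat{\beta}_{GY}}{\hat{\beta}_{GX}}$$

Under the assumptions that the variant influences the outcome only through the biomarker and is independent of confounders, $\hat{\beta}_{XY}$ estimates the causal effect of a one-unit change in the biomarker on the outcome.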

By providing an approach to assess causality,
MR can also provide an approach to
investigate potential drug effects. An MR
approach demonstrated that the lipid-lowering
agent ezetimibe would reduce cardiovascular
disease (by studying the clinical impact of
genetic variants mimicking its effect) before
a randomized controlled trial demonstrated
this effect (Cannon et al., 2015; Myocardial
Infarction Genetics Consortium Investigators
et al., 2014). Similarly, MR has been used
to show that diabetes is a potential concern
for PCSK9 inhibitors and has been combined with
PheWAS to highlight potential unanticipated
side effects (Jerome et al., 2018; Schmidt et al.,
2017).

Fig. 28.6 Mendelian Randomization (MR) vs.
Randomized Controlled Trials (RCT). MI
myocardial infarction, LDL low density lipoprotein
levels. *Allele could represent a single SNP or
group of SNPs (e.g., combined via a genetic
risk score)
[Figure: parallel flow diagrams. RCT: Randomized into groups (Control, Treatment). MR: Randomized by allele* (Variant absent, Variant present). Each arm leads to Biomarker “worse” (e.g., higher LDL) with Outcome increased (more MIs), or Biomarker “better” (e.g., lower LDL) with Outcome decreased (fewer MIs).]


28.5.3 Using Dense Data-Driven 
Measures to “Redefine” 
Disease 


There is increasing enthusiasm that precision
medicine will lead to new ways of defining disease
and selecting treatments that will identify
more rational therapeutic choices, improve
our understanding of prognosis, and result in
more effective disease screening. One example
is cystic fibrosis pharmacotherapy, for which
medications have been developed to target
defects corresponding with specific genetic
variants (O’Reilly & Elphick, 2013). These
targeted therapies have a dramatic influence on
the disease course. There is hope that similar
approaches could be found for many common
diseases, aiding in drug selection and risk
stratification for conditions such as depression,
diabetes, hypertension, and heart disease.
Recent studies
using clinical and molecular information have
suggested subtypes of type 2 diabetes, heart
failure, and autism (Ahmad et al., 2014;
Doshi-Velez, Ge, & Kohane, 2014; Li et al.,
2015); however, the clinical impact of such
subtypes is unclear. Two examples of where
targeted treatment for disease is currently
being utilized in practice are pharmacogenomics
and oncology, which are discussed in
more detail in Sect. 28.7.


28.5.4 Use of Machine Learning 
and Artificial Intelligence 
to Advance Precision 
Medicine 


The focus of this section is largely on the
application of these methods to advance precision
medicine. Other discussions on machine
learning and artificial intelligence appear in
Chap. 9.

Machine learning falls into two major
classes of approaches, supervised and unsupervised,
each of which can be implemented with many different
algorithms. Supervised machine learning
approaches use a gold-standard labeled set as
input to learn classifiers designed to optimally
mimic the training set. Unsupervised machine
learning approaches learn patterns from the data without
labeled training sets. Machine learning has the
potential to augment any classification task,
and has long been used with clinical data.
Machine learning has been used for many
tasks in EHRs (such as in natural language
processing (Jiang et al., 2011; Y. Wu et al.,
2017)) and in bioinformatics, such as aiding in
interpretation of genetic variants (Kircher
et al., 2014). Some of these use cases are referenced
above, such as the learning of phenotype
classifiers using EHR data and labeled cases or
controls (Carroll, Eyler, & Denny, 2011; Liao
et al., 2010; Lin et al., 2015; Peissig et al., 2014).

Recent areas of exploration in machine
learning have seen a rapid rise in deep learning
approaches. These use massive datasets
and multi-layered neural networks to learn
patterns in data and have, for some tasks, proven superior
to other machine learning techniques. They
typically require very large data sets that have
previously been unavailable in healthcare or
biomedical research. However, the recent rapid
growth of available EHR, genomic, and imaging
data sources is enabling a new potential for
machine learning to be applied to these data
sets as well. Recent examples include training
algorithms to identify malignant skin lesions
and to detect diabetic retinopathy from retinal scans
(Esteva et al., 2017; Gulshan et al., 2016).
These algorithms can at times achieve performance
that equals that of trained physicians.
A challenge of these approaches is that
they require huge data sets: the Gulshan et al.
algorithm used nearly 130,000 labeled
retinal digital images to train its diabetic retinopathy
algorithm (Gulshan et al., 2016). The
growing availability of large-scale public biomedical
data through cohorts such as those
mentioned in Sect. 28.6 represents an important
opportunity to accelerate such research.
In addition, a number of for-profit companies,
such as Alphabet, IBM, and many startups,
have formed partnerships with diverse clinical
entities from individual healthcare systems
to the United Kingdom’s National Health
Service (Saria, Butte, & Sheikh, 2018).


28.6 Large Cohorts to Advance 
Precision Medicine Discovery 


Clinical care since the 1960s has been dramatically
influenced by observational cohort
studies. Studies such as the Framingham
Heart Study, the Nurses’ Health Study, and the
National Health and Nutrition Examination
Survey (NHANES) have produced dramatic
insights that have fundamentally changed our
understanding of modifiable risk factors for
many diseases. For instance, the Framingham
Heart Study taught us that blood pressure,
cholesterol, smoking, and activity contribute
to cardiovascular disease risk. Its initial discoveries
were derived from detailed longitudinal
assessment of just under 5000 individuals.
Most of these epidemiological cohorts have
largely focused on answering exposure- or
disease-focused questions. Two developments
beginning in the early 2000s have brought
new data resources to advance discovery in a
more disease-neutral fashion. One has been
the growth of large national-scale cohorts
containing diverse phenotypic information
connected with biosamples, and the other has
been the growing incorporation of routinely
collected healthcare data linked to biological
specimens. Example cohorts include the
UK Biobank, the Million Veteran Program,
the All of Us Research Program, and the
China Kadoorie Biobank (Table 28.2).
Collectively, these cohorts will enroll millions
of individuals across the world for longitudinal
assessment of healthcare outcomes analyzed
against molecular and environmental
exposures. Each of these cohorts includes
both participant-generated survey data and
healthcare-derived data linked with biospecimens.
Several of these cohorts also include the
ability to recontact individuals. Collectively,
these multiple complementary avenues of
phenotype assessment augment passive
phenotype collection (e.g., with EHR and
claims-type data) with participant-provided
information and the potential for reassessment
with deeper phenotyping along topics of
interest. Within All of Us, the participant-provided
survey information and the in-person
research protocol physical measures are both
being incorporated into the OMOP Common
Data Model to simplify comparison of these
data modalities. Digital engagement through
email or websites and the collection of healthcare
information enable cost-efficient follow-up
for healthcare outcomes over long periods
of time. Some of these larger resources are
also pioneering newer models of researcher
access that allow broad researcher communities
to work within shared data environments. An important
aspect of these cohorts is the diversity
of their participants, which is discussed in the
next section.
Table 28.2 Selected biobanks and cohorts enabling precision medicine

Biobank                      Region   Start year   Size                                Website
eMERGE                       U.S.     2007         143,896                             gwas.net
BioVU                        U.S.     2007         ~250,000                            victr.vanderbilt.edu
UK Biobank                   U.K.     2006         512,000                             ukbiobank.ac.uk
Million Veteran Program      U.S.     2011         >600,000 (Goal: 1 million)          www.research.va.gov/MVP/default.cfm
Kaiser Permanente Biobank    U.S.     2009         240,000                             www.rpgeh.kaiser.org
China Kadoorie Biobank       China    2004         510,000                             ckbiobank.org
All of Us Research Program   U.S.     2017         >80,000 (Goal: 1 million or more)   joinallofus.org, researchallofus.org
Taiwan Biobank               Taiwan   2005         86,695 (Goal: 200,000)              www.twbiobank.org.tw
Geisinger MyCode             U.S.     2007         >190,000                            www.geisinger.org/mycode

Limited to cohorts exceeding 100,000 individuals with biosamples. Sizes reported are as of 11/2018
eMERGE The Electronic Medical Records and Genomics Network


28.6.1 Need for Diversity, and Role
of Precision Medicine
in Health Disparities


Health disparities are abundant in health care.
The same concerns apply to precision
medicine, for which variability in health
insurance coverage, access to care, and financial
situations may alter the availability and accessibility
of precision therapies (Bentley,
Callier, & Rotimi, 2017). However, it is also
true that precision medicine has the potential
to identify and help alleviate some health disparities.
Since genetic variants vary by ancestry,
genetic testing has the opportunity to
identify those most at risk for adverse events
based not just on ancestry but on actual carriage
of variants. Moreover, drugs traditionally
have not been tested in all diverse
populations, and risk factors may not always
be identified in each population. For instance,
individuals of Asian ancestry are at much
greater risk for severe skin reactions such as
Stevens-Johnson syndrome from antiepileptics
such as carbamazepine (Phillips et al.,
2018). Similarly, it has been noted that carriage
of CYP2C19 loss-of-function alleles
is much more common in individuals of
Pacific Island descent (Kaneko et al., 1999).
Since diverse ancestries are often not tested in
large numbers in clinical trials, the increased
risks in diverse populations are not necessarily
noticed. However, genomic testing would
identify those at greater risk of adverse events,
thus identifying opportunities to optimize
care. The specific association of clopidogrel
with reduced efficacy in individuals of Pacific
Island descent was the subject of a lawsuit
(A. H. Wu, White, Oh, & Burchard, 2015).
Unfortunately, the vast majority of indi- 
viduals who have been genotyped or sequenced 
to date are of European ancestry. For instance, 
a 2016 study noted that 81% of all individuals
who had been included in GWAS at that
time were of European ancestry, and only
~4% represented African, Hispanic, or native
ancestries (Popejoy & Fullerton, 2016). Those
latter populations represent about one-third 
of the current US population. A lack of diver- 
sity in genetic testing results in a lack of 
knowledge of the genetic architecture for 
diverse populations. For instance, the variants
that influence warfarin sensitivity vary by ancestry such that
the variants needed to accurately guide pre- 
scribing for European and African ancestry 
are different (Perera et al., 2013; Ramirez 
et al., 2012). Moreover, it is known that indi- 
viduals of African ancestry typically require 
higher doses of warfarin. However, most of 
the warfarin pharmacovariants that have been 
identified actually increase sensitivity to war- 
farin rather than reducing it. 

The lack of diversity in genotyped populations
not only limits our ability to adequately
treat individuals with diverse ancestries but
also hinders discovery. For instance, rare
PCSK9 loss-of-function variants, which pointed
to PCSK9 as a drug target for cholesterol and
cardiovascular disease, were discovered in
African Americans (Cohen, Boerwinkle,
Mosley, & Hobbs, 2006). These loss-of-function
variants led to the production of monoclonal
antibodies against PCSK9 that dramatically
reduce cholesterol levels and can treat
individuals of essentially any ancestry (Sabatine
et al., 2017).


28.7 Implementation of Precision 
Medicine in Clinical Practice 


Currently, most efforts in precision medicine
implementation focus on genomics.
These efforts come in three main flavors: using
germline genomic variation to better tailor drug
prescribing, diagnosing genetic disease, and
identifying somatic variants to guide cancer
therapy. A number of networks have been
funded by the NIH to support the integration
of genomic medicine into clinical care.
They include the Implementing Genomics
Into Practice (IGNITE) network, Electronic
Medical Records and Genomics (eMERGE)
Network, Clinical Sequencing Evidence-Generating
Research (CSER) Network,
the Pharmacogenomics Research Network
(PGRN), and the Newborn Sequencing
In Genomic medicine and public HealTh
(NSIGHT) Network (Table 28.3).


Cancer genomic testing Perhaps the most 
widespread use of precision medicine currently 
is for somatic variation to target cancer thera- 
pies. Cancer therapies have long recognized the 
contribution of genetic variation to prognosis, 
starting with clinical karyotyping. One of the 
earliest applications of truly targeted therapy 
started with identification of the Philadelphia 
chromosome/translocation, which generates a 
fusion gene product BCR-ABL1. BCR-ABL1 
results in the tyrosine kinase Abl being consti- 
tutively activated and is a marker for acute lym- 
phoblastic leukemia and chronic myeloid 
leukemia. Its particular relevance to targeted
therapy was noted in the 1990s when imatinib
was identified through high-throughput
screening assays of tyrosine kinase inhibitors.
Randomized controlled trials demonstrated a
survival benefit in patients with chronic
myelogenous leukemia (CML), thus leading to
targeted therapies for individuals positive for
this translocation.

The use of genetic changes to guide cancer
therapy is proliferating rapidly. The growth
of next-generation sequencing of cancer
patients has resulted in the discovery of a number
of mutations that have been successfully
targeted by therapeutics. Examples include
variants in BRAF for melanoma; EGFR, ALK,
ROS1, and others for lung cancer; and many
others. Hallmarks of genetically-focused
therapies are applicability to smaller populations
and a potential for fewer side effects
compared to traditional chemotherapy. However,
they also tend to be more expensive (Tannock
& Hickman, 2016).

Given the focused care and workup for
cancer patients, typical practice for
individuals with cancers that have available
genetically-targeted therapies is to clinically
sequence tumor samples. The resulting reports
typically come in the form of PDFs; however, this
is not a major impediment to accurate clinical
care, since the workup is focused and guided by
professionals very knowledgeable in the field.
Table 28.3 Example projects exploring genetic medicine implementation

Program                             Region    Website                          Comments
eMERGE                              U.S.      gwas.net                         Pharmacogenomics (PGx) and actionable Mendelian variants (AMV) for ~34 k
IGNITE                              U.S.      ignite-genomics.org/             Research demonstration projects exploring family medical history, PGx, APOL1 variants
Alabama Genomic Health Initiative   U.S.      www.uabmedicine.org/aghi         Community-based with GWAS-based AMV
Undiagnosed Disease Network         U.S.      undiagnosed.hms.harvard.edu/     WGS, phenotyping for undiagnosed patients
Genomics England                    U.K.      www.genomicsengland.co.uk/       WGS for rare disease and cancer for 100 k
Thailand                            SE Asia   -                                Proactive genotyping for SJS/TEN risk alleles in carbamazepine-exposed patients
Sanford                             U.S.      imagenetics.sanfordhealth.org/   PGx and AMV among primary care population
All of Us Research Program          U.S.      joinallofus.org                  Stated goal of PGx and AMV for >1 million
Geisinger MyCode                    U.S.      www.geisinger.org/mycode         AMV; about 190 k enrolled

eMERGE: The Electronic Medical Records and Genomics Network; IGNITE: Implementing Genomics into Practice


Germline pharmacogenomics Medications 
have variable efficacy and potentials for adverse 
effects based on three major modes of action: 
altered metabolism, on-target side effects, or 
off-target side effects, each of which can result 
from a drug-genome interaction (See > Chap. 
26, > Sect. 26.5 for more details.). A common 
scenario for altered metabolism resulting in 
lack of efficacy would be if a drug is a prodrug, 
meaning that the drug that is administered 
requires activation in vivo (typically by enzymes) 
into its active form. For example, clopidogrel is 
a prodrug that requires activation by
CYP2C19, via the intermediate 2-oxo-clopidogrel, to its active metabolite
(Scott et al., 2013). Thus, people with poor 
metabolizing variants of CYP2C19 are more 
likely to experience a lack of clopidogrel effi- 
cacy and be at higher risk of myocardial infarc- 
tions, need for revascularization, stroke, and 
death (Delaney et al., 2012). Similarly, 
decreased metabolism of thiopurines (e.g.,
azathioprine) due to TPMT polymorphisms can
result in excessive bone marrow suppression
(Relling et al., 2019). Second, drugs can pro- 
duce adverse effects through off-target effects, 
such as an allergic reaction via an interaction 
with the immune system. Examples here 
include severe skin reactions from drugs such 
as carbamazepine and abacavir, which can be 
predicted by certain human leukocyte antigen 
variants (White et al., 2018). Third, drugs can 
have toxicity from on-target effects, such as 
increased sensitivity to warfarin resulting in an 
increased risk of bleeding with higher dose. 
Germline pharmacogenomics holds the 
promise of tailoring medications to an indi- 
vidual’s makeup to enable the “right drug for 
the right person” based on understanding of 
these effects. Unlike cancer genetic testing,
pharmacogenetics requires a provider to
potentially alter drug prescribing based on
an understanding of the patient’s genotype. For
pharmacogenetics to work, the system must
be able to intercept a drug order and provide
guidance. Drug-genome interactions could be
intercepted either by a computerized decision
support system (see > Chap. 24) or via a
human mechanism, e.g., by pharmacists. For
decision support to work, the EHR requires a
structured representation of the patient’s
genotype and a clinical decision support system
that can generate actionable guidance based on
both a drug order and the genotype.
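
The two requirements just described, a structured genotype result and a rule that fires on both a drug order and that result, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the record layout, the DRUG_GENE_RULES table, the advisory wording, and the function name check_order are all hypothetical, and the single drug-gene pair shown is the clopidogrel and CYP2C19 example from this section rather than the content of any production system.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical rule table: (drug, gene) -> {phenotype: advisory text}.
# Only the clopidogrel/CYP2C19 pair discussed in the text is included,
# and the advisory wording is illustrative.
DRUG_GENE_RULES: Dict[tuple, Dict[str, str]] = {
    ("clopidogrel", "CYP2C19"): {
        "poor metabolizer": "reduced clopidogrel efficacy is likely; consider an alternative",
        "intermediate metabolizer": "substitution recommended due to increased cardiovascular risks",
    },
}


@dataclass
class Patient:
    patient_id: str
    # Structured genotype results stored as gene -> interpreted phenotype.
    phenotypes: Dict[str, str] = field(default_factory=dict)


def check_order(patient: Patient, drug: str) -> Optional[str]:
    """Return advisory text if the ordered drug has a rule matching a stored phenotype."""
    drug = drug.lower()
    for (rule_drug, gene), advice_by_phenotype in DRUG_GENE_RULES.items():
        if rule_drug != drug:
            continue
        phenotype = patient.phenotypes.get(gene)
        if phenotype is None:
            continue  # no structured result on file, so nothing to act on
        advice = advice_by_phenotype.get(phenotype.lower())
        if advice is not None:
            return f"{gene} {phenotype}: {advice}"
    return None  # order proceeds without an alert
```

In this sketch an order for clopidogrel placed for a patient whose stored CYP2C19 phenotype is "poor metabolizer" returns advisory text, while the same order for a patient with no stored result returns None. The second case illustrates why a scanned PDF report is not enough for this use: without a structured result, the rule has nothing to match.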

Pharmacogenomic testing can be ordered 
in either a preemptive or reactive fashion. In
the preemptive approach, an individual undergoes
pharmacogenetic testing before any drug is
prescribed. Then, when a medication whose effect
may be altered by the patient’s genetic makeup
is ordered, the system can intercept the order
and recommend a genetically-tailored medication
at the time of the prescribing event, as in
the decision support alert in Fig. 28.7.
This sort of genetic testing has been deployed 
at Vanderbilt, University of Chicago, and 
Indiana’s INGENIOUS trial (Eadon et al., 
2016; O’Donnell et al., 2012; Pulley et al., 
2012). Further investigation of this approach 
is underway within the IGNITE Network. 

Reactive pharmacogenetic testing is the 
more common approach to genetic testing 
and involves testing an individual when there 
is a specific indication for that test. Research 
has shown that having genetic testing available
at the time of the prescribing event results
in a higher frequency of genetically-tailored
prescriptions (Peterson et al., 2016). Many 
genetic tests can take several days or more to 
receive results back for actionability, which 
may require a provider to recontact a patient 
to make a therapy change. 


Genomics for disease diagnosis and risk assessment
Clinical genetic testing has often
occurred in the context of specialized
clinic visits with geneticists or genetic
counselors, most commonly for prenatal screening or
diagnosis of suspected genetic disease. These
types of interactions typically require very little
direct informatics support, and results can be
delivered effectively via send-out paper lab
results. However, newer approaches that
broaden the understanding of individual disease
risks based on genetics require greater support
from informatics systems. While clinical
use of genetic testing for common disease risk
(such as through PRS as discussed above in
> Sect. 28.5.1) is uncommon in clinical care
now, the explosion of genetic knowledge points
to a day in which genetic risk could be used
clinically to help people understand their
degree of genetic risk for a disease.
Information about genetic risk for disease
is already available to individuals
through direct-to-consumer genetic testing,
discussed in > Sect. 28.9.


Intermediate Metabolizer - clopidogrel (Plavix) - Rare Risk Allele
Substitution recommended due to increased cardiovascular risks

If not otherwise contraindicated:
( ) Prescribe prasugrel (Effient) 10 mg daily
    Prasugrel should not be given to patients:
    - history of stroke or transient ischemic attack
    - >= 75 years of age [Current patient age: 51]
    - with body weight < 60 kg [Current patient weight: 59.0 kg as of 10/12/2012]
( ) Prescribe ticagrelor (Brilinta) 90 mg twice daily
    Ticagrelor should not be given to patients:
    - history of severe hepatic impairment
    - intracranial bleed
( ) Continue with clopidogrel (Plavix) prescription
    Primary override reason:
    ( ) Contraindicated for prasugrel or ticagrelor
    ( ) Potential side effects
    ( ) Provider/Patient opts for clopidogrel
    ( ) Cost

Evidence Link

Fig. 28.7 Screenshot of clinical decision support advisor for clopidogrel pharmacogenetic advice
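
The exclusion criteria listed in the alert of Fig. 28.7 lend themselves to a simple rule check. The sketch below is a hypothetical illustration and not the logic of the actual advisor: the PatientContext fields and the function name eligible_alternatives are assumptions, and only the criteria visible in the figure are encoded.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PatientContext:
    # Field names are hypothetical; values would come from the EHR.
    age_years: int
    weight_kg: float
    history_stroke_or_tia: bool = False
    severe_hepatic_impairment: bool = False
    intracranial_bleed: bool = False


def eligible_alternatives(p: PatientContext) -> List[str]:
    """List the alternatives from Fig. 28.7 that are not excluded for this patient."""
    options = []
    # Prasugrel exclusions shown in the figure: stroke/TIA history,
    # age of 75 years or more, body weight under 60 kg.
    prasugrel_excluded = (
        p.history_stroke_or_tia or p.age_years >= 75 or p.weight_kg < 60
    )
    if not prasugrel_excluded:
        options.append("prasugrel 10 mg daily")
    # Ticagrelor exclusions shown in the figure: severe hepatic impairment,
    # intracranial bleed.
    ticagrelor_excluded = p.severe_hepatic_impairment or p.intracranial_bleed
    if not ticagrelor_excluded:
        options.append("ticagrelor 90 mg twice daily")
    return options
```

For the patient shown in the figure (51 years old, 59.0 kg, no other flags set), this sketch would exclude prasugrel because of body weight and return only the ticagrelor option. Surfacing the patient's current age and weight next to each criterion, as the alert does, spares the prescriber from looking those values up.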
28.8 Sequencing Early in Life


One crucial complication in the search for 
genomic explanations for any given disease or 
phenotype is the impact of environmental inter- 
actions. Over time, every person on earth is 
exposed to environmental factors that may dif- 
fer based not only on a factory that disposes of 
industrial waste near a drinking water supply or 
the traffic on the street they grew up on, but also 
by the foods they eat, the climates in which they 
live, and the infections they have harbored. 
Those external variables, hard to control for and 
sometimes even to know, can have major effects 
on the downstream products and activities of 
one’s genomic fingerprint. Early in life, however, 
those effects are less pronounced. Of course, the 
impact of the in-utero environment on the well- 
being of the developing fetus is well established. 
But a genetic defect is much more likely to be 
the cause in a newborn with an unidentified dis- 
ease than in an adult patient who has undergone 
a lifetime of environmental insults. In this vein, 
a number of initiatives have been established 
across the US to offer clinical sequencing ser- 
vices for young patients, including programs at 
Children’s Hospital of Philadelphia, Duke 
University, Partners Healthcare, the Baylor 
College of Medicine, and the Medical College 
of Wisconsin. More controversial on paper, and 
not yet being performed in practice, is prenatal 
genome sequencing. Ethicists are exploring the 
potential implications of this possible direction 
(Donley, Hull, & Berkman, 2012). 

Addressing the time and resources needed 
to perform genome interpretation, one striking 
success story was achieved at Children’s Mercy 
Hospitals and Clinics in Kansas City, MO 
(Saunders et al., 2012). Investigators used an 
Illumina HiSeq 2500 machine and an internally- 
developed automated analysis pipeline to per- 
form whole-genome sequencing and make a 
differential diagnosis for genetic disorders in 
under 50 h. The diagnoses in question are
among the ~3500 known monogenic disorders
that have been characterized. In this case,
WGS is not being used to identify novel,
previously unknown mutations. Rather, it is
shortening the path to diagnosis to just over 2 days
instead of the more traditional 4-6 weeks required
when a battery of tests is performed sequentially.


We offer one final example in which genome 
sequencing was used as a last resort in a medical 
odyssey to identify the cause of a mysterious 
bowel condition in a 4-year-old boy named 
Nicholas Volker (Worthey et al., 2011). Having 
ruled out every diagnosis they could conceive 
of, doctors resorted to exome sequencing, lead- 
ing to the identification of 16,124 mutations, of 
which 1527 were novel. A causal mutation was 
discovered in the gene XIAP. This gene was 
already known to play a role in X-linked
lymphoproliferative syndrome (XLP), and
retrospective review showed that colitis had been
observed in 2 XLP patients in the past. Based 
on these findings, a cord blood transplant was 
performed, and 2 years later, Nic’s intestinal 
issues had not returned. News coverage of this 
story by the Milwaukee Journal Sentinel was 
awarded a Pulitzer Prize for explanatory report- 
ing (Journal Sentinel wins Pulitzer Prize for 
“One in a Billion” DNA series, n.d.). 


28.9 Direct to Consumer Genetics 


In the wake of the Human Genome Project and
the commoditization of genotypic data, a 
number of companies were founded to pro- 
vide consumers with their own genetic infor- 
mation directly. These direct-to-consumer 
(DTC) genomic companies began making the 
services broadly available when deCODE 
genetics launched the deCODEme service in 
November 2007, followed a few days later by 
23andMe. Navigenics was launched the fol- 
lowing spring. These companies offered con- 
sumers the opportunity to provide a saliva 
specimen or buccal swab through the mail, 
and in exchange to receive genotypic informa- 
tion for a range of known genetic markers. 
Different companies emphasized different 
aspects of genetic testing. Navigenics focused 
on known disease risk markers, while 
23andMe was much broader, including dis- 
ease markers but also ancestry information 
and “recreational” genetic information, for 
example earwax type and the ability to smell a 
distinct odor in urine after eating asparagus. 
Navigenics offered free genetic counseling as 
part of their service, while 23andMe and 
deCODEme provided referrals to genetic
counselors. A study of concordance between
these three services found >99.6% agreement 
among them, but in some cases the predicted 
relative risks differed in magnitude or even 
direction (Imai, Kricka, & Fortina, 2011). 
This disagreement is likely due to differences 
in the specific SNPs and the reference popula- 
tion used to calculate risk. 
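
A toy calculation shows how identical genotype calls can still yield risk estimates that differ in magnitude or direction. The sketch below assumes a simple multiplicative model in which a person's relative risk is the product of per-allele odds ratios, each divided by its expected value in the reference population under Hardy-Weinberg proportions. The SNP identifiers, odds ratios, and allele frequencies are invented for illustration and are not taken from any of the three services.

```python
from typing import Dict, Tuple

# Hypothetical panels: SNP id -> (per-allele odds ratio,
# risk-allele frequency in the panel's reference population).
PANEL_A: Dict[str, Tuple[float, float]] = {
    "snp1": (1.30, 0.20),
    "snp2": (1.10, 0.50),
}
PANEL_B: Dict[str, Tuple[float, float]] = {
    "snp1": (1.30, 0.40),  # same SNP, different reference population
    "snp3": (0.80, 0.30),  # a protective SNP that panel A does not include
}


def relative_risk(genotype: Dict[str, int], panel: Dict[str, Tuple[float, float]]) -> float:
    """Risk relative to the reference-population average, multiplicative model."""
    risk = 1.0
    for snp, (odds_ratio, freq) in panel.items():
        copies = genotype.get(snp)
        if copies is None:
            continue  # SNP not genotyped for this person
        # Expected value of odds_ratio**copies when copies ~ Binomial(2, freq).
        population_mean = (1.0 - freq + freq * odds_ratio) ** 2
        risk *= odds_ratio ** copies / population_mean
    return risk


# One person's risk-allele counts, identical for both panels.
genotype = {"snp1": 1, "snp2": 2, "snp3": 2}
risk_a = relative_risk(genotype, PANEL_A)
risk_b = relative_risk(genotype, PANEL_B)
```

With these invented inputs, panel A places the person above average risk (roughly 1.27) and panel B below it (roughly 0.75), even though the underlying genotype is the same in both calculations.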

From the companies’ perspectives, their 
customers offer a rich resource of genomic 
data for potential research and data mining. 
23andMe created a research initiative called 
23andWe through which they enlist custom- 
ers “to collaborate with us on cutting-edge 
genetic research.” (23andWe: The First Annual
Update — 23andMe Blog, n.d.) They invite 
users to fill out questionnaires and then use the 
phenotypic information to perform genome-wide
association studies. This approach enabled
researchers at the company to replicate a num- 
ber of known associations, and to discover a 
number of novel associations, recreational 
though they may be, for curly hair, freckling, 
sunlight-induced sneezing, and the ability to 
smell a metabolite in urine after eating aspara- 
gus (Tung et al., 2011). deCODE, purchased 
by Amgen in 2012, boasts a large number of 
medically significant genetic discoveries to have 
come out of their volunteer registry of 160,000 
Icelanders, more than half of the adult popu- 
lation of that country (SCIENCE | deCODE 
genetics, n.d.). Navigenics was purchased by 
Life Technologies (now part of Thermo Fisher 
Scientific Inc.) in 2012 and no longer offers 
their Health Compass genetic testing service. 


28.10 Conclusion 


Physicians have always sought to provide care 
personalized to the individual. The current era 
of large and deep data about individual 
patients is ushering in the promise of precision 
medicine that tailors care to the individual 
based on factors not previously observable by 
the clinician, such as genomic data, predictive 
patterns derived from mining clinical data, or 
dense sensors tracking activity and heart rate
at a density not previously possible. For
precision medicine to become a reality, we will need
informatics to enable both its discovery and
implementation. The irony of the ability to
personalize care based on an individual’s
makeup is that it requires huge data sets of
many densely phenotyped individuals to have
the statistical power to make predictions for rare
variants, diseases, and outcomes. Thus,
precision medicine requires that we have large data
sets that are shareable and available for
research. We will also need to effectively enroll
diverse populations and ensure that the data
include both molecular data and social and
behavioral determinants of health. In addition, the
ability to make accurate decisions for the indi- 
vidual patient requires implementation in the 
EHR, as the amount of data required to make 
decisions is vast and changing quickly. 
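
A rough calculation illustrates why rare variants demand very large cohorts. The sketch below assumes Hardy-Weinberg proportions and uses a hypothetical allele frequency of 0.1%; the function names are illustrative, and the figures are arithmetic consequences of those assumptions, not results from any cohort.

```python
def carrier_fraction(allele_frequency: float) -> float:
    """Fraction of people carrying at least one copy, assuming Hardy-Weinberg proportions."""
    return 1.0 - (1.0 - allele_frequency) ** 2


def expected_carriers(cohort_size: int, allele_frequency: float) -> float:
    """Expected number of carriers in a cohort of the given size."""
    return cohort_size * carrier_fraction(allele_frequency)


# Hypothetical allele frequency of 0.1% across three cohort sizes.
for size in (1_000, 100_000, 1_000_000):
    print(size, round(expected_carriers(size, 0.001), 1))
```

Under these assumptions a cohort of 1,000 would be expected to contain about 2 carriers, and a cohort of 100,000 about 200. Only a fraction of those carriers will also have the exposure or outcome of interest, which is why discovery depends on cohorts of the scale listed in Table 28.2.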


Suggested Readings

Denny, J. C., Bastarache, L., & Roden, D. M. 
(2016). Phenome-Wide Association Studies as 
a Tool to Advance Precision Medicine. Annual 
Review of Genomics and Human Genetics, 
17(1), 353-373. https://doi.org/10.1146/ 
annurev-genom-090314-024956. Provides an 
overview and history of phenome-wide asso- 
ciation studies. Different approaches to 
PheWAS are described, along with the biases, 
advantages, and disadvantages of each. 

Green, E. D., Guyer, M. S., & National Human 
Genome Research Institute. (2011). Charting a 
course for genomic medicine from base pairs to 
bedside. Nature, 470(7333), 204-213. https://doi. 
org/10.1038/nature09764. Provides an overview 
of the NHGRI strategic plan through 2020, 
including the plan moving discovery in large 
cohorts to implementation in clinical enterprises. 

Kirby, J. C., Speltz, P., Rasmussen, L. V., Basford, 
M., Gottesman, O., Peissig, P. L., ... Denny, J. 
C. (2016). PheKB: a catalog and workflow for 
creating electronic phenotype algorithms for 
transportability. Journal of the American 
Medical Informatics Association, 23(6), 1046- 
1052. https://doi.org/10.1093/jamia/ocv202. 
Introduces the Phenotype KnowledgeBase 
website, which contains phenotype algorithms 
and related comments, plus implementation 
and validation data, for finding cases and con- 
trols for genomic analysis from EHR data. 
The paper includes some summary tables and 
experiences from the first several years of 
uploaded EHR phenotype algorithms.

Newton, K. M., Peissig, P. L., Kho, A. N., Bielinski,
S. J., Berg, R. L., Choudhary, V., ... Denny, J.
C. (2013). Validation of electronic medical
record-based phenotyping algorithms: results
and lessons learned from the eMERGE network.
Journal of the American Medical
Informatics Association, 20(e1), e147-54.
https://doi.org/10.1136/amiajnl-2012-000896.
This paper provides best practices and lessons
learned from the Electronic Medical Records
and Genomics (eMERGE) Network for how 
research-grade phenotypes are found from 
EHR data. This paper includes phenotype 
algorithm design, creation, and validation pro- 
cess, as well as some experiences regarding 
what worked well and what did not. 


Pulley, J. M., Denny, J. C., Peterson, J. F., Bernard,
G. R., Vnencak-Jones, C. L., Ramirez, A. H.,
... Roden, D. M. (2012). Operational
implementation of prospective genotyping for
personalized medicine: the design of the
Vanderbilt PREDICT project. Clinical 
Pharmacology and Therapeutics, 92(1), 87-95. 
https://doi.org/10.1038/clpt.2011.371. This 
paper describes one of the first prospective 
implementations of pharmacogenomics.
Patients were selected based on their risk for 
potentially needing a medication affected by 
pharmacogenes. They were tested on a multi- 
plexed platform, and then medication recom- 
mendations were provided through 
computer-based provider order entry decision 
support. The first implementation was 
CYP2C19 and clopidogrel (an antiplatelet 
medication), but because the platform tested 
multiple pharmacovariants, drug-genome 
interactions could be added over time. 


Wellcome Trust Case Control Consortium. (2007).
Genome-wide association study of 14,000
cases of seven common diseases and 3,000 
shared controls. Nature, 447(7145), 661-678. 
https://doi.org/10.1038/nature05911. This was 
one of the first large scale genome-wide asso- 
ciation studies, which found common genetic 
variants influencing seven common diseases. 
One interesting finding was a locus
for type 2 diabetes in FTO, whose effect
on diabetes risk is largely mediated through 
adiposity. This shows the importance of con- 
sidering phenotypes along the causal pathway 
when performing GWAS. 


Questions for Discussion

1. Design a study to assess the genomic
   influences of a disease or drug response
   phenotype using EHR data. Who
   would be your cases and controls? What
   features would define each case and
   control, and how would you validate
   that the algorithms you picked for cases
   and controls were indeed finding the
   patients you wanted to find?

2. Research studies traditionally have not
   returned their research results to study
   subjects. However, genetic studies are
   on the forefront of changing paradigms
   in this space. What do you think about
   the implications of returning results to
   patients? How would you feel if you
   were a subject in a research study?
   Would you want results back or not?
   What are the implications of returning
   results of actionable genetic variants
   (such as those causing breast and
   ovarian cancer) found incidentally
   during research studies or clinical
   testing purposes?

3. What are some ways in which precision
   medicine may improve health disparities
   between different populations? In what
   ways might precision medicine worsen
   them? How can researchers promote
   research that ameliorates this risk?

4. What are some requirements for a
   health system or a physician in the
   context of pharmacogenomic testing?

5. Given that genomics do not generally
   change over the lifetime, how can a
   patient take their genomic test results
   from one institution to another? What
   technological and non-technological
   solutions could be employed to allow a
   patient to take their genetic results with
   them?

6. Discuss the strengths and weaknesses
   of EHRs for precision medicine studies
   of diseases, drug responses, and
   exposures. What kinds of exposures
   and health outcomes does an EHR
   excel at capturing and where would
   traditional survey or in-person
   assessment measures perform better?
Learning Objectives

After reading this chapter, you should know
the answers to these questions:

• Why is the development and use of IT
  in healthcare so much slower than in
  other industries?

• How has public policy promoted the
  adoption and use of health IT?

• How does health IT support national
  agendas and priorities for health and
  health care?

• Why is it important to measure the
  value of health IT in terms of improvements
  in care quality and savings in
  costs?

• How can public policies safeguard
  patient privacy in an era of electronic
  health information?

• What are the main policy issues related
  to exchanging health information
  among health care organizations?

• What are the major tradeoffs for regulating
  electronic health records in the
  same way that other medical devices are
  regulated to ensure patient safety?

• What policies are needed to encourage
  clinicians to redesign their care practices
  to exploit better the capabilities of
  health IT?

• How does the U.S. approach to health
  IT policy compare with those of other
  countries?


29.1 Public Policy and Health 
Informatics 


For decades after most industries had adopted 
IT as part of their core business and opera- 
tional processes, clinical care in the U.S. 
remained largely in the paper world. Most 
developed countries adopted health IT sooner, 
especially in primary care. International lead- 
ers have included Denmark, Sweden and the 
Netherlands. However, health systems leaders 
in the U.S. have recognized that public policy 
played a role in the pace and nature of their 
health systems’ adoption and use of IT, and 
that changes in policy had the potential to 
accelerate change. 


The influence of policy can be found 
throughout a health care system. Policies 
shape the structure of health care delivery 
organizations and the markets for medical 
products. Directly or indirectly, policies influ- 
ence the behaviors of all health care stake- 
holders including patients, providers, health 
plans, and researchers. Public policy changes 
can enhance or set back health care delivery 
through incentives, requirements, and 
restrictions. 

In recent years, policy interventions have 
influenced health IT in major ways. In 2004, 
U.S. President George W. Bush established 
the Office of the National Coordinator for 
Health IT.¹ In 2009, during the Obama
Administration, the U.S. Congress allocated 
approximately $30 billion to support provid- 
ers’ meaningful use of health IT. In 2015, the 
Medicare Access and CHIP Reauthorization 
Act (MACRA) absorbed the meaningful use 
program as part of a larger effort to harmo- 
nize how the federal government pays health- 
care providers, called the Quality Payment 
Program (QPP). In 2016, the 21st
Century Cures Act included provisions to
improve patient access to their digital medical 
data and allow them to use the data in appli- 
cations of their choice, which may accelerate 
innovation. Notably, healthcare information 
technology has been one of the few relatively 
non-partisan topics. Governments of many 
other countries have also spent significant 
public funds on health IT and are considering 
related policy issues. 

In this chapter, we review some of the key 
policy goals relevant to informatics and dis- 
cuss how researchers and policymakers are 
trying to address them. We discuss how health 
IT policy goals have changed substantially in 
recent years, from a focus on accelerating
adoption to a greater emphasis on interoperability
and fostering innovation. Protecting the
privacy of patients’ health information,
ensuring health IT products are safe for
patients, and improving medical practice
workflows remain persistent challenges that are
of interest to policymakers.

1 http://www.healthit.gov/newsroom/about-onc
(Accessed 12/9/2012).

While informatics research has been occur- 
ring for several decades, research in health IT 
policy is still relatively new. As stakeholders 
look to health IT to help address the major 
cost and quality problems in national health 
care systems, we expect the issues discussed in 
this chapter to become more important to 
policymakers and researchers in the fields of 
health policy and informatics. 


29.2 How Health IT Supports 
National Health Goals: 
Promise and Evidence 


Health IT is not an end in itself. Like all tech- 
nology, it is simply a tool for achieving larger 
clinical, social and policy goals, such as 
improving health outcomes, improving the 
quality of care, and reducing costs. Health IT 
has the potential to have a tremendous impact 
on these goals. 

Policymakers, however, are interested not 
only in the promise of health IT but also the 
reality. Like most software products, early 
versions of health IT products tend to have 
many problems, such as bugs, poor usability, 
and difficulties integrating with other prod- 
ucts. Only after the technology matures is it 
possible to realize a larger portion of the 
promised benefits. Policymakers may be reluc- 
tant to invest public funds, which are raised 
primarily in the form of taxes, in technologies
that have not been shown in empirical studies 
to produce benefits. 

Many studies have demonstrated empiri- 
cal benefits of health IT, especially CPOE (see 
> Chap. 14) and some types of CDS (see 
> Chap. 24) (Jones et al. 2014). Recent stud- 
ies of HIE (see > Chaps. 15 and 18) have also 
found some beneficial effects (Menachemi 
et al. 2018). However, substantial gaps in evi- 
dence exist. For example, many studies come 
from a small number of academic medical 
centers or geographical communities, and it is 
unknown if the benefits in terms of quality, 
safety and efficiency are being realized in 
other settings. Some have described this phenomenon
as a health IT “productivity paradox”
because of some observers’ assessment
that the benefits of IT have so far not justified
the investment (Jones et al. 2012). Lessons
from other industries suggest that the sub- 
stantial benefits of IT will eventually be real- 
ized but will require more than just 
improvements in the technology itself. Care 
processes will likely also need to be redesigned 
so that users can take advantage of the tech- 
nology’s potential. New best practices may be 
needed for different care settings. And addi- 
tional studies will likely also be needed to 
demonstrate benefits that may exist but are 
difficult to detect, especially in non-academic 
settings that do not have the expertise or 
incentives to conduct robust evaluation stud- 
ies (see > Chap. 13). 

Despite the limits of the empirical evi- 
dence, policymakers have invested substantial 
sums in health IT hoping that the technology 
will realize its promised benefits and support 
national health goals. Further empirical stud- 
ies will help to identify where health IT has 
been successful and what factors have made 
these investments effective, as well as to iden- 
tify gaps that may benefit from further policy 
efforts. This section presents an overview of 
both the promise and the evidence of how 
health IT supports policy goals. 


29.2.1 Improving Care Quality 


and Health Outcomes 


As informatics professionals understand intu- 
itively, health IT has enormous potential to 
improve care quality and health outcomes, 
which are, of course, central policy goals 
(Table 29.1). Just as computers have
revolutionized many other industries, from
banking to baseball, information technology is
beginning to revolutionize health care through 
innovative applications. Policymakers in the 
U.S. appear to recognize this potential as 
demonstrated by the multiple pieces of state 
and federal legislation passed in recent years 
related to health IT. This activity began with a
focus on encouraging adoption and has
shifted to improving interoperability, patient
access to records, and innovation. Many other
countries also specifically encourage adoption
and use of health IT to improve health care
quality.

Table 29.1 The promise of health IT (selected functionality)

Health IT functionality       Expected effect on care quality   Expected effect on cost
----------------------------  --------------------------------  ----------------------------
Electronic health record      Improved clinical decisions,      Fewer unnecessary tests
(EHR) with clinical decision  fewer medication and diagnostic
support (CDS)                 errors, timelier follow up

Health information exchange   Improved clinical decisions       Reduced burden of
(HIE)                                                           information gathering,
                                                                reduced duplicate testing

Patient decision aids         More personalized treatment       Fewer procedures

Telehealth and personal       More timely and accessible        Fewer office visits
health records (PHR)          interactions with clinicians

E-prescribing                 Fewer errors                      Reduced costs from errors

Electronic health records (EHRs; » Chap. 
14) probably represent the form of health IT 
that has been evaluated most extensively and 
are now widely adopted in hospitals and clin- 
ics. EHRs with CPOE and clinical decision 
support (CDS; > Chap. 24) have been exten- 
sively studied and evaluated in terms of qual- 
ity, safety, and efficiency benefits, with most 
studies finding positive results. For example, 
one study found that EHRs with
medication-related CDS can reduce the number
of adverse drug events by 30% to 84%
(Ammenwerth et al. 2008). A study that
examined EHR use in several hospitals in Texas
found reduced rates of inpatient mortality,
complications, and length of stay when EHRs
were used (Amarasingham et al. 2009). Studies
like these have supported the promotion of
EHRs, medication-related CDS, and
e-prescribing, which are now widely, but not
universally, adopted. Other functionalities,
such as electronic patient decision aids, may
have enormous potential to improve quality,
safety and
efficiency, but have not been evaluated as 
extensively and are not widely adopted 
(Friedberg et al. 2013). 

Another component of health IT that 
may substantially improve quality of care 
is clinical data exchange, which is the abil- 
ity to exchange health information among 
health care organizations and patients (see 
> Chaps. 15 and 18). There is a great need for 
this kind of capability. In the U.S., the typi- 
cal Medicare beneficiary visits seven different 
physicians in four different offices per year on 
average, and many patients with chronic con- 
ditions see more than 16 physicians per year 
(Pham et al. 2007). Not surprisingly, in such 
a fragmented system, information is often 
missing. One study shows that primary care 
doctors reported missing information in more 
than 13% of visits and other studies suggest 
much higher rates of missing data, affecting 
as much as 81% of visits (Smith et al. 2005; 
van Walraven et al. 2008; Tang et al. 1994). A 
study in one community found that there may 
be a need to exchange data among local medi- 
cal groups in as many as 50% of patient visits 
(Rudin et al. 2011). Recent empirical studies 
have shown that real-world implementations 
of electronic clinical data exchange systems 
result in fewer duplicated procedures, reduced 
use of imaging, lower costs, and improved 
patient safety (Menachemi et al. 2018). 
However, these studies were concentrated 
in a small number of HIEs and some were 
restricted to a single vendor; it is not clear to 
what extent the results will generalize to other 
contexts. 

Researchers and policymakers agree that 
improving the quality of health care must 
involve making it more patient-centric, and 
health IT will likely be crucial to achieving 
that goal on a large scale. For example,
personal health records (PHRs) and patient
portals were promoted by federal requirements
in the U.S. and are increasingly available; one
recent survey found that roughly half of older
adults have accessed a PHR (Malani 2018).
PHRs give patients access to their clinical 
data (see > Chap. 11), facilitate communica- 
tion between patients and providers, and pro- 
vide relevant and customized educational 
materials so that patients can take a more 
active role in their care (Tang et al. 2006; 
Halamka et al. 2008; Wells et al. 2014). PHRs
may also incorporate patient decision aids to
help patients make critical health care
decisions that reflect their personal preferences
(Fowler et al. 2011; Friedberg et al. 2013).
Telehealth technologies, which enable patients 
to interact with clinicians over the Internet 
(see > Chap. 20), may make health care more 
patient-centric by allowing patients to receive 
some of their care without having to go physi- 
cally to the doctor’s office. Few empirical 
studies to date have shown that these technol- 
ogies result in improvements in care quality or 
health outcomes (Milani et al. 2017). 

A concern of policymakers is that there is 
an emerging “digital divide” in health IT, in 
which disadvantaged groups who might ben- 
efit most have less access to health IT than 
more affluent groups. One empirical study of
this issue found that minority groups were less 
likely to access web-based PHRs and, in gen- 
eral, minorities and disadvantaged groups 
have less web access than other groups (Yamin 
et al. 2011). On the other hand, adoption rates 
of mobile platforms do not show as much of a 
divide and PHRs are increasingly accessible 
via these platforms. Still, policies may be nec- 
essary to ensure the technology is designed 
and implemented with minorities in mind to 
prevent disparities in health care from getting 
worse and to ensure that the improvements in 
care quality enabled by health IT are shared 
by all. 

The digital divide has also been suggested 
to exist among hospitals. One study found 
that although EHRs are widely adopted 
among hospitals, critical access hospitals 
lagged in adoption of performance measure- 
ment and patient engagement functions, sug- 
gesting an “advanced use” digital divide 
(Adler-Milstein et al. 2017). However, even if 
critical access hospitals are slower to adopt 
advanced functionalities, that may not
indicate a permanent divide but rather a typical
technology diffusion curve in which some 
organizations adopt faster than others. 

Unfortunately, health IT also has the 
potential to facilitate harmful unintended side 
effects (Bloomrosen et al. 2011). In one study 
involving a pediatric intensive care unit in 
Pittsburgh, patient mortality increased in 
patients transferred in after computerized 
physician order entry (CPOE) was installed 
(Han et al. 2005). The study found that
certain aspects of the ordering system and some
of the implementation decisions restricted
clinicians’ ability to work efficiently, causing
delays in treatment, which was especially del- 
eterious because of the urgent nature of the 
children’s conditions. Implementation deci- 
sions involving configuration of the system 
and changes in workflows appear to have been 
the major contributors to the increase in
mortality; the same EHR product was installed in
another hospital without such adverse impacts 
on mortality (Beccaro et al. 2006). Considering 
the volume of health IT studies, there are rela- 
tively few empirical assessments of adverse 
effects. Nonetheless, questions about the need 
to regulate the safety of EHRs are being 
debated. The need to protect patients from
unintended harm must be balanced against the
concern, discussed later in this chapter, that
over-regulation may impede innovation. Most
researchers tend to believe that if health IT 
systems are well-designed and implemented 
with close attention to the needs of the users, 
these kinds of unintended consequences can 
be avoided and health IT systems will result in 
tremendous improvements in quality of care 
(Berg 1999). Researchers have developed 
guides to help organizations implement health 
IT in a way that minimizes safety risks and 
improves patient safety (Sittig et al. 2014). In 
addition to unintended consequences on 
patients’ health, IT has also been shown to be 
a source of physician professional dissatisfac- 
tion (Sinsky et al. 2017). 


29.2.2 Reducing Costs 


In addition to improving quality, health IT is 
expected to reduce costs of care substantially
(Table 29.1). Policies that promoted the use
of health IT were informed by projections 
based on models showing large potential sav- 
ings for many forms of health IT. One study 
by the RAND Corporation estimated that 
EHRs could save more than $81 billion per 
year (Hillestad et al. 2005). Another study 
estimated that electronic clinical data 
exchange has the potential to save $77.8 bil- 
lion per year (Walker et al. 2005). Many of 
these savings were expected to come from 
reductions in redundant tests and use of 
generic drugs, as well as reductions in adverse 
drug events and other errors that EHRs might 
prevent (Bates et al. 1998; Wang et al. 2003). 
Telehealth and PHRs were also projected to 
result in billions of dollars in savings (Kaelber 
and Pan 2008; Cusack et al. 2008). 

One weakness of these projections is that 
they relied on expert opinions for some point 
estimates because, other than several studies 
showing that EHRs reduce costs by reducing 
medical errors, few studies have tried to exam- 
ine empirically the effect of health IT on costs 
(Tierney et al. 1987, 1993). Also, some of the 
projections have been criticized because they 
estimate potential savings rather than actual 
measured savings (Congressional Budget 
Office 2008). However, the projections do not 
include several types of savings that may 
result from providing better preventive care 
and care coordination, which would reduce 
the need for patients’ use of high cost proce- 
dures in hospitals and emergency rooms. They 
also do not include potential reductions in 
costs that may result from decision aids for 
patients, which may, for example, reduce the 
number of unnecessary surgeries (O’Connor 
et al. 2009). And they do not include other 
innovations such as the impact of small 
changes in EHR displays. For example, one 
study found that when the fees associated with 
laboratory tests were shown to clinicians when 
they ordered the test, rates of test ordering 
decreased by more than 8% (Feldman et al. 
2013). The actual savings, therefore, may be 
much greater than the projections suggest. As 
described above, realizing these savings will 
likely require more than simply adopting the 
technology; it will also require redesigning
healthcare workflows to make greater use of
the technology, and developing and spreading 
best practices. 


29.2.3 Using Health IT to Measure 
Quality of Care 


All health care stakeholders agree that a 
health care system should deliver high quality 
care. But how does one measure care quality? 
Current methods of quality measurement rely 
largely on administrative claims submitted by 
providers to insurers. These data may be use- 
ful for certain quality measurements such as 
for assessing a primary care physician’s mam- 
mography screening rates, but they lack 
important clinical details, such as the results 
of laboratory tests. They also do not represent 
a comprehensive picture of the care that is 
delivered, assess the appropriateness of most 
medical procedures, or determine if a patient’s 
quality of life has improved after treatment. 
Also, most patients in the U.S. switch insur- 
ance companies every few years, limiting the 
ability of any one insurer to measure quality 
improvements over longer periods of time, 
which is required to assess accurately the 
treatment of many medical conditions.

Increasingly, clinical data available
through EHRs are used for quality measure- 
ment (Ancker et al. 2015). Clinical data are 
much more comprehensive than administra- 
tive claims, and methods for measuring clini- 
cal quality using these data are growing. In 
the U.S., there is growing policy interest in 
creating such measures as shown in the 
National Quality Strategy and other reports 
(AHRQ 2017). This approach has been used 
in the United Kingdom (U.K.) where nearly 
200 quality measures have regularly been 
assessed, with up to 25% of payment for gen- 
eral practitioners based on performance on 
these measures (Roland and Olesen 2016). 
Although the program was initially popular,
U.K. physicians have become increasingly
disenchanted with its administrative
requirements.
There is growing support for developing 
patient-reported outcome measurements 
which may be integrated in PHRs, or obtained
through other mechanisms and integrated
with the patient’s clinical data (Lavallee et al. 
2016). 

However, using electronic clinical data to 
generate quality measures is also associated 
with problems. Studies have found that clinical 
data in EHRs are often incomplete and
inaccurate, and may not be comparable across
different EHRs (Chan et al. 2010; Colin et al.
2018). Existing measures also tend to focus
more on adherence to care processes than on
patient outcomes (Burstin et al. 2016). More
research is needed to develop and standardize 
meaningful quality measures that would be 
worth the burden of reporting them. 
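
As a minimal sketch of how a process-of-care measure might be computed from EHR-derived data, the Python below calculates a screening rate as numerator over denominator and counts eligible records in which the needed field is missing. The record layout (`eligible`, `screened`) and the function name `screening_rate` are hypothetical; they do not correspond to any specific EHR product or published measure specification.

```python
from typing import Iterable, Optional, TypedDict


class PatientRecord(TypedDict):
    """Hypothetical, simplified record extracted from an EHR."""
    patient_id: str
    eligible: bool             # meets the measure's denominator criteria
    screened: Optional[bool]   # None when the EHR has no usable entry


def screening_rate(records: Iterable[PatientRecord]) -> dict:
    """Compute a process measure and count records with missing data."""
    eligible = [r for r in records if r["eligible"]]
    # Records with no usable entry are excluded from the rate but reported.
    known = [r for r in eligible if r["screened"] is not None]
    numerator = sum(1 for r in known if r["screened"])
    return {
        "denominator": len(known),
        "numerator": numerator,
        "rate": numerator / len(known) if known else None,
        "missing": len(eligible) - len(known),
    }


# Illustrative data only; not drawn from any study.
records = [
    {"patient_id": "a", "eligible": True, "screened": True},
    {"patient_id": "b", "eligible": True, "screened": False},
    {"patient_id": "c", "eligible": True, "screened": None},
    {"patient_id": "d", "eligible": False, "screened": None},
]
result = screening_rate(records)
```

Reporting the count of missing entries alongside the rate is one simple way to make the completeness problem described above visible; how missing data should actually be handled is a measure-design decision that this sketch does not settle.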


Fig. 29.1 Health care expenditures and life
expectancy in the United States and ten other
developed countries. (From Fuchs and Milstein
(2011), with permission © Massachusetts Medical
Society)


29.2.4 Holding Providers
Accountable for Cost
and Quality


Currently, in the U.S., most care is delivered 
using a fee-for-service payment system, in 
which providers are paid for every procedure 
or patient visit. Under this payment method, 
providers have incentives to provide more care 
rather than less, which contributes to over- 
treatment (Lyu et al. 2017). It is therefore not 
surprising to find that in the U.S., costs are 
high and rising, nearly double those of many 
other industrial nations, and quality of care is 
mixed (Squires 2015). As Fig. 29.1 shows,
the U.S. spends more money per capita on
health care than any other country by a wide 
margin. Yet, many studies suggest that the 
U.S. is far from the world’s leader in overall 
care quality (Squires 2015). A seminal study 
by McGlynn et al. in 2003 found that patients 
in the U.S. received recommended care only 
about half of the time across a broad array of 
quality measures (McGlynn et al. 2003). An 
updated version in 2016 found that those 
results had not changed much (Levine et al. 
2016). 

Policymakers are trying to replace the 
fee-for-service payment method with other 
methods that would hold providers account- 
able for the care they deliver. These policies 
create incentives for healthcare providers to 
constrain costs and may therefore motivate 
greater use of health IT tools to achieve this 
goal. In the U.S., one of the proposed mech- 
anisms for accomplishing this is through 
Accountable Care Organizations (ACOs). As 
specified in the Affordable Care Act of 2010,²
an ACO is a group of providers who are held
accountable, to some extent, for both the
cost and the quality of care for a designated
group of patients (Berwick 2011; McClellan et al.
2010). ACOs are still a work in progress, but 
early indications suggest that they may 
reduce some costs (McWilliams et al. 2018). 
The concept of ACOs depends on having an 
electronic health information infrastructure 
in place, including widespread use of EHRs, 
because health IT would enable ACOs to 
improve quality, reduce costs, and measure 
their performance. Without prior federal 
incentives for health IT adoption, these
policies that aim to change incentives may not
have been feasible.

Many other countries have experimented 
with paying providers for quality and out- 
comes, or holding providers responsible for 
costs, although few have done both at the 
same time to a high degree. Health IT systems 
are critical for many of these efforts. Few
policymakers or researchers believe providers
can be held accountable to a substantial degree
for the care they deliver without a robust health
IT infrastructure.

2 http://www.healthcare.gov/law/index.html


29.2.5 Informatics Research 


Although EHRs have become widespread, 
many health IT capabilities are still emerging, 
or standards have not yet been defined. New 
applications will still require additional 
research and development. For example, we 
are still in the early stages of understanding 
how to design applications for team care 
(> Chap. 17), remote patient monitoring 
(» Chaps. 20 and 21), online disease manage- 
ment (> Chaps. 11 and 19), clinical decision- 
making (> Chap. 24), alerts and reminders 
(> Chap. 24), public health and disease sur- 
veillance (> Chap. 18), clinical trial recruiting 
(> Chap. 27), and evaluations of the impact 
of technologies on care and costs (> Chap. 13). 
One concern is that most provider organiza- 
tions, and increasingly even academic medical 
centers, are now using software applications 
made by private vendors, and innovating with 
them can be more challenging than with 
homegrown products. Private vendors may 
not be investing enough resources in research 
to produce transformational innovations 
(Shortliffe 2012). It will be essential to iden- 
tify “sandboxes” in which new and innovative 
IT approaches can be developed and tested. 
More interactions between industry and aca- 
demia may be a good way to accelerate prog- 
ress (Rudin et al. 2016). 

Federal funding plays a major role in
supporting this kind of upstream informatics
research to help incubate these new
technologies, but it has been decreasing in
recent years.
Because the benefits of such research will 
accrue to everyone who uses the health care 
system, the investment of public funds is justi- 
fied. Few private companies have taken the 
risk of doing this kind of experimental 
research to date in part because many health 
IT companies have been relatively small and 
were focused on adding the functionalities 
that are needed to meet federal certification 
requirements. More recently, some health IT
companies have become larger but they have
not sponsored much research. It is too early to 
know what impact private companies will 
have on health IT innovation, but historically, 
most of the innovation in health informatics 
has occurred at universities and other 
government-funded research organizations 
affiliated with academic medical centers. 


29.3 Beyond Adoption: Policy 
for Optimizing and Innovating 
with Health IT 


Many governments around the world have 
previously implemented policies to accelerate 
the adoption of health IT. The U.K. achieved 
near universal adoption of EHRs because it 
devoted substantial resources to the effort 
early on and has a national health care system 
which directly manages most of the health 
care providers in the country (Cresswell and 
Sheikh 2009; Ashworth and Millett 2008). 
Most other industrialized nations had 
achieved high levels of adoption in primary 
care by the early 2000s (Jha et al. 2008). 
Countries that achieved particularly high lev- 
els of adoption in non-hospital settings 
include Denmark, the Netherlands, Sweden, 
Hong Kong, Singapore, Australia, and New 
Zealand. Like the U.K., these countries
devoted national resources to this effort.
Levels of adoption in hospitals, however, 
lagged in many countries. 

In the U.S., after years of slow adoption of 
health IT relative to other developed coun- 
tries, the federal government began to address 
this issue in 2004 by establishing the Office of 
the National Coordinator for Health IT 
(ONC). This office is located within the 
U.S. Department of Health and Human 
Services and tasked with “promoting develop- 
ment of a nationwide Health IT infrastruc- 
ture that allows for electronic use and 
exchange of information.” The importance of 
this office grew considerably in 2009 when 
Congress passed legislation that is considered 
a major landmark in the history of health IT 
policy: the Health Information Technology
for Economic and Clinical Health (HITECH)
Act.³ This legislation authorized $27 billion in
stimulus funds to be paid to health care pro- 
viders who demonstrate “meaningful use” of 
electronic health records as defined by specific 
criteria (Blumenthal 2010). Although there is 
debate as to the extent to which HITECH 
accelerated EHR adoption in ambulatory 
clinics, EHR adoption increased dramatically 
among hospitals and clinics in the U.S. after
these incentives were put in place (Mennemeyer
et al. 2016). Today, over 90% of hospitals and 
clinics have adopted some form of EHR, but 
there is large variation in adoption of specific 
EHR capabilities (HealthIT 2016). 

Now that EHRs have become widely 
adopted, policymakers in many countries are 
shifting focus toward optimizing the technol- 
ogy and fostering innovation to achieve 
greater impact. In the U.S., policy efforts are 
now trying to improve interoperability and 
health information exchange among provid- 
ers and patients and facilitate innovation by 
making health information accessible to third 
party applications using application program- 
ming interfaces (APIs). U.S. policy has also 
incorporated many health IT efforts into a 
larger program that affects how the Centers for
Medicare & Medicaid Services (CMS) pays
health providers for services. This section 
describes some of these efforts. 


29.3.1 Health Information
Exchange


All countries have challenges sharing clinical 
data among providers (see > Chap. 15). For
many years, U.S. policy promoted data 
exchange through the formation of regional 
health information exchanges (HIEs). These 
organizations provided a variety of services 
including aggregating EHR data from local
health care providers to create longitudinal
patient records, automating the delivery of
laboratory results, integrating with
pharmacies to facilitate e-prescribing, and
facilitating public health and quality
reporting. Although some HIEs are
well-established, the number of these
organizations has been declining and many of
the remaining ones may not be financially
viable (Adler-Milstein et al. 2016).

3 https://www.healthit.gov/topic/laws-regulation-and-policy/health-it-legislation
(Accessed 10/16/2018).
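
The aggregation of records from several providers into one longitudinal record per patient, described above as an HIE service, can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the `Encounter` fields, the function name `build_longitudinal_records`, and above all the premise that every source already uses the same patient identifier, which real exchanges cannot take for granted.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, NamedTuple


class Encounter(NamedTuple):
    """Hypothetical, simplified encounter exported by one provider's EHR."""
    patient_id: str   # assumed to be shared across all sources
    source: str       # which organization contributed the entry
    when: date
    summary: str


def build_longitudinal_records(
    *feeds: Iterable[Encounter],
) -> Dict[str, List[Encounter]]:
    """Merge encounters from several sources into one timeline per patient."""
    records: Dict[str, List[Encounter]] = defaultdict(list)
    for feed in feeds:
        for encounter in feed:
            records[encounter.patient_id].append(encounter)
    # Order each patient's entries chronologically, regardless of source.
    for timeline in records.values():
        timeline.sort(key=lambda e: e.when)
    return dict(records)


# Illustrative data only.
clinic = [Encounter("p1", "clinic", date(2018, 3, 1), "office visit")]
hospital = [
    Encounter("p1", "hospital", date(2018, 1, 15), "admission"),
    Encounter("p2", "hospital", date(2018, 2, 2), "emergency visit"),
]
merged = build_longitudinal_records(clinic, hospital)
```

The merge itself is trivial; the sketch is meant to show that the difficult parts of exchange lie in what it assumes away, such as matching patients across organizations and agreeing on a common data format.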

Why is it so difficult to establish an HIE? 
Part of the problem is that EHR products did
not always use the same technical data
standards and so were not interoperable.
Recently developed technical and semantic
standards are considerably more robust
(Health Level Seven International 2019).
However, additional custom programming is
still required to integrate EHRs with HIEs.
HIEs face many other challenges, including
recruiting providers who are reluctant to share
data with competing medical groups, privacy
and security concerns, legal issues,
HIE-related fees, training clinicians to use the
HIE, and the lack of a business case
(> Chap. 15). The business case problem is
perhaps the most pressing: for a
business to thrive, key stakeholders must be 
willing to pay for the product or service. In 
HIE, the primary financial beneficiaries are 
employers and insurers, but they have been 
reluctant to pay for the exchange services 
(Walker et al. 2005). While regional HIEs 
have faltered, EHR vendor-based networks 
have emerged as an alternative, but the extent 
to which they will succeed in the long term is 
uncertain. These networks may be limited to 
one vendor or involve a consortium of ven- 
dors. Currently, the most prominent vendor- 
based networks are Epic CareEverywhere, 
CareQuality, and the CommonWell Health 
Alliance. 

Policymakers have recognized that for data 
exchange to be comprehensive, these networks 
as well as regional HIEs will need to interact 
and share data. To address this concern, there 
are plans to establish a “trusted exchange 
framework” that facilitates this interaction 
(HealthIT 2018). Policymakers have also 
identified information blocking on the part of 
vendors and providers as a concern and plan 
to issue regulations to prevent it. 


Some have proposed a different approach 
to data exchange in which patients can aggre- 
gate and control access to their complete 
health records (Szolovits et al. 1994). The his- 
tory and details of this model are explained in 
> Chap. 15. There has been an increase in 
interest in this approach recently. However, it 
is too early to tell if it will become widely 
adopted. 

No country to date has completely solved 
the problem of clinical data exchange. In 
every country that attempts to foster data 
exchange, the hardest issues appear to be 
socio-political rather than technical, and there 
is clear agreement that health IT policy is par- 
ticularly important to address these problems, 
especially in establishing standards. The U.K. 
has set up a “spine” which allows summary 
care documents to be widely exchanged 
(Greenhalgh et al. 2010). However, the overall 
program has encountered major political
difficulties and has been largely dismantled.
Canada has established a program called 
Canada Health Infoway, which has empha- 
sized setting up an infrastructure for data 
exchange (Rozenblum et al. 2011). While that 
effort has been somewhat successful, relatively 
little in the way of clinical data is being 
exchanged to date, in part because the adop- 
tion rate of electronic health records remains 
low. In Scandinavia, there has been substan- 
tial concern about the privacy aspects of data 
exchange, especially in Sweden, though data 
exchange is taking place in Denmark and its 
use is growing. 


29.3.2 Patient Portals 
and Telehealth 


Although EHRs have become widely adopted, 
other forms of health IT are still lagging. To 
encourage more patient-centric care, many 
countries are trying to foster the adoption of 
patient portals and telehealth (see > Chaps.
11 and 20). In the U.S., federal incentives pro- 
moted patient portals, and adoption rates are 
growing. To promote telehealth, policymakers 
are exploring the possibility of reimbursing 
for telehealth care, which would probably
improve adoption of this technology
considerably (Mehrotra et al. 2016). Even though
many states have passed “parity laws” that 
require commercial insurers to reimburse for 
telehealth visits, most healthcare encounters 
are still in-person. 


29.3.3 Application Programming 
Interfaces 


To accelerate innovation, policymakers in the 
U.S. have begun to promote Application 
Programming Interfaces (APIs) for EHR 
data. APIs are software mechanisms that 
allow different applications to connect to one 
another and share information. All modern 
software utilizes APIs for purposes ranging 
from communication with a computer’s oper- 
ating system to querying a website for the lat- 
est news stories. In healthcare, one use of 
APIs is to allow patients to more easily down- 
load their latest medical data into an applica- 
tion of their choosing, such as an application 
on their smartphone that helps them organize 
and understand their health data. Another 
use of APIs is to allow providers to install
third party applications for use within their 
EHRs. If patients and providers can pick and 
choose applications, a new market of innova- 
tive applications may arise to take advantage 
of these data. Standardization of APIs across 
EHRs is critical because otherwise applica- 
tion developers will be required to spend effort 
customizing their product to integrate with 
every EHR vendor. In the U.S., new policies 
will require EHR vendors to support APIs as 
a condition for certification and for receiving 
certain payments (Leventhal 2018). 
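
The argument for standardization can be made concrete with a short Python sketch. It assumes a hypothetical common interface, `HealthDataAPI`, with a single method `get_lab_results`; the two vendor classes, their internal storage, and the sample values are invented for illustration and do not represent any real EHR product or published API standard.

```python
from typing import Dict, List, Optional, Protocol


class HealthDataAPI(Protocol):
    """Hypothetical standard interface that every EHR would expose."""

    def get_lab_results(self, patient_id: str) -> List[Dict]:
        ...


class VendorAEHR:
    """Invented vendor that stores results as ready-made dictionaries."""

    def __init__(self) -> None:
        self._labs = {
            "p1": [
                {"test": "HbA1c", "date": "2018-01-10", "value": 7.4},
                {"test": "HbA1c", "date": "2018-06-12", "value": 6.9},
            ]
        }

    def get_lab_results(self, patient_id: str) -> List[Dict]:
        return list(self._labs.get(patient_id, []))


class VendorBEHR:
    """Invented vendor with a different internal layout (tuples)."""

    def __init__(self) -> None:
        self._rows = [("p1", "HbA1c", "2018-09-01", 6.5)]

    def get_lab_results(self, patient_id: str) -> List[Dict]:
        # Internal rows are translated into the shared result format.
        return [
            {"test": test, "date": day, "value": value}
            for pid, test, day, value in self._rows
            if pid == patient_id
        ]


def latest_result(
    api: HealthDataAPI, patient_id: str, test_name: str
) -> Optional[Dict]:
    """Third-party app logic written once against the shared interface."""
    results = [
        r for r in api.get_lab_results(patient_id) if r["test"] == test_name
    ]
    # ISO-formatted dates sort correctly as plain strings.
    return max(results, key=lambda r: r["date"], default=None)


from_a = latest_result(VendorAEHR(), "p1", "HbA1c")
from_b = latest_result(VendorBEHR(), "p1", "HbA1c")
```

Because `latest_result` depends only on the shared interface, the same application code runs unchanged against both vendors; without such a standard, the developer would need a separate integration for each one.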


29.4 Policies to Ensure Safety 
of Health IT 


As adoption of health IT accelerates and new 
innovations are developed, it is important to 
be vigilant about, and to reduce, the risk of 
unintended harmful side effects related to 
health IT use. Harm could arise from
deficiencies in many areas when designing and
deploying complex systems, including poor usability,
inadequate testing and quality assurance, 
software flaws, poor implementation deci- 
sions, inattention to workflow design, or inad- 
equate training. Policymakers have funded 
development of frameworks and guidelines to 
help implement and use health IT in a way 
that addresses safety concerns (Sittig and 
Singh 2012). 


29.4.1 Should Health IT
Be Regulated as Medical
Devices?


One policy option for reducing the likelihood 
of health IT-related medical errors is to create 
regulations that require health IT products to 
adhere to strict principles of safe design and 
be tested and certified (see also » Chap. 12) 
(Shuren et al. 2018). This is how many
medical devices are regulated by the U.S. Food
and Drug Administration.⁴ While this approach
may ensure some degree of patient safety, the 
regulatory burden will increase the price of 
health IT systems, raise barriers of entry for 
new companies, and could stifle innovation. 
Also, even with regulations, health IT prod- 
ucts might still have safety issues because soft- 
ware products can be used in many ways, 
unlike other medical devices that have more 
limited utility. 

There is an active debate about the appro- 
priate types of regulation for medical apps for 
use by patients. Current estimates put the
number of health apps available for download
at more than 150,000, but analysts have found
that few have demonstrated clinical utility 
(Singh et al. 2016). The FDA does not regu- 
late most apps but has recently begun a pilot 
“precertification” program for digital health 
which will provide information about ven- 
dors’ software quality control processes but 
does not involve evaluations of outcomes 
(Bates et al. 2018; Lee and Kesselheim 2018). 
This is controversial, and some feel it does not 
go far enough (Bates et al. 2018). 


4 http://www.fda.gov/ (Accessed 12/10/12).


29.4.2 Alternative Ways to Improve
Patient Safety 


There are many other policy options to sup- 
port patient safety (Committee on Patient 
Safety and Health Information Technology; 
Institute of Medicine 2011). Policies may fund 
training programs to educate clinicians in how 
to use health IT safely and alert them to com- 
mon mistakes. Policies might encourage pro- 
viders to report problems with software, 
including usability issues and bugs, so that 
vendors can fix them quickly. Policies might 
also help to establish programs in which users 
can rate health IT products. Finally, funding 
research into the science of patient safety 
would improve our knowledge of how to 
design better products and identify risks of 
errors (Shekelle et al. 2011). 


29.5 Policies to Ensure Privacy 
and Security of Electronic 
Health Information 


It is almost impossible to have a conversation 
about digital medical records without discuss- 
ing issues of privacy and security. Although 
the topic of privacy arose in the discussion of 
ethics in Chap. 12, it also has policy implications
and warrants mention here. As healthcare
has become digitized, there has been an
increase in security events (Liu et al. 2015). 
Protecting privacy and security are clearly 
important policy goals. 


29.5.1 Regulating Privacy 


The Health Insurance Portability and
Accountability Act (HIPAA) of 1996 (footnote 5)
and subsequent regulations created a legal
category of “protected health information,”
which was defined to encompass most forms
of clinical data. Covered entities, which
include providers and insurers, are legally
required under this law to safeguard electronic
health information and face fines if they do not.


5 http://www.hhs.gov/ocr/privacy/index.html

Many states have additional privacy laws 
regarding data exchange (e.g., mental health 
and HIV status). The effectiveness of these 
privacy-protective laws has not been rigor- 
ously evaluated. They can inadvertently 
reduce privacy protection, particularly when 
exchanging data across state lines, and have 
been shown to slow the adoption of EHRs
(Miller and Tucker 2009; Harmonizing State 
Privacy Law Collaborative 2009). 

Privacy has also received a good deal of debate
in other countries. Most recently, the European
General Data Protection Regulation (GDPR)
went into effect in 2018 and goes beyond
healthcare in scope by encouraging “privacy by
design” for all software products that store
personal data (Haug 2018).
Governments are still trying to find the best 
policies to protect privacy of medical records 
without slowing the adoption of health IT. 


29.5.2 Security 


Now that healthcare entities are mostly digi- 
tal, they are increasingly targeted by cyber- 
attacks, which may aim to steal patient data, 
demand money in return for unlocking a 
system, or make a political statement. 
HIPAA includes security policies that 
require health providers and other covered 
entities to implement various safeguards, 
and if data are breached, the federal government
may charge a fine. The recent increase
in cyberattacks on hospitals and other healthcare
stakeholders suggests that these regulations
may not be adequate, and policymakers
are considering additional moves. Security
concerns exist in all countries. For example, 
the UK’s National Health Service recently 
experienced a cyberattack that crippled 
many hospitals and required many clinics to 
close down completely (Clarke and
Youngstein 2017).


29.5.3 Record Matching
and Linking


For health IT to be effective, an essential 
prerequisite is that patients must be matched 
to their health data, and electronic records 
for the same patient must be linked together. 
If patients’ identity attributes are used (e.g., 
name, address, date of birth), matching and 
linking errors often occur because many 
patients share attributes, attributes change 
over time, and clerical errors are common. 
Many countries have adopted a unique 
health identifier (UHI) to facilitate these 
processes. However, in response to concerns 
of privacy advocates, the U.S. Congress prohibited
HHS from spending federal dollars
to support development of a UHI. There
is little evidence that suggests UHIs pose an 
increased risk of privacy violations and, in 
fact, not having a UHI may be even more 
risky because many other kinds of personal 
data may be collected and used instead 
(Greenberg et al. 2009). But UHIs require 
substantial federal resources to implement 
and may not address all matching and link- 
ing issues. Currently, some estimates suggest 
that errors in linking records shared across 
providers in the U.S. can be as high as 50%. 

Policymakers are therefore interested in 
alternative approaches, which include improv- 
ing linking algorithms to better match iden- 
tity attributes (e.g., name, address, date of 
birth), defining standards for the identity 
attributes, using biometrics-based methods, 
and allowing patients to participate more 
directly in the process, such as by verifying 
their phone number with their mobile phone 
or managing their data on their smartphone 
(Rudin et al. 2018). There are advantages and 
disadvantages to every approach, and it is 
likely that multiple approaches will be needed 
to substantially reduce matching and linking 
errors (Pew 2018). Policymakers may play a 
critical role in overseeing progress and sup- 
porting research to develop and more rigor- 
ously evaluate solutions. 
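
To make the idea of attribute-based linking concrete, the following is a minimal sketch of how a linking algorithm might compare two records on the identity attributes mentioned above (name, address, date of birth). It is an illustration only, not a method described in this chapter: the record fields, the weights, and the match threshold are all hypothetical, and the string comparison uses Python's standard difflib module rather than any purpose-built matching library.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class PatientRecord:
    """Identity attributes from the text; the field names are illustrative."""
    name: str
    address: str
    date_of_birth: str  # assumed ISO format, e.g. "1980-04-12"


def normalize(value: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in value.lower()
    )
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# Hypothetical weights and threshold, chosen only for illustration.
WEIGHTS = {"name": 0.4, "address": 0.2, "date_of_birth": 0.4}
MATCH_THRESHOLD = 0.85


def match_score(r1: PatientRecord, r2: PatientRecord) -> float:
    """Weighted sum of per-attribute similarities."""
    return sum(
        weight * similarity(getattr(r1, field), getattr(r2, field))
        for field, weight in WEIGHTS.items()
    )


def is_probable_match(r1: PatientRecord, r2: PatientRecord) -> bool:
    """Treat two records as the same patient if the score clears the threshold."""
    return match_score(r1, r2) >= MATCH_THRESHOLD


if __name__ == "__main__":
    a = PatientRecord("Maria Lopez", "12 Oak St.", "1980-04-12")
    b = PatientRecord("maria  lopez", "12 Oak St", "1980-04-12")
    print(is_probable_match(a, b))
```

A sketch like this also shows why such approaches remain error-prone: two different patients who share a name and birth date can score highly, while one patient whose address has changed can score lower, which is consistent with the chapter's point that multiple approaches will likely be needed.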


29.6 The Growing Importance
of Public Policy in Informatics


Public policy is becoming increasingly impor- 
tant to the field of informatics. Policies affect 
everything from what research projects receive 
funding to whether a physician in a solo prac- 
tice allows her patients to access their medical 
records online. Many of the health IT policy 
issues we discuss in this chapter are just begin- 
ning to attract attention from policymakers, 
and further research is needed to understand 
the best role for policy. It is likely that new 
policy issues will emerge as technology capa- 
bilities become more advanced. For example, 
artificial intelligence may help with many clin- 
ical applications, but policies may be needed 
to ensure it is applied safely and to ensure 
accountability. 

Traditionally, most informatics research 
has focused on the development of new tech- 
nologies and how they integrate into clinical 
practice. Relatively few studies provide advice 
to policymakers on health IT policy issues, 
even though policies have enormous conse- 
quences for informatics research and practice. 
We hope that researchers and policymakers 
will recognize that technology and policy 
issues affect each other, and it is necessary to 
use both perspectives to understand how 
information technology can be used to 
improve health care. 


Suggested Readings

Agency for Healthcare Research and Quality.
(2013). A robust health data infrastructure.
Retrieved from McLean, VA:
https://www.healthit.gov/sites/default/files/ptp13-700hhs_white.pdf.
This white paper makes the case for
public policy to promote open APIs to
improve interoperability and data exchange,
and to promote innovation in healthcare.

Bloomfield, R. A., Jr., Polo-Wood, F., Mandel, 
J. C., & Mandl, K. D. (2017). Opening the Duke 
electronic health record to apps: Implementing 
SMART on FHIR. International Journal 
of Medical Informatics, 99, 1-10.
https://doi.org/10.1016/j.ijmedinf.2016.12.005. This
research study discusses a successful early
attempt to use APIs within a live EHR and
emerging technical standards to implement
patient- and provider-facing apps.

Clarke, R., & Youngstein, T. (2017). Cyberattack 
on Britain’s National Health Service — a wake- 
up call for modern medicine. New England 
Journal of Medicine, 377(5), 409-411.
https://doi.org/10.1056/NEJMp1706754. This brief
perspective describes a harrowing cyberattack
on the U.K.’s healthcare system and offers
suggestions to help improve preparedness.

Jones, S. S., Heaton, P. S., Rudin, R. S., & 
Schneider, E. C. (2012). Unraveling the IT 
productivity paradox — lessons for health care. 
New England Journal of Medicine, 366(24), 
2243-2245. This brief perspective addresses 
the contentious issue of why few studies have 
been able to show that health IT produces an 
improvement in economic productivity. It 
makes its case by pointing out that the IT 
industry had the same problem in the 1980s 
and 1990s but managed to overcome these dif- 
ficulties through better measurement of pro- 
ductivity, improved management of 
technology, and better usability. 

Sinsky, C., Colligan, L., Li, L., Prgomet, M., 
Reynolds, S., Goeders, L., et al. (2016). 
Allocation of physician time in ambulatory 
practice: A time and motion study in 4 spe- 
cialties. Annals of Internal Medicine, 165(11), 
753-760. https://doi.org/10.7326/M16-0961.
This study reported direct observation of 57
U.S. physicians and found that they spent almost
50% of their time on EHR and desk work,
much more than they spent on direct
clinical face time with patients. Other work by
some of the same authors has identified
EHRs as a source of professional dissatisfac- 
tion and burnout. 

Sittig, D. F., & Singh, H. (2012). Electronic
health records and national patient-safety
goals. New England Journal of Medicine,
367(19), 1854-1860.
https://doi.org/10.1056/NEJMsb1205420. This article
proposes a 3-phased approach to implementing EHRs
in a way that improves safety: address safety
concerns unique to EHR technology, mitigate
safety concerns arising from failure to use
EHRs appropriately, and use EHRs to monitor
and improve patient safety.


Questions for Discussion

1. What are the key barriers to effective 
use of EHRs and exchange of health 
information? Which of these challenges 
are amenable to public policy decisions? 

2. What are the key barriers to innovation 
in health IT? What can be done to accel- 
erate innovation? 

3. What might be some of the tradeoffs of 
using administrative claims data com- 
pared with using clinical data from health 
IT systems for care quality analysis? 

4. What might be some of the tradeoffs of 
promoting health IT by paying for use 
compared with paying for quality? 

5. Should health IT be regulated the same 
way as devices are regulated to protect 
patient safety? Why or why not? 

6. If research finds strong evidence of a 
digital divide in health IT, what policy 
actions should be taken? 

7. What kinds of health IT functionality 
are needed to support accountable care 
organizations and patient-centered
medical homes?


Learning Objectives

After reading this chapter, you should know
the answers to these questions:

- What does the past evolution of the field
  of biomedical informatics tell us about
  its future trajectory?
- How will data science methods influence
  biomedical informatics research?
- What roles will electronic health records
  and artificial intelligence play in health
  care of the future?


30.1 The Present and Its Evolution 
from the Past 


Every good look forward should start with a look back to provide
perspective regarding the past and an assessment of the pace of change,
thereby helping us to anticipate a trajectory for the future. This book
first appeared in 1990 at a time when the field was much younger (the
word informatics had come into common use only in the previous decade)
and was still being defined. Thus that early edition, and the ones that
followed (in 2000, 2006, and 2014), offer a glimpse of what topics
appeared over time, which ones faded away, and how even the terminology
evolved (as it will no doubt continue to do in the future). Consider,
for example, the list of chapter titles from the 1990 edition
(Table 30.1). The first edition was titled Medical Informatics:
Computer Applications in Medical Care, reflecting the field’s original
roots in clinical medicine. In those days, the field was called medical
informatics (see Chap. 1) and the first edition was focused largely on
clinical application areas, such as electronic health records, nursing
systems, laboratory systems, radiology systems, and education systems.


Table 30.1 Table of contents sections and chapters from all five editions of this book, aligned by subject matter


Medical Informatics: Computer Applications in Medical Care (1990)

  Recurrent themes in medical informatics
    1. The computer meets medicine: Emergence of a discipline
    2. Medical data: Their acquisition, storage, and use
    3. Medical decision making: Probabilistic medical reasoning
    4. Essential concepts for medical computing
    5. System design and evaluation

  Medical computing applications
    6. Medical-record systems
    7. Hospital information systems
    8. Nursing information systems
    9. Laboratory information systems
    10. Pharmacy systems
    11. Radiology systems
    12. Patient-monitoring systems
    13. Information systems for office practice
    14. Bibliographic-retrieval systems
    15. Clinical decision-support systems
    16. Clinical research systems
    17. Computers in medical education
    18. Health-assessment systems

  Medical informatics in the years ahead
    19. Health-care financing and technology assessment
    20. The future of computer applications in health care


Medical Informatics: Computer Applications in Health Care and Biomedicine (2000)

  Recurrent themes in medical informatics
    1. The computer meets medicine and biology: Emergence of a discipline
    2. Medical data: Their acquisition, storage, and use
    3. Medical decision-making: Probabilistic medical reasoning
    4. Essential concepts for medical computing
    5. System design and engineering
    6. Standards in medical informatics
    7. Ethics and health informatics: Users, standards, and outcomes
    8. Evaluation and technology assessment

  Medical computing applications
    9. Computer-based patient record systems
    10. Management of information in integrated delivery networks
    12. Patient care systems
    11. Public health and consumer uses of health information
    14. Imaging systems
    13. Patient monitoring systems
    15. Information retrieval systems
    16. Clinical decision support systems
    17. Computers in medical education
    18. Bioinformatics

  Medical informatics in the years ahead
    19. Health care and information technology: Growing up together
    20. The future of computer applications in health care


Biomedical Informatics: Computer Applications in Health Care and Biomedicine (2006)

  Recurrent themes in biomedical informatics
    1. The computer meets medicine and biology: Emergence of a discipline
    2. Biomedical data: Their acquisition, storage, and use
    3. Biomedical decision making: Probabilistic clinical reasoning
    5. Essential concepts for biomedical computing
    4. Cognitive science and biomedical informatics
    6. System design and engineering in health care
    7. Standards in biomedical informatics
    8. Natural language and text processing in biomedicine
    9. Imaging and structural informatics
    10. Ethics and health informatics: Users, standards, and outcomes
    11. Evaluation and technology assessment

  Biomedical informatics applications
    12. Electronic health record systems
    13. Management of information in healthcare organizations
    16. Patient-care systems
    15. Public health informatics and the health information infrastructure
    14. Consumer health informatics and telehealth
    18. Imaging systems in radiology
    17. Patient-monitoring systems
    19. Information retrieval and digital libraries
    20. Clinical decision-support systems
    21. Computers in medical education
    22. Bioinformatics

  Biomedical informatics in the years ahead
    23. Health care financing and information technology: A historical perspective
    24. The future of computer applications in biomedicine


Biomedical Informatics: Computer Applications in Health Care and Biomedicine (2014)

  Recurrent themes in biomedical informatics
    1. Biomedical informatics: The science and the pragmatics
    2. Biomedical data: Their acquisition, storage, and use
    3. Biomedical decision making: Probabilistic clinical reasoning
    4. Cognitive science and biomedical informatics
    5. Computer architectures for health care and biomedicine
    6. Software engineering for health care and biomedicine
    7. Standards in biomedical informatics
    8. Natural language processing in health care and biomedicine
    9. Biomedical imaging informatics
    10. Ethics in biomedical and health informatics: Users, standards, and outcomes
    11. Evaluation of biomedical and health information resources

  Biomedical informatics applications
    12. Electronic health record systems
    13. Health information infrastructure
    14. Management of information in health care organizations
    15. Patient-centered care systems
    16. Public health informatics
    17. Consumer health informatics and personal health records
    18. Telehealth
    20. Imaging systems in radiology
    19. Patient monitoring systems
    21. Information retrieval and digital libraries
    22. Clinical decision-support systems
    26. Clinical research informatics
    23. Computers in health care education
    24. Bioinformatics
    25. Translational bioinformatics

  Biomedical informatics in the years ahead
    27. Health information technology policy
    28. The future of informatics in biomedicine


Biomedical Informatics: Computer Applications in Health Care and Biomedicine (2020)

  Recurrent themes in biomedical informatics
    1. Biomedical informatics: The science and the pragmatics
    2. Biomedical data: Their acquisition, storage, and use
    3. Biomedical decision making: Probabilistic clinical reasoning
    4. Cognitive informatics
    5. Human-computer interaction, usability, and workflow
    6. Software engineering for health care and biomedicine
    7. Standards in biomedical informatics
    8. Natural language processing for health-related texts
    10. Imaging and structural informatics
    9. Bioinformatics
    11. Personal health informatics
    12. Ethics in biomedical and health informatics: Users, standards, and outcomes
    13. Evaluation of biomedical and health information resources

  Biomedical informatics applications
    14. Electronic health records
    15. Health information infrastructure
    16. Management of information in health care organizations
    17. Patient-centered care systems
    18. Population and public health informatics
    19. mHealth and applications
    20. Telemedicine and telehealth
    22. Imaging systems in radiology
    21. Patient monitoring systems
    23. Information retrieval
    24. Clinical decision-support systems
    27. Clinical research informatics
    25. Digital technology in health science education
    (see Chap. 9 under “Recurrent themes in biomedical informatics”, above)
    26. Translational bioinformatics
    28. Precision medicine and informatics

  Biomedical informatics in the years ahead
    29. Health information technology policy
    30. The future of informatics in biomedicine


The next decade was revolutionary, however,
and it had a profound effect on informatics.
During the 1990s, the Human Genome
Project made it clear that much of what needed 
to be accomplished in human biology and genet- 
ics could not be achieved with the use of the 
computational methods available or introduced 
at that time. Many of the informatics techniques 
that had been developed in the clinical world 
became relevant to genomics research, where 
investigators coined the term bioinformatics for 
their computational explorations. Thus, the field 
of informatics began to broaden to span both 
basic and applied clinical sciences. In an effort to 
acknowledge this evolution, the second edition 
of this book was renamed, with “medical care” 
giving way to “health care” (to acknowledge the 
field’s growing role in prevention and public 
health) and the addition of “biomedicine” (to 
embrace the role of informatics in human biology
research) (Table 30.1). In addition, a new
chapter on bioinformatics was added to the edi- 
tion when it appeared in 2000. Similarly, the sec- 
ond edition took a broader view to consider 
topics such as standards, ethics, integrated deliv- 
ery networks, and public health. Bibliographic 
retrieval expanded to be information retrieval, 
and there were changes in emphasis in several 
other chapters as well. 

In an attempt to acknowledge and empha- 
size the shared methods that applied in both 
the human life sciences and in clinical medicine 
and health, the academic discipline began to 
change its name from “medical informatics” to 
“biomedical informatics”. Several departments 
were renamed or created with this new name 
for the field. Hence, when the third edition of 
this book appeared in 2006, it adopted the title 
Biomedical Informatics, discarding the more 
limited “medical informatics” focus. Although 
several chapters were simply updated and some 
were deleted, others were divided into two 
components (e.g., the Imaging Systems chapter 
from the second edition was divided into a 
methodologic chapter on imaging/structural 
informatics plus an application chapter on 
Imaging Systems in Radiology) (Table 30.1).
Furthermore, totally new chapters were drawn 
from other fields, including cognitive science, 
natural language processing, and consumer-facing
systems. In addition, chapter authorship
evolved substantially as new topics were
introduced and authors from earlier editions 
brought on coauthors whose expertise comple- 
mented their own. 

The book title remained unchanged in the 
fourth edition in 2014 (and in this edition), 
but changes to the chapter titles show in more
detail what was evolving (Table 30.1). The
fourth edition introduced several new topics, 
including telehealth, translational bioinformat- 
ics, and clinical research informatics. And the 
current edition has added new chapters in the 
areas of human-computer interaction, 
mHealth, and precision medicine. 

Thus, a review of the titles and tables of 
contents of the five editions of this book, span- 
ning 30 years with 20 chapters at the outset 
evolving to 30 chapters now, provides a thumb- 
nail view of the evolution of the field as a 
whole. So, what evolution can we observe and 
what does it tell us about where we are headed? 
We see a field that started with a strong focus 
on computer programming for clinical medi- 
cine and ancillary services. The field grew to 
embrace research areas related to medicine and 
health, ranging from molecular to biologic sys- 
tems to organisms, and beyond to populations. 
And, as with any emerging discipline, biomedi- 
cal informatics began to differentiate its activ- 
ities into practice (the lion’s share), research 
(not just biomedical research informatics, but 
research on informatics in its own right), and 
education. Connections among these three 
types of activities and overlap across domains 
were often scant, as shown in Fig. 30.1. Four
trends, described throughout this book, have
blurred many of the distinctions (Fig. 30.2).

First, the broader field of biomedicine itself has begun to blur the
distinctions among its traditional research domains. This is evident in
the emergence of the “translational science” philosophy, with clear
recognition that each scientific endeavor builds on the discoveries
made at some other, usually smaller scale. One culmination of this
trend is precision medicine (see Chap. 28), in which discoveries at the
genomic level are translated into knowledge that supports decisions for
patients and populations.

[Figure: grid relating informatics activities (Informatics Practice,
Informatics Research, Informatics Education) to biomedical domains
(Basic Science, Translational Science, Clinical Research, Patient Care,
Public Health)]

Fig. 30.1 Prior state of biomedical informatics domains and activities.
Clinical informatics was the earliest and continues to be the largest
domain, as reflected in this book, with other domains following in
time. The advent of the Human Genome Project led to rapid expansion of
the application of bioinformatics. Research in each domain followed
practice, depicted here as more or less semi-transparent connections.
Education in informatics, both for research and practice, began in the
clinical domain, especially with nursing informatics, to be followed by
nascent bioinformatics training programs. Connections between education
and research varied.

[Figure: the same grid of informatics activities (Informatics Practice,
Informatics Research, Informatics Education) and biomedical domains
(Basic Science, Translational Science, Clinical Research, Patient Care,
Public Health)]

Fig. 30.2 Today, biomedical informatics is becoming a continuum, with
fewer distinctions among domains and activities. This mirrors the
continuum of biomedicine, with its recognition of the translational
nature ranging from basic science to public health (1), exemplified by
precision medicine, which draws on genomic knowledge to directly
improve care of the individual patient. Increased clinical informatics
activity has resulted in increased availability of clinical data, which
informs research to produce better evidence to guide practice,
resulting in a “learning health system” (2). Users of informatics
applications in research and practice settings are increasingly seen as
research subjects in “living laboratories” (3) for guiding the
improvement of the tools they use and learning new ways to apply
informatics methods. Finally, education and training in informatics is
increasingly leaving the classroom and moving to practice sites for
observation and learning and as settings for experimenting with new
solutions (4).
(https://systems.jhu.edu/research/public-health/2019-ncov-map-faqs/.
Copyright 2020, Johns Hopkins University. All rights reserved)

Second, the learning health system (Chaps. 1 and 17) is making use of
large-scale data collected from patients and populations to frame
research questions that are answered in the laboratory. Knowledge
discovered there is then returned to the point of care to support
evidence-based diagnostic, preventive and therapeutic decisions.

Third, informatics research is moving from the computer lab out to
where potential users of informatics tools are actually working.
Harnessing tools from cognitive science and mixed methods evaluation,
the biological laboratory, the clinic, and the hospital are becoming
living informatics laboratories for testing informatics ideas and
observing their impact.

Fourth, the connections between informatics education and informatics
practice are being strengthened. As electronic health records have
become ubiquitous in clinical practice, computing devices and
information technologies have become virtually the only tools used by
every health care provider. Therefore, the importance of rigorous
training in the use of these tools has increased. Informatics training
programs are now able to train their students with the living
laboratories by studying how research and patient care systems are
being used and can be improved. The long tradition of formal
informatics training in nursing programs is now being adopted in
clinical medicine, which has recently added clinical informatics as a
board-certified subspecialty through the American Board of Medical
Specialties (ABMS).

J.J. Cimino et al.

[Figure: COVID-19 dashboard showing global cases (62,616,821), deaths,
and recovered]

Fig. 30.3 An example of the application of informatics to increase
availability of large data sets and to facilitate their processing for
public consumption. Depicted is a COVID-19 dashboard developed at the
Johns Hopkins University Center for Systems Science and Engineering for
presenting COVID-19 data ingested from a variety of sources to allow
lay people easy access to up-to-date information on the COVID-19
pandemic in their area (Dong et al. 2020). (https://www.jhu.edu/.
Copyright 2020, Johns Hopkins University. All rights reserved)


30.2 Looking to the Future


Given the general trends we have outlined, what can we expect in terms
of specific advances for the field? For that, we invited seven
visionaries to share their predictions for the directions we will, or
at least should, be moving. We chose innovative thinkers who could
provide insights on the future of biomedical informatics from a variety
of perspectives: bioinformatics (Tarczy-Hornoch), industry (Horvitz),
nursing (Murphy), health policy (Blumenthal), academic informatics
(Frisse), clinical medicine (Wachter), and federal government
(Brennan). Individually, they provide perspectives on government
efforts, policy changes, research advances, and clinical practice.
Together, they weave a rich tapestry that presages how biomedical
informatics is likely to influence the twenty-first century. As it
happens, shortly after these pieces were written, events unfolded to
put these predictions to the test.

Writing in early 2020, we are referring, of
course, to the COVID-19 pandemic. The chap-
ters in this book were largely written in the
preceding year or two. Some have been updat-
ed to discuss the current situation (e.g., see
Chap. 18), but no textbook can keep current
with the rapidly unfolding events of this cur-
rent natural disaster. The level of public inter-
est in biomedical information, ranging from
virology and immunology to pharmacology
and epidemiology, has risen to unprecedented
levels. Up-to-the-minute data are being pro-
vided with great volume, variety and velocity
(the hallmarks of “big data”; see Chap. 13)
through government, academic and news
media sources for popular consumption
(Fig. 30.3).¹ All of this requires rapid devel-
opment and delivery of informatics solutions
on an unprecedented scale, along with
approaches for confirming the veracity of data.


¹ https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6
(accessed 2/13/2021).




The lessons and predictions discussed by 
our guest visionaries can be applied directly to 
the care of patients with suspected or con-
firmed COVID-19. For example, Tarczy-
Hornoch (Box 30.1) describes correlation
of genomic and functional data with clinical
outcomes data. Thanks to adoption and
interoperability of electronic health records
(Chaps. 14 and 15), sufficient data are
becoming available to provide an understand-
ing of risk factors for disease severity, as well
as the benefits and risks of putative treatments, much
more rapidly than could be achieved with for-
mal human subject studies (Xu et al. 2020).
Horvitz (Box 30.2) provides an inventory of
the methods drawn from the field of artificial
intelligence (see Chaps. 1 and 24) that stand
ready to use these data to infer answers to such 
pressing questions through machine learning. 
He also describes supplemental methods for 
“machine teaching” that can be brought to 
bear when some data remain sparse (Feijoo 
et al. 2020). 

Murphy (Box 30.3) describes the advantages
of tele-visits for improving access to
care; the social distancing required for helping 
to control the pandemic provides additional 
incentive for care at a distance, even for those 
who otherwise have sufficient access to in- 
person care. Fortunately, the technology of 
tele-visits (Chap. 20) has progressed to the
point where healthcare institutions have been 
able to make the necessary transition with an 
ease that would not have been possible 10 
years earlier (Hong et al. 2020). 

Blumenthal (Box 30.4) enumerates issues
related to safety of information systems and the 
government’s role in developing policies that 
address privacy concerns (Chaps. 12 and 29).
The immediate need for such policies relates to 
the balance between individual rights and the 
protection of the public, as software developers 
race to create patient contact tracing applica- 
tions (Abeler et al. 2020) and patients begin to 
collect their own intimate, detailed data through 
the wearables mentioned by Frisse (see also 
Chap. 19 and Ding et al. 2020).

Frisse (Box 30.5) recognizes that with the
success of informatics, and its growing impact
on both science and health, the challenges and
complexities involved with “doing it right”
extend beyond the protection of data privacy.
Other impacts on society may be both deep
and unanticipated, with the possibility that
unintended consequences may exacerbate dif-
ferences in how different people, with dif-
ferent education, cultures, and financial
means, may experience health care and manage
their own health. Unintended consequences of
technology have been rampant in many fields 
(did anyone anticipate that television would 
engender generations of “couch potatoes”?). 

Wachter, who is well known for his provoca- 
tive book characterizing the “digital doctor” 
who is already somewhat upon us (Wachter 
2017), focuses on the informatician (or infor- 
maticist) of the future and the impact that 
such individuals will have in clinical settings 
(Box 30.6). He acknowledges the problems
that are highlighted in his popular book, but 
envisions an ultimately positive future in 
which “the experience of being both patient 
and healthcare professional will be far more 
satisfying”, due in part to the role that infor- 
matics, and those who practice this new spe- 
cialty, will play in the clinical environment. 

Tarczy-Hornoch, Horvitz, Murphy, Blumen- 
thal, Frisse, and Wachter all describe ways in 
which clinicians’ workflow can be influenced 
for the better through informatics, with par- 
ticular attention to their quality of life, which 
the pandemic has demonstrated can require as 
much attention, for some individuals, as do 
the lives of the patients they serve (Dewey 
et al. 2020). And of course, all of these activi- 
ties are supported through access to data and 
literature (including the works cited here), 
made possible by the National Institutes of 
Health, with the National Library of Medi- 
cine at the fore (Zayas-Caban et al. 2020), as 
well as other governmental and nongovern- 
mental organizations. As Brennan notes in 
her perspective (Box 30.7), the federal government
must provide the resources for the
things that only it can do (such as gathering 
and consolidating epidemiologic data) and 
collaborate with leaders in the private sector 
that can provide additional breadth and depth 
of expertise. The COVID monitoring dashboard
shown in Fig. 30.3 is just the tip of
an iceberg of such cooperation (in this case
between the Centers for Disease Control &
Prevention and Johns Hopkins University),
when one considers all the work that went 
into obtaining the underlying data. 

Today, biomedical informatics is moving out 
of the shadows. Instead of hearing “biomedi- 
cal informatics? What’s that?”, we hear “How 
is biomedical informatics helping to solve this 
problem?” There are many answers and 
although they may seem specific to the current 
pandemic, they will remain applicable long
after the current challenges are overcome.
Public recognition of the importance of infor-
matics will lead to increased resources for
research, increased interest in education and
training, and new opportunities for applica-
tions in biomedicine, in preparation for the
inevitable challenges we know to anticipate.
The intent of this book is to prepare those
who wish to understand, support and lead
these changes.


Box 30.1 A Perspective on the Future
of Translational Bioinformatics and
Precision Medicine

Peter Tarczy-Hornoch


The first part of the twenty-first century saw the
establishment of the fields of translational bio-
informatics (TBI) and precision medicine (PM),
accompanied by the movement of this research
into the early T-phases [1] of translational
research (e.g. T0-T2 research focused on discov-
ery and early application). The next decade will
see the fields move into the later T-phases of
broader adoption and diffusion (T3) and into
evaluating the population impact in terms of
health outcomes (T4). Due to shared core meth-
odologies plus pressures on the health system,
T3/T4 research will demonstrate a convergence
between TBI/PM (predicting individual out-
comes) and integration with more population-
based approaches such as comparative
effectiveness research and, more broadly, the
concept of the learning healthcare system.

The earliest work in TBI and PM focused on
the identification of opportunities and new
approaches (T0), discovery to early health appli-
cations (T1), and assessment of value (T2). In
the TBI area, T0 work focused on studies pilot-
ing the combination of both genomic data and
electronic health record data for discovery (e.g.
the early phase of the eMERGE project) and
proof-of-concept T1 translational applications
of genomic discoveries to clinical care (e.g. tar-
geted pharmacogenomic decision-support sys-
tems). We also see an emerging body of T2
translational research that is beginning to assess
the value of these new discoveries for health
practice and for the development of evidence- 
based guidelines. The validation of genomic dis- 
covery and demonstration of its suitability for 
widespread adoption (T2/T3) is just beginning. 
This evolution can be illustrated on the clinical 
T2 front by the work of the American College of 
Medical Genetics, which monitors new genomic 
discoveries to identify what secondary findings 
in genome and exome sequencing meet criteria 
for reporting [2]. Thus far only selected muta- 
tions of around 60 genes (out of over 20,000 in 
the genome) meet the ACMG’s rigorous criteria 
for clinical reporting. The clinical validation of 
new discoveries facilitated by TBI is a key step in 
the development of informatics tools that apply 
this knowledge to practice (e.g. decision-support 
tools). As an example of informatics T2 work, 
researchers have begun to assess the cost/benefit 
of genomic decision-support tools in the elec- 
tronic health record [3]. The research and appli- 
cation of PM informatics approaches for T1 
discovery and T2 application parallel those of 
TBI (oftentimes incorporating genomic ele- 
ments as part of the input data for the develop- 
ment of predictive models). 

In the coming decade the types and volume 
of data used for TBI and PM discovery and 
application will continue to expand and the dis- 
tinctions between TBI and PM will blur even 
further. As the cost of genome sequencing con- 
tinues to drop, increasing numbers of patients 
will have genotypic information available to cor- 
relate with clinical and other information, which 
will enable both larger scale discovery and appli- 
cation. In the cancer domain, for example, new 
single-cell sequencing approaches will provide 
additional granular data on a specific patient’s 
clonal mutational profiles. In the metabolomics
and proteomics areas, the cost of gathering these 
data is dropping, both at the patient and more 
targeted (e.g. organ) levels. The ability to begin 
to correlate these functional data with clinical 
outcome data and with response-to-therapy 
data will provide powerful new biological and 
clinical insights. These new sources of biological 
process data will complement new sources of 
phenotypic and environmental data. Text min- 
ing will enable free-text notes describing pheno- 
type and environment (e.g. social determinants 
of health) to be transformed into more discrete 
data suitable for machine learning. Increasing 
availability of geocoded environmental data (e.g. 
climate data, pollution data, air quality data, 
pollen counts, etc.) will enable cross-links to 
patient data. With patient engagement and per- 
mission (and substantial work on standards and 
security), specific environmental data from the 
Internet of Things (e.g. lighting and temperature 
data in a home) may also be linked with genom- 
ic, biological and clinical data for a patient. 
Similarly other patient data can be integrated, 
such as questionnaire and survey data (patient-
reported outcomes and measures) and data
from consumer wearables (counts of steps, heart 
rate monitoring, sleep monitoring) as well as 
consumer medical devices (home glucose and 
blood pressure monitoring, and, recently, more 
experimental transdermal monitoring of meta- 
bolic processes). This increase in data about 
individual patients, as well as the number of 
patients for which these rich data are available, 
will greatly accelerate the T0-T1 discovery and
initial clinical application phases of TBI/ 
PM. The volume of data and potential out- 
comes are such that the informatics tools will 
become ever more important for discovery. 
Already health care providers struggle with 
information overload and with the need to be 
current on new medical discoveries. The antici- 
pated complexity and volume of new findings 
and correlations will be such that computer- 
based decision-support tools will be obligatory 
for application of these new findings. All of 
these approaches will fit into the paradigm of 
using predictions to provide early/preventive 
interventions that are tailored to the unique profile
of the individual patient. The core methods
and approaches used for analysis and discovery, 
and the ones used for decision support, will be 
fundamentally similar, whatever mix of input 
variables is used across the spectrum of genetic, 
biologic, clinical, patient provided, or environ- 
mental data. In light of this, the distinction 
between TBI and PM will likely vanish. 

There are a number of promising data ana-
lytics methods currently under development
that are likely to be useful in the TBI/PM infor- 
matics area. One category is the creation of 
more automated model-selection and tuning 
methods. Without these it will be difficult to 
scale a number of the approaches currently 
being used since they are dependent on the 
involvement of human data scientists. Similarly, 
there is foundational work being done in aca- 
demia and industry that is seeking better unsu- 
pervised learning approaches. These are needed 
because the ability to develop gold-standard 
training sets is now often constrained by the 
amount of human effort required. 

Another broad category is methods that pro- 
vide some explanatory power related to predic- 
tions. As one example, current machine learning 
approaches identify correlation but generally 
cannot provide insight into causation. New 
approaches show promise when they leverage 
large enough data sets to begin to infer causa- 
tion. Another example is using automated tools 
both to develop predictive models and then to 
use artificial intelligence techniques to develop 
an explanatory model. Both these examples 
illustrate ways in which new methods may begin 
to address the concern raised by some overly 
opaque “black box” predictive models. A final 
broad category is methods that begin to leverage 
available data more effectively, including new 
AI-based image-analysis approaches, next-gen- 
eration hybrid statistical and rules-based text- 
mining approaches, and new approaches to 
improve the use of temporal information in pre- 
diction algorithms (e.g. the slope and tempo of 
visits and laboratory values). 
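
To make the idea of temporal features concrete, the following is a minimal sketch of how the "slope and tempo" of a laboratory series might be turned into inputs for a prediction algorithm. It is an illustration only, not a method described in this commentary: the function names, the choice of a least-squares slope, and the creatinine values are all assumptions made for the example.

```python
from datetime import date

def slope_per_day(times, values):
    """Ordinary least-squares slope of values against elapsed time in days."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two measurements")
    xs = [(t - times[0]).days for t in times]
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("measurements must fall on more than one day")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    return sxy / sxx

def mean_gap_days(times):
    """Average number of days between consecutive measurements (the 'tempo')."""
    if len(times) < 2:
        raise ValueError("need at least two measurements")
    gaps = [(b - a).days for a, b in zip(times, times[1:])]
    return sum(gaps) / len(gaps)

# Hypothetical creatinine series (mg/dL); dates and values are invented.
dates = [date(2020, 1, 1), date(2020, 1, 11), date(2020, 1, 21), date(2020, 1, 31)]
creatinine = [1.0, 1.2, 1.4, 1.6]

features = {
    "slope_per_day": slope_per_day(dates, creatinine),  # rate of change
    "mean_gap_days": mean_gap_days(dates),              # how often it is measured
}
```

A model that sees only the latest value (1.6) cannot distinguish a stable patient from one whose value is rising steadily; the two derived features carry that extra information.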

The rapidly rising costs of health care in the 
United States, without a corresponding improve- 
ment in quality, will influence the development 
of informatics tools for precision medicine. 
Broadly this will mean that work in the TBI/PM
informatics space will need to factor in the per- 
spectives of the Quadruple Aim: (1) enhancing 
patient experience, (2) improving population 
health, (3) reducing costs, and (4) improving the 
work life of healthcare providers. Regarding the 
first of these, it will be important to ensure that 
predictive model-based decision-support tools 
are built in ways to ensure the pursuit of shared 
decision making involving the patient. Tools 
must also provide the appropriate support for 
ensuring that behavioral changes occur (e.g. if a 
model predicts the need for increased aerobic 
exercise, there must be methods to ensure that 
occurs). It will similarly be important to ensure, 
as data are shared and models are developed, 
that attention is paid to ethical, legal and social 
aspects of data sharing. This will help to main- 
tain the trust of patients and to avoid unintend- 
ed biases in the models (consider, for example, 
the recent issues with facial recognition software 
that works well on white males but not on wom- 
en of color). The ethical lens will be particularly 
important to ensure that the privacy and trust of 
patients and public are preserved as these large 
scale data-intensive methods are developed and 
deployed. The recent academic and popular 
press discussions of the breaches of trust by large
scale social media and other Internet companies 
should serve as a cautionary tale. 

Regarding the next two elements in the 
Quadruple Aim, it will be critical that the work 
in TBI/PM be subject to the same kinds of 
assessments that we expect for other diagnostic 
and therapeutic interventions. Currently there 
are deployed tools that demonstrate a sensitiv- 
ity and specificity that are far below the values 
that we would otherwise demand of diagnostic 
and screening tests. Informatics interventions 
have not been treated in quite the same way as 
laboratory tests or medications. Efforts to dem- 
onstrate value and real-world impact of TBI/ 
PM tools will align with broader efforts to dem- 
onstrate effectiveness in the real world (e.g. 
Comparative Effectiveness Research). They 
will also form a key aspect of the Learning 
Healthcare System approach, since TBI/PM 
tools will help to assure that learning can occur 
from analysis of the data artifacts generated in
the care-delivery process (e.g. the electronic
health record and related data).
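
As a reminder of what such an assessment involves, the sketch below computes sensitivity and specificity for a predictive tool from paired predictions and confirmed outcomes. The function name and the ten example cases are hypothetical and serve only to show the arithmetic.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for paired binary predictions and outcomes."""
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        if p and a:
            tp += 1          # tool flagged a true case
        elif p and not a:
            fp += 1          # tool flagged a non-case
        elif a:
            fn += 1          # tool missed a true case
        else:
            tn += 1          # tool correctly stayed silent
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)

# Invented example: 4 patients with the condition, 6 without.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(predicted, actual)
```

Here the tool finds 3 of 4 true cases (sensitivity 0.75) and correctly clears 5 of 6 non-cases (specificity about 0.83), the same figures one would report for a laboratory test.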

Finally, in order to address the fourth ele- 
ment in the Quadruple Aim, we will need to 
determine how best to deploy predictive ana- 
lytics tools. It will be important to preserve 
provider decision-making autonomy, to pro- 
vide sufficient explanatory ability and rigor- 
ous validation to ensure that providers trust 
the results, and to diminish the information 
and alert overload that providers face today. 

In summary, we have just begun to see TBI 
and PM informatics discoveries and applica- 
tions have an impact on achieving the broader 
goals of improving health and the more focused 
goals of the Quadruple Aim. Over the next 
decade, with advances in data analytics methods 
and increasing sources of data regarding an 
increasing number of patients, we are likely to 
see remarkable progress in the development of 
more easily developed and more accurate pre- 
dictive models that will allow us to intervene at 
the patient level. These advances will be inte- 
grated into the broader trends in health care as 
encapsulated in the Quadruple Aim, which will 
require additional research and innovation to 
ensure that the full potential of TBI and PM is
realized.


Box 30.2 The Future of Biomedical
Informatics: Bottlenecks and Opportunities

Eric Horvitz 


I see the rich, interdisciplinary field of biomed- 
ical informatics as the gateway to the future of 
health care. The concepts, methods, rich histo- 
ry of contributions, and the aspirations of bio- 
medical informatics define key opportunities 
ahead in biomedicine—and shine light on the 
path to achieving true evidence-based health 
care. 

Progress with influences of biomedical 
informatics on health care over the last three 
decades has been slower than I had hoped. 
However, I remain optimistic about a forth- 
coming biomedical informatics revolution, 
made possible by a confluence of advances 
across industry and academia. Such a revolu- 
tion will accelerate discovery in biomedicine, 
enhance the quality of health care, and reduce 
the costs of health care delivery. 

From my perch as an investigator and direc- 
tor of a worldwide system of computer science 
research labs, I view key opportunities ahead as 
hinging on (1) addressing the often underap- 
preciated bottleneck of translation—moving 
biomedical informatics principles and proto- 
types into real-world practice, and (2) making 
progress on persisting challenges in principles 
and applications of artificial intelligence (AI). I 
am optimistic that we will make progress on 
both fronts and that there will be synergies 
among these advances. 

On challenges of translation, I believe that 
the difficulties of transitioning ideas and imple- 
mentations from academic and industry 
research centers into the open world of medical 
practice have been widely underappreciated. 
Numerous factors are at play, including poor 
understanding of how computing solutions can 
assist with the tasks and day-to-day needs of 
health care practitioners and patients, inade- 
quate appreciation of the needs and difficulties 
of developing site-specific solutions, poor com- 
pute infrastructure, and a constellation of chal- 
lenges with human factors, including entrenched 
patterns of practice and difficulties of integrating
new capabilities and services into existing
clinical workflows. 

Multiple advances coming with the march 
of computer science will help to address chal- 
lenges of translating ideas and methods that 
have been nurtured by biomedical informati- 
cians for decades. At the base level, such 
advances include ongoing leaps in computing 
power and in storage, but also key innovations 
with computing principles and methods in such 
subdisciplines as databases, programming lan- 
guages, security and privacy, human-computer 
interaction, visualization, and sensing and 
ubiquitous computing. 

Faster and more effective translation of 
ideas and methods from biomedical informat- 
ics will also be enabled by jumps in the quality 
of available computing tools and infrastruc- 
ture. Increases in the power and ease-of-use of 
cloud computing platforms are being fueled by 
unprecedented investments in research and 
development by information technology com- 
panies—companies that are competing intense- 
ly with one another for contracts with 
enterprises that are hungry for digital transfor- 
mation and the latest in modern computing 
tools. Cloud computing companies are packag- 
ing in their offerings sets of development tools 
and constellations of specialized services. Many 
of these offerings are relevant to biomedical 
informatics efforts, including machine learning 
toolkits, suites for analysis and visualization of 
data, and computer vision, speech recognition, 
and natural language analysis services made 
available via programmatic interfaces. 

Beyond developing generic platform capa- 
bilities, cloud service providers are motivated to 
gain understandings in key vertical markets, 
such as health care, finance, and defense, and 
have been working to custom-tailor their gen- 
eral platforms with tools, designs, and services 
for use in specific sectors. For example, there is 
incentive to support rising standards on sche- 
mata (e.g., Fast Healthcare Interoperability 
Resources (FHIR)) for storing and transferring 
electronic health records and on methods to 
ensure the privacy of patient data. There is also 
pressure to develop special versions of comput- 
ing services for medicine, such as language 
models and, more generally, natural language
capabilities specialized for medical terminolo- 
gy, enabling more accurate understanding and 
analysis of medical text and speech. Competitor 
cloud providers have also worked to identify 
and provide efficient methods and tools for 
important vertical needs, such as the rising 
importance of determining DNA sequences 
and interpreting protein expression data. Such 
special needs of researchers and clinicians have 
led to the availability of efficient and inexpen- 
sive cloud-computing services for genomic and 
proteomic analyses. 
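
For readers unfamiliar with FHIR, the sketch below shows roughly what a single measurement looks like when expressed as a FHIR-style Observation resource in JSON. It is a simplified illustration under stated assumptions: the helper name, patient identifier, and reading are invented, and a resource intended for a real system would need to be validated against the applicable FHIR profile, which may require additional elements.

```python
import json

def heart_rate_observation(patient_id, beats_per_minute, when_iso):
    """Build a minimal FHIR-style Observation for one heart rate reading."""
    return {
        "resourceType": "Observation",
        "status": "final",
        # LOINC 8867-4 is the code for heart rate.
        "code": {
            "coding": [
                {"system": "http://loinc.org", "code": "8867-4", "display": "Heart rate"}
            ]
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        "effectiveDateTime": when_iso,
        # Units are expressed with UCUM so that receivers can interpret them.
        "valueQuantity": {
            "value": beats_per_minute,
            "unit": "beats/minute",
            "system": "http://unitsofmeasure.org",
            "code": "/min",
        },
    }

# Hypothetical patient identifier and reading.
payload = json.dumps(
    heart_rate_observation("example", 72, "2020-03-01T09:30:00Z"), indent=2
)
```

The value of the standard lies in the shared vocabulary: because the code and unit refer to public terminologies, two institutions can exchange the reading without agreeing on a private format first.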

Moving on to the second realm of opportu- 
nities, around harnessing advances in the con- 
stellation of technologies that we call AI, I 
believe that our community can do more to 
leverage existing methods and also to closely 
follow, push, and contribute to advances in AI 
subfields. Beyond methods available today, key 
developments will be required in principles and 
applications to realize the long-term goals of 
biomedical informatics. I am seeing good prog- 
ress and am optimistic that the advances com- 
ing over the next decade will be deeply enabling. 

On existing technologies, and focusing on 
the example of developing effective decision 
support systems, we have been very slow to 
leverage the visionary ideas proposed by Robert 
Ledley and Lee Lusted in 1959 [1]. Ledley and 
Lusted provided a blueprint for constructing
differential diagnoses and for using decision-theo-
retic analyses to generate recommendations for
action. Biomedical informatics investigators
have been leaders in exploring proto-
types for decision support systems, and systems
constructed over 60 years of research have been 
shown to perform at expert levels. However, 
real-world impact has been limited to date. A 
key bottleneck has been the scarcity and cost of 
expertise and data. I believe that harnessing 
advances in machine learning will be particu- 
larly critical for delivering on the vision of evi- 
dence-based clinical decision making. Machine 
learning techniques available today can and 
should be playing a more central role in health 
care for assisting with pattern recognition, 
diagnosis, and prediction of outcomes. There
are multiple opportunities to build and to integrate
pipelines where data flow via machine
learning to predictions and via automated deci- 
sion analyses to recommendations about test- 
ing and treatment. Making key investments to 
build and refine effective data-to-prediction-to- 
decision pipelines will provide great value in 
multiple areas of medicine [2]. 
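
The last step of such a pipeline, moving from a predicted probability to a recommendation, can be sketched with a simple expected-utility calculation. This is an illustrative toy, not the system described in the cited work: the two actions, the utility values, and the function names are assumptions chosen for the example.

```python
# Hypothetical utilities on a 0-1 scale for each (action, disease present) pair.
UTILITY = {
    ("treat", True): 0.9,      # disease present and treated
    ("treat", False): 0.8,     # unnecessary treatment carries some harm
    ("no_treat", True): 0.3,   # missed disease
    ("no_treat", False): 1.0,  # healthy and left alone
}

def expected_utility(action, p_disease):
    """Average the utility of an action over the predicted disease probability."""
    return (p_disease * UTILITY[(action, True)]
            + (1 - p_disease) * UTILITY[(action, False)])

def recommend(p_disease):
    """Return the action with the highest expected utility."""
    return max(("treat", "no_treat"), key=lambda a: expected_utility(a, p_disease))

def treatment_threshold():
    """Probability above which treating has the higher expected utility."""
    harm = UTILITY[("no_treat", False)] - UTILITY[("treat", False)]
    benefit = UTILITY[("treat", True)] - UTILITY[("no_treat", True)]
    return harm / (harm + benefit)

# p_disease would come from the machine-learned model upstream.
high_risk = recommend(0.40)
low_risk = recommend(0.10)
```

With these invented utilities the threshold is 0.25, so a predicted probability of 0.40 yields a recommendation to treat and 0.10 does not. Changing the utilities changes the recommendation without retraining the predictive model, which is one reason to keep the two stages separate.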

Opportunities ahead for biomedical infor- 
matics include leveraging recent advances in 
deep learning in medical applications, especially 
for image recognition and natural language 
tasks. These multilayered neural network archi- 
tectures are celebrated for providing surprising 
boosts in classification accuracy in multiple 
application areas and for easing engineering 
overhead, as they do not require special feature 
engineering. The methods have been shown to 
perform well for recognition in the image-centric 
areas of pathology and radiology. Different vari- 
ants of deep learning are also being explored for 
building predictive models from clinical data 
drawn from electronic health records. Beyond 
direct applications, deep learning methods have 
led to enhanced capabilities in multiple areas of 
AI with relevance to goals in biomedical infor- 
matics, including key advances in computer 
vision, speech recognition, text summarization, 
and language translation. 

With all of the recent fanfare about deep 
learning, it is easy to overlook the applicability 
of other machine learning methods, including 
probabilistic graphical models, generalized 
additive models, and even logistic regression for 
serving as the heart of predictions in recom- 
mendation engines. While excitement about 
deep learning is appropriate, it is important to 
note that the methods typically require large 
amounts of data of the right form and that such 
datasets may not be available for medical appli- 
cations of interest. Other approaches have 
proven to be as accurate for clinical applications 
and provide other benefits such as providing 
more intelligible, explainable inferences. Also, 
when sufficiently large corpora of data labeled 
with ground truth are not available, knowledge 
acquisition techniques, referred to broadly as 
machine teaching, can provide value. While work 
is moving forward on machine teaching, existing
methods and tools can be valuable in build-
ing models for prediction and classification. 

I believe that it is important to note that hav- 
ing access to powerful machine learning proce- 
dures may be insufficient for addressing goals in 
biomedical informatics. Key challenges for mov- 
ing ahead with developing and deploying effective 
decision support systems include identifying 
where and when such systems would provide val- 
ue, collecting sufficient amounts of the right kind 
of data for applications, developing and integrat- 
ing automated decision analyses to move from 
predictions to recommendations for action [2], 
maintaining systems over time, developing means 
to build and apply learned models at multiple 
sites, and addressing human factors, including
formulating means for achieving smooth integra- 
tion of inferences and recommendations into 
clinical workflows, and providing explanations of 
inferences to clinicians [3]. Providing explanations 
of predictions generated by machine-learned 
models is a topic of rising interest [4]. I hope to 
see revitalized interest and similar enthusiasm 
extended to addressing challenges identified in 
biomedical informatics with the intelligibility 
and explanation of the advice provided by other 
forms of reasoning employed in decision support 
systems, including logical, probabilistic, and deci- 
sion-theoretic inference [5]. 

Key opportunities in AI research for prog- 
ress with developing and fielding effective 
decision support systems include efforts in 
principles and applications of transfer learn- 
ing, unsupervised learning, and causal inference. 
Transfer learning refers to methods that allow 
for data or task competencies learned in one 
area to be applied to another [6]. Unsupervised 
and semi-supervised learning refers to meth- 
ods that can be used to build models and per- 
form tasks without having a complete set of 
labeled data, such as labels about the final 
diagnoses of patients when working with elec- 
tronic health records data. Causal inference 
refers to methods that can be used to identify 
causal knowledge, versus statistical associa- 
tions that are commonly inferred from data. 
Advances in these areas promise to provide 
new sources of biomedical knowledge, and to
address the challenge of data scarcity and 
related difficulties with the generalizability of 
data resources for health care applications. 

On data scarcity and generalizability, an 
important, often underappreciated challenge 
in biomedical informatics is that the accuracy 
of diagnosis and decision support may not 
transfer well across institutions. In our work 
at Microsoft Research, we found that accura- 
cies of a system trained on data obtained from 
a site can plummet when used at another loca- 
tion. The poor generality of datasets is based 
on multiple factors, including differences in 
patient populations—with site-specific inci- 
dence rates, covariates, and presentations of 
illness, site-specific capture of evidence in the 
electronic health record, and site-specific defi- 
nitions of signs, symptoms, and lab results. As 
an example, we found site-specificity when my 
team studied the task of building models to 
predict the likelihood that patients being dis- 
charged from a hospital would be readmitted 
within 30 days. The accuracy of prediction for 
a model learned from a massive dataset drawn 
from a single large urban hospital dropped
when the model was applied at other hospitals. 
This observation of poor generalizability was 
behind our decision to develop a capability for 
performing automated, recurrent machine 
learning separately at each site that would rely 
on local data for predictions. This local train- 
and-test capability served as the core engine of 
an advisory system for readmissions manage- 
ment, named Readmission Manager, that was 
commercialized by Microsoft. 
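
The effect of site-specific definitions can be reproduced with a deliberately tiny example. The sketch below is not the method used in Readmission Manager; it assumes a single hypothetical risk score that two sites record on different scales, and a "model" that is nothing more than a cut-off learned from training data. All data are invented.

```python
def accuracy(threshold, rows):
    """Fraction of rows where 'score >= threshold' matches the readmission label."""
    correct = sum((score >= threshold) == bool(label) for score, label in rows)
    return correct / len(rows)

def fit_threshold(rows):
    """'Train' by picking the cut-off with the best accuracy on the given rows."""
    candidates = sorted({score for score, _ in rows})
    return max(candidates, key=lambda t: accuracy(t, rows))

# (risk score, readmitted within 30 days); site B records the score on a shifted scale.
SITE_A = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (6, 1)]
SITE_B = [(4, 0), (5, 0), (6, 0), (7, 1), (8, 1), (9, 1)]

model_a = fit_threshold(SITE_A)            # learned at site A
transferred = accuracy(model_a, SITE_B)    # site A's model applied at site B
local = accuracy(fit_threshold(SITE_B), SITE_B)  # retrained on site B's own data
```

The cut-off learned at site A flags every patient at site B, so accuracy falls to chance, while a cut-off learned locally separates the two groups. Real models and real differences between sites are far more subtle, but the remedy sketched here, retraining on local data, is the one the text describes.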

Moving forward, research on a set of meth- 
ods jointly referred to as transfer learning may 
help to address challenges of data scarcity and 
generalizability. Transfer learning algorithms 
for mapping the learnings from one hospital to 
another show promise in medicine [6]. Such 
methods include multitask learning. Also, 
obtaining spanning datasets, composed of 
large amounts of data drawn from multiple 
sites, may provide effective generalization. In 
support of this approach, methods called mul- 
tiparty computation have been developed that 
can enable learning from multiple, privately
held databases, where there is no violation of
privacy among the contributing organizations.
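As a rough illustration of the idea behind multiparty computation, the sketch below simulates additive secret sharing, one common building block, inside a single Python process. Everything here is an assumption made for illustration: the modulus, the function names `make_shares` and `secure_sum`, and the hospital counts. It assumes that all participants follow the protocol honestly, and it is not a production protocol. Each site splits its private count into random-looking shares, so that only the total across sites is revealed.

```python
import secrets
from typing import List

# Modulus for share arithmetic; an arbitrary large prime chosen for this sketch.
PRIME = 2**61 - 1


def make_shares(value: int, n_parties: int) -> List[int]:
    """Split value into n additive shares that sum to value modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(private_values: List[int]) -> int:
    """Simulate parties computing the sum of their values without revealing them."""
    n = len(private_values)
    # Party i splits its value and would send share j to party j.
    all_shares = [make_shares(value, n) for value in private_values]
    # Party j adds the shares it received and publishes only that partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # The published partial sums combine to the true total.
    return sum(partial_sums) % PRIME


# Hypothetical per-hospital counts of some event of interest.
hospital_counts = [120, 45, 310]
total = secure_sum(hospital_counts)
```

In a real deployment the shares would travel over a network between organizations, and the result would be correct only while the true total stays below the modulus.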

Beyond the daily practice of health care, 
and uses in such applications as diagnosis and 
treatment, methods for learning and reasoning 
from data can provide the foundations for new 
directions in the clinical sciences via tools and 
analyses that identify subtle but important sig- 
nals in the fusing of clinical, behavioral, envi- 
ronmental, genetic, and epigenetic data. I see 
many directions springing from applications of 
machine learning, reasoning, planning, and 
causal inference for health care delivery as well 
as in supporting efforts in health care policy 
and in the discovery of new biomedical under- 
standings. 

I remain excited about advances in biomed- 
ical informatics and see a biomedical informat- 
ics revolution on the horizon. Such a revolution 
will build on the glowing embers of decades of 
contributions and the flames of late-breaking 
activities that address long-term challenges and 
bottlenecks. 
Box 30.3 The Future of Nursing Informatics
Judy Murphy 


The focus of this commentary is on the future 
of biomedical informatics from a nursing per- 
spective, but it is helpful to understand the 
background and history of nursing’s role in 
the field. Starting there, the focus will move to 
looking at nursing informatics today and then 
looking to the future of the field from a nurs- 
ing point of view. 

Nurses have contributed to the purchase, 
design, and implementation of health infor- 
mation technology (IT) since the 1970s. The 
term “nursing informatics” (NI) first appeared 
in the literature in the 1980s [1-3]. The defini- 
tion of NI has evolved ever since, molded by 
maturation of the field and influenced by 
health policy. In a classic article that described 
its domain, NI was defined as the combination 
of nursing, information, and computer sci- 
ences to manage and process data into infor- 
mation and knowledge for use in nursing 
practice [4]. Nurses who worked in NI during 
that time were pioneers who often got into 
informatics practice because they were good 
clinicians, were involved in IT projects as edu- 
cators or project team members or were just 
technically curious and willing to try new 
things. Their roles, titles, and responsibilities 
varied greatly. 

A solid foundation for the NI profession 
continued to be laid over the ensuing 40 years. 
Today, informatics has been built into under- 
graduate nursing education and there are over a 
hundred schools offering post-graduate NI 
education. NI is recognized as a specialty by 
the American Nurses Association (ANA) and
has a specialty certification [5]. NI is now 
described as the specialty that integrates nurs- 
ing science with multiple information and ana- 
lytical sciences to identify, define, manage, and 
communicate data, information, knowledge, 
and wisdom in nursing practice. NI supports 
nurses, consumers, patients, the interprofes- 
sional healthcare team, and other stakeholders 
in their decision-making in all roles and in all 
settings to achieve desired health and healthcare
outcomes. This support is accomplished
using information structures, information pro- 
cesses, and information technology [6]. 

NI continues to grow. In the most recent 
Healthcare Information and Management Systems
Society (HIMSS) NI Workforce Survey, 57% of 
respondents held a post-graduate degree in 
nursing or nursing informatics and 44% were 
specialty certified by ANA in NI or other nurs- 
ing specialty. Another 32% were currently pur- 
suing NI certification, and over half have been 
working in an informatics role for more than 7 
years [7]. 

Since the HITECH Act of 2009, nursing 
informatics specialists have played a pivotal 
role in influencing the adoption of electronic 
health records (EHR) for meaningful use. 
Having the breadth and depth of healthcare 
knowledge and understanding clinical practice 
workflows, nurses help all clinicians understand 
the application and value of the EHR. Nurses
bring a perspective that spans the many venues
of care, gained from working with all care team
members and with patients at different points
in their care continuum. Nurses help the patient
utilize health IT to improve engagement in 
their own care, take control of their own health 
and become an integral part of the decision- 
making process and care team. As patient 
advocates, nurses understand the power of the 
patient in a participatory role and how this can 
improve outcomes. 

The type and quality of care that nurses 
provide to their patients will benefit immensely 
from the continued advancement of technology 
and informatics in healthcare. Although there 
are many ways those advancements will impact 
nursing, here are two areas that hold the great- 
est promise for nursing’s future. 

Data and the Continuous Learning Health 
System: Nursing research has not been as pro- 
lific as medical research, so there is a lot less 
known about the true impact/outcomes of 
nursing interventions. But now that organizations
are aggregating health data electronically
in an EHR and other Health IT, nurses can
more easily identify practices that measurably 
impact individuals by mining the data and
using prescriptive, predictive and cognitive 
analytics to correlate actions to improved out- 
comes. The collection, summarization and 
analysis of data can be from multiple venues 
and sources, including social determinants 
and patient-generated information for person- 
alization. Then, it’s not just about impacting 
traditional care, but about the impact across 
the continuum for the individual and includ- 
ing public health and population health man- 
agement. The learnings can be iterated back 
into nursing practice in months instead of 
years, using protocols/guidelines, documenta- 
tion templates, and clinical decision support — 
making it easier to do the right thing and 
‘hard-wiring’ new best practices — thus, creat- 
ing a continuous learning health system. 

Care Coordination and Healthcare Anywhere: 
The advancement of technology has provided us 
the opportunity to provide care anytime/any- 
where and there’s little question that both 
patients and providers are increasingly drawn to 
the concept of healthcare services that are vir- 
tual. This includes “visits” using communication 
technologies such as email, phone and videocon- 
ference, as well as telehealth technologies for 
remote monitoring and management of condi- 
tions or chronic disease. Coupling this with 
engaged patients using portals and mobile apps 
creates a new ecosystem for nurses and their 
patients to interact. Care coordination between 
venues of care and across the continuum will be 
directly impacted in a positive way. As nurses 
have primary responsibility for coordinating 
care and helping patients navigate the complexi- 
ties of the healthcare system, this will be a way 
for them to extend their reach to more patients 
and to improve the quality of the care provided 
to each patient. Nurses can more easily close
care gaps for preventive and disease management
services, monitor patients’ conditions while
they live their lives and not just when they visit a
healthcare facility, and provide consulting and
educational services.

The future of nursing informatics has no 
bounds; technologies of all kinds will continue 
to evolve, and informatics will help nurses both 
integrate new technologies into their practice and
manage the impact of new technologies
on that practice. Informatics will help invent 
the future of nursing care transformation. 
Box 30.4 Biomedical Informatics: The
Future of the Field from a Health Policy 
Perspective 

David Blumenthal 


Policy issues and developments in the United 
States will be vital to the evolution and effica- 
cy of health information technology (HIT) in 
the future. This is true because health policy 
has made HIT a mainstream feature of the 
U.S. health care system and a vital tool for 
improving it. 

Two types of health policy issues will vital- 
ly affect the future of HIT, its uses and its ben- 
efits. The first type is generic to the U.S. health 
care system but will indirectly affect how HIT 
evolves. The second type of policy issue focus- 
es particularly on HIT. 

Generic policy issues include payment 
reform and the push toward consumer empow- 
erment. There is an urgent need for payment 
reform to address issues such as the high costs 
associated with the U.S. health system. HIT 
has the potential to be a powerful tool in 
health system improvement but whether that 
potential is exploited will depend on the needs 
and priorities of its users, especially health 
care providers. In a fee-for-service environ- 
ment, where volume and revenue maximiza- 
tion are prioritized, purchasers of HIT will 
demand that it serve these purposes. The 
requirement to capture detailed information 
for billing purposes will be paramount to the 
design and configuration of electronic health 
records (EHRs) and other IT. Information 
systems will be used to assure that providers 
capture every billable service in a way that 
maximizes revenue collected. 

Payment approaches that prioritize value 
will favor different HIT configurations, espe- 
cially if those payment methods hold providers 
accountable through risk-sharing for the cost 
and quality of services. HIT will have to facili- 
tate the capture and reporting of quality and 
cost information for the purpose of demonstrating
the value of services provided and to
manage resource use continuously over a 
reporting period. Interoperability and exchange 
of health care data will become a business 
imperative to the extent that accountable pro- 
viders must absorb the costs of services pro- 
vided to their patients at other health care 
facilities in their communities. 

HIT for value maximization will also put 
much greater emphasis on improving clinical 
decisions so as to enhance the value of services 
performed. In a value-oriented environment, 
usable and helpful decision support will achieve 
a priority it has never had in the current fee-for- 
service environment. Another priority will like- 
ly be the capability to assess the comparative 
performance of clinicians within organizations 
so as to evaluate reasons for variation in 
decision-making and health care outcomes. 

A bipartisan interest in making health care 
markets more competitive and responsive to 
patients’ needs is also motivating a push 
toward patient empowerment through sharing 
electronic data with patients and their fami- 
lies. This movement is reflected in legislation 
and regulations that encourage providers to 
share EHR data with individuals or their des- 
ignated third parties. The Office of the 
National Coordinator for Health Information 
Technology (ONC) issued a rule in 2015 that 
requires certified EHRs to have standardized 
application programming interfaces (APIs) 
[1], which will facilitate access to EHR data by 
patients and their agents. A new ONC rule, 
proposed in the spring of 2018 [2], would also 
discourage so-called information blocking.

The growing interest in data-sharing with 
patients is also apparent in Apple’s decision 
to work with 13 prominent health systems [3] 
to accept their patients’ EHR data. Large, 
innovative technology companies like Apple 
may be able to support patient empowerment 
by fashioning user-friendly applications that 
use patients’ data to inform their decision- 
making. 
The emergence of such applications will
raise a host of policy issues. Finding ways to 
assure the safety of these consumer-facing 
applications will be a critical part of consumer 
empowerment, and constitutes a key policy 
agenda. To this end, the Food and Drug 
Administration (FDA) is making an effort to 
adjust its traditional regulatory approaches for 
the special circumstances of HIT applications. 

One example of their efforts is the Accelerated 
Digital Clinical Ecosystem (ADviCE), a partner- 
ship between the University of California, San 
Francisco, several other universities and health 
systems, and the FDA to share best practices and 
data for using, integrating, and deploying health 
technology services and applications. ADviCE 
will make recommendations on the types of data 
needed, data sharing, transparency, and use. 

Policymakers must also find ways to protect 
privacy of patients, either through enforceable 
voluntary standards or governmental regula- 
tion of emerging private organizations, like 
Apple, that play the role of data stewards. 

Some HIT-specific policy issues are also
likely to influence the future development of 
health information technology. On this front, 
the increased use of EHRs has also given rise 
to safety challenges, as enumerated in a recent 
report from the Pew Charitable Trusts [4]. For 
example, patients may receive the incorrect 
dose of a medication or clinicians may select 
the wrong person when inputting an order. 
These safety issues are probably linked with 
the usability of EHRs, and suggest the need 
for improved user-centered design focused on 
the needs of both clinicians and patients. The
EHR certification process will likely play a
role in pursuing improved safety of patient
data.

To address these and other health IT safety 
concerns, multiple experts have proposed the 
establishment of a safety collaborative com- 
posed of EHR developers, hospitals, govern- 
ment, health practitioners, and other key 
organizations to work together to resolve safety 
problems. 

Finally, policy interventions may be required 
to improve equity of access to benefits of HIT in 
rural areas and for underserved populations. 
Lack of connectivity and sophisticated technical 
support can handicap rural providers in their 
efforts to use advanced HIT. 

With the increasing power of HIT in health 
care will come increased reliance on its capa- 
bilities for responding to policy challenges, 
both general and HIT-specific. For the most 
part, these challenges will stimulate evolution 
in HIT design that makes it even more useful 
and important for the future of our health care 
system, and its patients and providers. 

1. https://www.healthit.gov/sites/default/files/facas/HITSC_Onc_2015_edition
final_rule_presentation_2015-11-03.pdf

2. https://www.reginfo.gov/public/do/eAgendaViewRule?pubId=201804&RIN=0955-AA01

3. https://hbr.org/2018/03/apples-pact-with-13-health-care-systems-might-actually-disrupt-the-industry

4. https://www.pewtrusts.org/en/research-and-analysis/reports/2017/12/improving-patient-care-through-safe-health-it
Box 30.5 Future Perspective
Mark Frisse 


As this textbook demonstrates, biomedical 
informatics paradigms are changing. Original 
paradigms of necessity were moored by an envi- 
ronment where data sets were small; data stor- 
age was limited; computation required massive 
and costly hardware; and high-bandwidth net- 
work connections were rare. Most major bio- 
medical research was conducted in large 
laboratories and, with a few exceptions, compu- 
tational needs were limited. Health care delivery 
and clinical research generally took place in 
hospitals and large clinics both affiliated with 
medical schools and endowed with talent, reve- 
nues, and capital necessary for their successful 
operation. Payment models and reimbursement 
for health care operations took place behind the 
scenes without excessive complexity and with 
few burdens on providers. Public health work- 
ers, health policy researchers and related groups 
had access only to selective, retrospective, and 
often manually-collected data and limited ana- 
lytic capabilities. Informatics was a select, 
expensive, and time-consuming endeavor. 
Despite great challenges, remarkable feats were 
accomplished. 

Recent paradigms are untethered from 
many early constraints. Today, data sets are 
massive and plentiful; data storage is inexpen- 
sive and seems unlimited; computation is ubiq- 
uitous and extends from minute sensor devices 
to massive cloud-based virtual machines; high- 
bandwidth network connections are pervasive 
and central to American life. The range of bio- 
medical research activities is far broader and is 
constrained more by funding and talent limita- 
tions than by facilities; and computation is not 
only central to traditional research approaches 
but has extended the reach of scientific investi- 
gation dramatically through the analysis of 
data sets ranging from molecules to genomes. 
Largely because of Internet-based services, 
more providers and other care givers have 
access to information they need. Public health 
workers, health policy researchers and other 
interest groups can access large and broad data 
sets collected in near real-time. Patients and
their families can also access much more information
and have become truly central to
health care; patients are speaking up, and our 
health system is listening. 

Other academic disciplines, once working 
on the periphery of biomedical informatics, are 
converging and taking center stage. Social scientists
explore care complexity both in delivery
settings and in the home. For example, cognitive
scientists seek more effective and efficient
ways of managing care tasks. Operations 
research professionals seek to improve patient 
access, scheduling, workflow analysis, capacity 
management, throughput, and systems science. 
Behavioral psychologists are studying how 
mobile technologies can “nudge” patients and 
providers into better behaviors. As a result, 
informatics has become even more imaginative, 
extensive, rigorous, broad, accessible, and inex- 
pensive. 

The accomplishments have been many, the 
future seems bright, and the potential for soci- 
etal good is promising. But, to paraphrase nov- 
elist William Gibson, this bright future is not 
now nor will it quickly become evenly distrib- 
uted. Both in biomedicine and in society at 
large, new paradigms and technologies that have
transformed commerce, interpersonal communication,
social interactions, and behaviors have
upended almost every aspect of society. By inte- 
grating and analyzing the multiple data streams 
emerging from our personal behavior, commu- 
nication, reading habits, purchasing patterns, 
and social interactions, data and algorithms are 
capable of startlingly accurate predictions that 
in turn can profoundly influence behavior. The 
velocity of these changes carries biomedical 
informatics — and all of society — into an uncer- 
tain future full of promise and peril. 

Consider the American healthcare system. 
The United States has the highest per capita 
expenditures for health care in the world, yet, by 
many measures, important health care quality 
measures lag far behind those of other countries
[1]. Despite significant advances in technology 
and clinical informatics, this trend continues. 
This may be due in part to the fact that technologies
are capable not only of reducing complexity but
also of introducing additional complexity,
whether such complexity is warranted or
not. One cannot argue against effective informatics
support for prescribing decisions; biology
and the clinical condition warrant extreme detail 
to complexity. Similarly, knowledge of total and 
out-of-pocket drug costs would be helpful if 
patients were presented with choices, but it is dif- 
ficult to rationalize the hundreds (if not thou- 
sands) of different formularies imposed by 
health plans. One can argue that effective mea- 
surement of outcomes and care metrics is essen- 
tial for demonstrably increasing quality of care, 
but the value of many quality metrics is uncertain
and the administrative burdens imposed on
clinicians who must collect these data border on
the intolerable, often coming at the expense of
patient interaction. As the economist Uwe
Reinhardt wrote: “I have been at many conferences
at which concerned clinicians explore so-called
‘evidence-based medicine,’ replete with
‘evidence-based best-clinical-practice guidelines’
and the associated ‘clinical pathways.’ I cannot
recall a conference on the topic of ‘evidence-based
best administrative practices,’ (although I
may have missed it.)” [2].

Consider the future role of the traditional 
institution-centric electronic health record. 
Federal incentives greatly accelerated the intro- 
duction of EHRs into hospitals and clinics and 
made transactions like e-Prescribing routine. 
Data and communications standards allow com- 
munication across different clinical systems and 
expand capabilities for medication management, 
care coordination and other clinical activities 
outside of hospitals and clinics. Common EHR 
data elements and organizational data ware- 
houses are simplifying secondary data use for 
quality reporting, administration, population 
health, research, and other uses. Web portals, 
mobile communications, and patient-accessible 
EHRs are engaging patients and their families to 
a greater degree. But this rapid introduction of 
EHRs has been a mixed blessing. Critics claim
that EHRs focus on administrative and payment
functions at the expense of providing the cognitive
support patients and clinicians desperately need.
EHRs cannot simply continue their current approach. To
improve clinician morale and productivity, the 
urge to introduce even further unnecessary 
administrative burdens on care providers must 
be resisted. 

Given the many turbulent transformations 
in care delivery methods, care delivery organiza- 
tions, and patient-centered health technologies, 
many clinical informatics advances will be
realized through extension of traditional EHRs,
and still others will be the product of experimentation
with clinical technologies that address
immediate consumer-directed needs and view
EHR connectivity as a secondary objective. Since
both models will be introduced, evaluated, and 
adopted, one must understand how informatics 
can influence the evolution of many different 
types of clinical systems. 

The ascendancy of data science has been a 
central theme of biomedical informatics. Broadly 
construed, these activities expand fundamental 
biomedical informatics activities through the 
introduction of new technologies and techniques. 
Findings emerging from increasingly interopera- 
ble clinical databases like i2b2, OMOP, and 
PCORNet further stimulate essential large-scale, 
collaborative data standardization and ontology 
development. These in turn will simplify the 
inclusion of a broader array of personal, environ- 
mental, and biologic computable knowledge 
structures. Machine learning and related disci- 
plines arising from these activities foster discovery 
of previously unknown medication interactions, 
genetic propensities, behavioral risks, predictions, 
and actionable care interventions. 

Social networks and other forms of infor- 
mal communication are having similar impacts. 
In principle, these networks can gather isolated 
individuals sharing common concerns and can 
reinforce positive behaviors and combat imped- 
iments to health — social isolation, misinforma- 
tion, and costs. Some forms of “digital group 
therapy” or “group telemedicine” may be par- 
ticularly well-suited in these circumstances. 

A dazzling array of new technologies must 
also be understood and, when appropriate, introduced
into clinical research and care delivery.
The collection, integration, and analysis of new 
data streams produced by these devices are 
already being used to manage diet, weight, exer- 
cise, and even cardiac rhythm problems. 
Untethered from traditional EHRs, these prod- 
ucts are producing new and valuable sources of 
ambiently-collected data at lower costs. 

Speech and gesture recognition will simplify 
human-computer interaction. Ambient data col- 
lection methods simplify collection of routine 
data and provide additional context for docu- 
mentation and interpretation. Clinician-computer 
interactions may be unobtrusive and allow great- 
er focus on patients rather than computer screens. 
Ambient data collection, including video interpretation
of clinician-patient interactions, may
be used to more completely summarize the clinical
encounter. Image recognition technologies
can diagnose skin disorders and interpret radiographs and
some other medical images. Machine learning
algorithms will reliably screen for abnormalities 
and complement human judgement. 

We cannot fully control how innovations
will be adopted, nor can we predict their
societal impact. Informatics — and innovation 
more broadly — is a two-edged sword. 

For example, clinical systems have improved 
care, reduced costs, and contributed to new 
insights through translational informatics and 
data science. At the same time, they have added 
considerably to administrative burdens and cost, 
and in practice, may emphasize administrative 
tasks over the critical cognitive work that is the 
foundation of clinical medicine. At the clinical 
and policy level, efforts to simplify programs 
and processes become even more important. 

Similarly, social networks and telemedicine 
allow previously isolated individuals to rein- 
force possibly socially objectionable attitudes 
or behaviors. But these same networks can rap- 
idly distribute and reinforce exaggerated or 
false claims about the efficacy of vaccinations, 
treatments, and scientific evidence; these practices
challenge society’s very idea of a common
truth.

Advances in data science and analytics, 
when combined with sensors and devices on the 
person, in the home or in public spaces raise 
fears that “someone/something is always 
watching.” If data are aggregated and used by 
an unauthorized “data-industrial complex” 
working outside of socially acceptable norms, 
privacy rights are threatened. Better means of 
anonymizing data and more realistic privacy 
and data use policies will become even more 
important. 

Although paradigms change, an emphasis 
on data, information, knowledge, and effective 
use remains foundational. A primary responsibility
of biomedical informatics is to ensure that
everything from data generation to knowledge 
generation is continually improving through 
greater consistency and efficiency. These 
improvements in turn should result in systems 
that more effectively address real needs and not 
merely automate flawed behaviors or practices. 
Our future depends on the extent to which we 
can introduce efficient means of presenting 
needed, reliable, and consistent information and 
the extent to which our efforts ensure better out- 
comes for individuals and society. To be effec- 
tive, informatics professionals proceed based on 
their experience, knowledge, and values. They 
must, in other words, practice wisdom. 

1. Schneider, E. C., Sarnak, D. O., Squires, D.,
Shah, A., & Doty, M. M. (2017). Mirror, mirror
2017: International comparison reflects
flaws and opportunities for better U.S. Health
Care. Commonwealth Fund. http://www.commonwealthfund.org/interactives/2017/July/mirror-mirror/.

2. Reinhardt, U. E. (2013, September 13).
Waste vs value in American Health Care.
New York Times. https://economix.blogs.nytimes.com/2013/09/13/waste-vs-value-in-american-health-care/.
Box 30.6 The Future of Health IT:
A Clinical Perspective 
Robert M. Wachter 


About a decade ago, I hired a young clinical
informaticist for a faculty position at UCSF. I
told him he had an incredibly bright future,
since we would soon implement a well-respected
vendor-built electronic health record (EHR).
I was confident that this would be exciting and
important work, work that would keep him
fully employed for years to come.

I didn’t share with him my worry: what
would his job be after the EHR was installed?

Needless to say, despite the fact that our
EHR has been up and running for 6 years, he
remains gainfully employed. In fact, he is busier
than ever. His experience taught me something
I did not understand at the time: the
implementation of the EHR is merely the first 
step in the process of extracting value from 
healthcare digitization. 

In fact, I have come to see the process of 
digitization as involving four steps: 

1. Digitizing the record 

2. Connecting all the digital parts (“interoper- 
ability”) 

3. Gaining insights from the digital data now 
being generated by and traversing the sys- 
tem 

4. Taking advantage of digitization to build 
and/or implement new tools and approach- 
es that deliver healthcare value (improving 
quality, safety, patient experience, access, 
and equity while also lowering costs and 
improving efficiency and productivity) 


In the United States, the $30 billion of 
incentive payments distributed by the govern- 
ment under the HITECH Act from 2010-2014 
succeeded in achieving the first step — nearly all 
hospitals and 90% of physician offices now use 
an EHR. While we see sporadic examples of 
activities under Steps 2, 3, and even 4, they are 
by far the exception. 

As we look beyond the present, let’s fantasize
about a future world in which we have substantially
accomplished all four steps. What might our
healthcare system look like? The answer is that
the experience of being both patient and healthcare
professional will be far more satisfying.

Let’s turn first to the hospital. Much of the 
care that we currently think of as requiring hos- 
pitalization will undoubtedly be accomplished 
within less expensive settings (including the 
patient’s home), aided by a variety of technolo- 
gies ranging from clinical sensors to advanced 
audio and video capabilities. The hospital will 
mostly exist to care for very sick patients — the 
types we might today associate with being in 
the ICU. And the ICU will likely no longer be a 
walled off physical space. Rather, every hospi- 
tal bed will be modular, capable of supporting 
ICU level care with the push of a few buttons. 

Decision-making about who needs higher 
levels of care will not be left to the clinician’s 
“eyeball test.” Instead, clinicians’ experience 
will be augmented by sophisticated AI-based 
prediction tools constantly humming in the 
background, alerting doctors and nurses that, 
say, a patient’s probability of death just spiked 
up and thus she bears closer watching. Of 
course, taking advantage of all these AI- 
generated predictions will require cracking the 
tough nut of alert fatigue. This will be accom- 
plished by markedly decreasing false positive 
rates, implementing advanced data visualiza- 
tion and other prioritization methods, and like- 
ly through the discovery of approaches that 
haven’t yet been invented. 

Patient rooms will have large video screens 
and sophisticated camera and audio equipment 
to allow for tele-visits. Patients and families will 
be able to review clinicians’ notes, test results, 
and treatment recommendations, either on the 
big screen or on their hospital-issued tablet 
computer. Patients will not only have full access 
to their EHR but will also receive educational 
materials (“here’s what to expect from your 
MRI tonight”) and motivation (“Good job on 
your incentive spirometer today!”) — provided 
by the technology. The dreaded nurse call but- 
ton will be replaced by a voice-activated system 
in which a patient’s request results in a nurse 
appearing on screen and even taking some 
actions (increasing the IV flow rate or adjusting
the bed, for example) remotely. If a new pill is
needed, likely as not a robot will deliver it. 

When the hospital doctor comes to visit the 
patient, the room’s telemedicine capabilities 
will allow additional parties to participate. For 
example, a palliative care discussion can involve 
distant family members, the inpatient palliative 
care team, and a physician at an outside hos- 
pice. An infectious disease consult might 
involve a discussion between the patient, the 
hospitalist, and the ID consultant in real time, 
rather than the serial visits and imperfect com-
munication through chart notes that mark
current practice.

Speaking of notes, in both inpatient and 
outpatient settings physicians will no longer 
spend hours typing notes into the EHR. Rather, 
natural language processing technology will 
“listen” to the doctor-patient conversation via 
room-based microphones and create a useful 
note, improving itself over time as it learns each 
physician’s individual practice style and patient
population (“digital scribes”). Documentation
will increasingly become the byproduct of the
doctor-patient encounter, not a central focus
of the physician’s attention.

On the other hand, clinicians will glean far 
more useful information and insights from 
their digital tools, including the EHR. As data 
are entered into the patient’s chart, the EHR 
will suggest possible diagnoses and testing 
approaches, and guidelines and recommended 
treatment approaches will be a click or a voice 
command away. In essence, the EHR and the 
electronic textbook will merge into one inte- 
grated tool. 

Turning to the outpatient arena, much of 
the care that currently requires in-person visits 
will be conducted via IT-enabled home care 
and televisits. The care of patients with chronic 
diseases will be utterly transformed, with a far 
greater emphasis on real-time, home-based, 
technology-enabled decision support and dis- 
ease management. The heart failure patient will 
begin his day by weighing himself on a digital 
scale and answering a few questions on the 
computer (“How is your breathing? How did 
you sleep?”). It might even know how much salt
he used in the past day (through the “Internet
of Things”). The technology will integrate this 
information, along with streaming data on 
heart rate and blood pressure drawn from wear- 
able or stick-on sensors, to offer recommenda- 
tions for drug and activity adjustments. Ditto 
for patients with diabetes, emphysema, asthma, 
and the like. 

Making all of this function will require a 
new workflow, and with it a new set of health- 
care professionals. Sometimes called “care traffic 
controllers,” they will be clinicians (likely nurses 
with advanced training in population health 
and some informatics) who will monitor, via 
advanced digital dashboards, the status of 100
or 1,000 such patients, contacting and coaching
the ones who seem to be having problems. The
initial contacts may be generated by AI-driven
algorithms and delivered by the technology, but 
the care traffic controller will intervene if the 
patient continues to have problems. For patients 
continuing to do poorly, a physician will become 
engaged. Even then, many of these encounters 
will be IT-enabled remote ones. 

For patients with acute medical issues, 
much of the care will be delivered by apps, 
which will also offer AI-derived recommenda- 
tions for simple diagnoses and interventions. 
Patients who require higher levels of care will 
see a clinician through telemedicine or commu- 
nity-based urgent care. Urgent care clinics will 
be conveniently placed in supermarkets and 
pharmacies, and our eventual success in achiev- 
ing complete interoperability — the patient’s 
record always available via the cloud — will 
enhance the ability to view the relevant parts of 
the EHR and to record data that then becomes 
available to all subsequent practitioners. 

The promise of precision medicine will 
finally be realized. For example, the guidelines 
for treating a 50-year-old woman with high
blood pressure or elevated cholesterol will 
become far more complex and customized, 
considering a variety of patient- and popula- 
tion-based risk factors and large amounts of 
genetic information. This same complexity also 
means that the clinician will depend on the 
computer to “know” all of these variables and
suggest the best approach. Rather than remem-
bering the correct approach to hypercholester- 
emia in middle-aged women (there will no 
longer be any one correct approach), the role of 
the clinician will become more about interpret- 
ing the computer’s output (including interven- 
ing when it seems wrong), communicating the 
findings to the patient, and motivating the nec- 
essary behavioral change. Of course, this 
changed role will require a significant evolution 
in medical education. 

In fact, the ability to analyze vast amounts 
of digital data will transform all clinical 
research. Rather than basing most of our treat- 
ment recommendations on small randomized 
clinical trials, many advances will come through 
analyses of actual clinical data, seeing which 
approaches are associated with better out- 
comes. Of course, this will require sophisticated 
adjustment for confounders, which should also 
be facilitated by the vast amounts of fully inte- 
grated digital clinical information. Individual 
healthcare systems will take advantage of these 
data as well, transforming themselves into so- 
called “learning healthcare systems” by mining 
their own data and experience to determine 
which approaches lead to the best outcomes. 

The vision that I’ve described here is not 
around the corner — it is likely 10-15 years 
away. And achieving it will require not just the 
technology of a few large EHR vendors, but 


also the contribution of companies, large and 
small, some built expressly to solve specific
healthcare problems, others digital giants (the 
Apples, Googles, and Amazons of the world) 
taking advantage of their capabilities in areas 
like app development, supply chain, and AI to 
attack healthcare problems. 

Importantly, in such a multidimensional 
digital world, success cannot come simply by 
buying pieces of technology, peeling off the 
bubble wrap, and dropping them into health- 
care systems and workflows. It will be up to 
clinical informaticists to deeply understand the 
needs of patients, clinicians, families, and 
administrators, the complexities of the tech- 
nologies, and the economic, regulatory, privacy, 
and often ethical context. Informatics profes- 
sionals will be the ones making the clinical and 
business case for change, and working with 
both vendors and clinicians to ensure that these 
new approaches and technologies actually 
achieve their aims. 

This is why the job of the clinician infor- 
maticist will remain highly secure for the fore- 
seeable future. While the job of the informaticist 
will no longer be to implement a core enterprise 
EHR, he or she will be doing something more 
complex and likely more valuable: reimagining 
the work and the workflow to take advantage 
of evolving digital capabilities to improve 
healthcare value.

Box 30.7 The Future of Biomedical
Informatics from the Federal Govern- 
ment Perspective 

Patricia Flatley Brennan 


Advances in biomedical informatics, including 
computational bioinformatics, are essential to 
accelerating scientific discovery and assuring 
the health of society. Finally, after 40 years of 
promise, there is sufficient data and computing 
power to realize the visions of early biomedical 
informatics leaders that data-powered health 
could become a reality. Decades of slow but 
steady progress towards formalizing biomedi- 
cal knowledge through effective use of lan- 
guage and messaging standards is now 
complemented by improvement in heuristics 
and algorithms that can translate those formal- 
izations into actionable decisions. The atten- 
tion of the field to key users has broadened to 
include basic science researchers and clinicians 
as well as patients and families. 

As a major provider of health care services, 
as well as a key funder of health care services, 
supporter of biomedical and health related 
research, and guardian of key health quality 
initiatives, the United States federal govern- 
ment plays and will continue to play a signifi- 
cant role in advancing biomedical informatics 
over the next decade. Federal investment will 
lead to advances in data management and pro- 
tection, new ways to draw knowledge out of 
health data, and delivery of better, accurate 
and complete health information at the point 
of need, anywhere. Perspectives of open sci- 
ence, ensuring economic advancement through 
research, and a recognition of the accountabil- 
ity of the government to the taxpayer are 
engendering a new commitment of openness 
and responsiveness to society. 

The National Library of Medicine (NLM), 
one of the 27 institutes and centers at the 
National Institutes of Health, is key among the 
several federal agencies committed to ensuring 
the availability of high-quality data to charac- 
terize patient problems, account for health care 
resource expenditure and foster research driven 
by greater understanding of clinical phenome-
na. The NLM partners with other health-relat-
ed divisions and agencies, including the Centers
for Disease Control and Prevention, the Centers
for Medicare and Medicaid Services, and the Agency
for Healthcare Research and Quality, to rap-
idly respond to public health threats, monitor
health care expenditures and quality, and foster
systemic interoperability. Partnerships between
the NIH and other federal agencies out-
side of the health sector will allow the invest-
ments in biomedical informatics to benefit from
generalized investment in data curation, large-
scale data management and storage, privacy,
and network platforms.

The NLM will do for data what it has done 
for the literature — making them findable, acces- 
sible, interoperable and re-usable (FAIR). 
These attributes, linked under the rubric of the 
FAIR principles, provide guidance for how a 
federal library makes its resources available to 
the public. Making data FAIR requires 
improved curation strategies, ones that balance 
automated approaches with human indexing 
and metadata developments in a way that takes 
advantage of the speed of automation while 
preserving human talent for the most-complex 
cases. The Library-of-the-Future will continue 
to see the NLM serving as the custodian of key 
collections, but also increasing its reach as a 
connector of important information and data 
resources that exist outside of its boundaries. 
Future developments may also lead to a discov- 
ery-on-demand approach to locating and 
obtaining information that has not been previ-
ously archived. The NLM will invest in research
that advances use of these important collec- 
tions and provides novel methodologies to 
interrogate them. 

Some agencies within the Department of
Health and Human Services, such as the Office
of the National Coordinator for Health
Information Technology (ONC), will continue
to invest in broad societal resources to maintain
the health information infrastructure. Other
agencies, such as the Centers for Medicare and
Medicaid Services, will continue to be both data
consumers (payment schemes resting on claims
for health services) and data contributors,
making their information accessible to consum-
ers for enhanced self-monitoring and to
researchers to foster discovery informed by care.
Several trans-federal initiatives are on the hori-
zon, designed to ensure efficient investment in
scalable, re-usable information resources.

The recognition across NIH of the impor-
tance of linking clinical information and biological
data portends expanded investment in the meth- 
ods for curating and integrating information 
across time within a person and across people to 
better understand the health of individuals and 
populations. Rapid growth of data from research 
taxes existing technical capabilities and demands 
additional policy development and financial 
investment to house important data resources. 
The federal government fosters policies that pro- 
tect patient privacy and develop the incentive 
structures to accelerate the adoption of effective 
computer systems for health care. With the rap-
idly growing number of data-generating initiatives, the fed-
eral government must take a critical role in
determining how to best select and preserve the 
full range of information. The federal govern- 
ment will host public discussion and dialogs that 
ensure the clinical information is sufficiently 
broad to reflect the clinical experience of all per- 
sons. It is responsible for ensuring the cross- 
national arrangements needed to keep scientific 
exchange of health data open and free-flowing. 

Interagency coordination is needed to 
ensure that technological advances benefit 
health care and that health dollars leverage 
investment made in other sectors. The primary 
point of coordination is through the 
Networking and Information Technology 
Research and Development (NITRD) 
Program. NITRD is a trans-agency initiative 
designed to provide the research and develop- 
ment foundations for advancing information 
technologies, and also to deploy those tech- 
nologies in the service of the country. The NIH 
reports its technological research and develop- 
ment expenditures to the President through the 
NITRD program. The NIH broadly, and the 
NLM specifically, participate in the many 
workgroups that focus on broad ranging topics 
such as computing-enabled human interac- 
tion, communication and augmentation, 
cybersecurity and privacy, and high capability 
computing infrastructure and applications. 


Federal resources should be spent on those 
things that only the federal government should 
do. These investments include short and long- 
term research and development programs that 
advance the health and well-being of society, edu- 
cating the workforce of the future, and protecting 
key assets in perpetuity. Most of this investment is 
likely to occur through the NLM. In the biomedi- 
cal informatics arena this means investing in 
research to develop methods that are scalable, sus-
tainable and reproducible, creating computation-
al approaches to data management capable of
curation at scale, developing the libraries of the
future that not only encompass literature and
data but also the interim products of research, such
as protocols and ethics and human subjects agree-
ments, as well as novel methods of documenting
research activities, such as the next generation of 
Jupyter notebooks. Development efforts should 
be applied to the ever-growing amount of text- 
based journal articles and reports, to devise new 
and creative ways to expose the literature to a 
variety of publics. Educational programs and 
efforts of the future will infuse data science and 
advanced biomedical informatics lessons not only 
in the training programs of specialists, but across 
the biomedical research and clinical training pro- 
grams, and even extending into equipping patients 
and lay people with access to data and informa- 
tion and tools to make use of those resources. 

It is worth noting two very important
trends that will shape the future of the federal
engagement with health information technolo-
gies. First, there will be an increase in
public-private partnerships to leverage knowledge in
the technical and information technology sec-
tor in support of health care. Such partnerships
should lead to a more robust and interoperable
health information environment. Second, there
will be certain roles that the federal government 
must preserve, such as maintaining accurate 
and freely-accessible information resources for 
the public good and overseeing the develop- 
ment of policies that foster data sharing while 
protecting individual and institutional rights. 

Future federal efforts will be accompanied by 
collaborations with industry. These collabora- 
tions could take the form of joint investments in 
common problems, such as data quality or cura-
tion. Other forms of partnership may emerge that
engage the federal investment for research and 
development with accelerated pathways for tech- 
nology transfer. Including industrial members on 
Federal Advisory Committees will provide path- 
ways for exchange of information. 

The NLM will continue to play a leadership 
role in maintaining accurate and freely-accessi- 
ble information resources. The NLM has taken 
a major step towards this by migrating all of 
our public-facing information resources onto a
common, sustainable technical platform. This 
migration will not only enhance efficiencies but 
also allow for increased interoperability across 
our resources. A common technical platform, 
coupled with enhancement of terminology and 
vocabulary systems, will make it more feasible 
for intended users to traverse the information 
resources housed here. 

In the future there will be an increasing 
role of the federal government in protecting 
and preserving information in perpetuity. 
The enabling legislation of the NLM directs 
it to collect the medical knowledge of the 
time and store it permanently in ways that 
make it accessible for a wide range of users. 
As the largest funder of public health and
health care, the federal government indirectly
shapes what constitutes health information
and how it is used and valued. The federal
government has two key levers for expanding
the definition of what constitutes health.
First, it can invest in research that demon-
strates how health data, including social
and behavioral predictors, bear on what
constitutes health. Second, because of its
role as a major funder of health care through the
CMS, the federal government shapes what is
considered of value in health care; it can, for
example, support research that finds ways to
incorporate the social and behavioral predictors
of health into routine data collection and then
ensure the use of this information in the diag-
nostic, treatment, and evaluation aspects of
the health care process.

The future of biomedical informatics from 
the federal perspective is one characterized by 
openness, partnerships and perpetual storage 
of biomedical knowledge. A vibrant research 
program will be needed to develop and deploy 
the tools needed to accomplish this vision. 
Thoughtful deliberation is essential to protect 
the privacy rights of individuals while fostering 
the greatest degree of sharing of data and 
information needed to achieve the goals 
enabled by data-driven discovery.

Suggested Readings

Cimino, J. J. (2019). Putting the “why” in “EHR”: 
Capturing and coding clinical cognition. 
Journal of the American Medical Informatics 
Association, 26(11), 1379-1384. Cimino iden- 
tifies fundamental changes that will be needed 
to correct the common criticisms of today’s 
electronic health records to transform them 
from glorified billing diaries into true elec- 
tronic assistants. 

Mesko, B. The Medical Futurist. https:// 
medicalfuturist.com/magazine (accessed June 
12, 2020). Mesko’s online magazine (and other 
postings on the Futurist’s web site) provides a 
glimpse of technologies that are currently 
emerging or envisioned for the future, in many 
cases leveraging innovations in biomedical 
engineering or biomedical informatics. 

Topol, E. (2016). The patient will see you now: 
The future of medicine is in your hands. 
New York: Basic Books. Topol envisions the 
future world that follows today’s “Gutenberg
moment.” Much as the printing press took 
learning out of the hands of a special class 
that had access to manuscripts, the Internet 
and modern computing devices are doing the 
same for medicine, giving individuals control 
over their own health care. 

Wachter, R. (2017). The digital doctor: Hope, 
hype, and harm at the dawn of medicine’s 
computer age. New York: McGraw-Hill 
Education. Offers a thoughtful critique of 
today’s modern application of digital technol- 
ogies in health care, identifying today’s limita- 
tions but emphasizing the promise for a greatly 
enhanced world for both patients and physi- 
cians. 


Questions for Discussion
1. How are the advances in bioinformat-
ics likely to affect clinical care and
vice versa?
2. Identify one potential setting for an
informatics “living laboratory”. Who or
what is the subject of evaluation? How
would you “instrument” the setting to
measure activity and performance?
3. Identify one area for informatics educa-
tion and describe the living laboratory
that would support training objectives.

Glossary


21st Century Cures Act A comprehensive bill 
that promotes and funds the acceleration of 
research into preventing and curing serious 
illnesses; accelerates drug and medical device 
development; attempts to address the opioid 
abuse crisis; and tries to improve mental 
health service delivery. It also includes
health IT-related provisions on interoperabil-
ity, data sharing/exchange and electronic
health records. 


Abductive reasoning Can be characterized as
a cyclical process of generating possible expla-
nations or a set of hypotheses that are able to
account for the available data and then evaluating
each of these hypotheses on the basis
of its potential consequences. In this regard,
abductive reasoning is a data-driven process 
that relies heavily on the domain expertise of 
the person. 


Accountability Security function that ensures 
users are responsible for their access to and 
use of information based on a documented 
need and right to know. 


Accountable care A descendant of man- 
aged care, accountable care is an approach 
to improving care and reducing costs. See: 
Accountable Care Organizations. 


Accountable Care Organizations (ACOs) An 
organization of health care providers that 
agrees to be accountable for the quality, cost, 
and overall care of their patients. An ACO
is reimbursed on the basis of managing
the care of a population of patients, with payments
determined by quality scores and reductions
in total costs of care.


ACO See: Accountable Care Organizations.


Active failures Errors that occur in an acute
situation, the effects of which are immediately
felt.


Active phase The phase of a clinical research
study during which investigators collect data
from participants receiving an intervention
or interventions under study. It is also com-
mon to monitor study participants for adverse
events during this phase.


Active storage In a hierarchical data-storage 
scheme, the devices used to store data that 
have long-term validity and that must be 
accessed rapidly. 


Acute Physiology and Chronic Health
Evaluation, Version III (APACHE III) A scoring
system for rating disease severity, particularly
for use in intensive care units.


Adaptive learning Adapting the presenta- 
tion of learning content in response to 
continuous assessment of the learner’s per- 
formance. 


Address An indicator of location; typically 
a number that refers to a specific position in 
a computer’s memory or storage device; see 
also: Internet Address. 


ADE See: Adverse Drug Events. 


Admission-discharge-transfer (ADT) The 
core component of a hospital information 
system that maintains and updates the hos- 
pital census, including bed assignments of 
patients. 


ADT See: Admission-discharge-transfer. 


Advanced Cardiac Life Support A course to 
train providers on the procedure and set of 
clinical interventions for urgent treatment of 
cardiovascular emergencies. 


Advanced Research Projects Agency Network 
(ARPANET) A large wide-area network cre- 
ated in the 1960s by the U.S. Department 
of Defense Advanced Research Projects 
Agency (DARPA) for the free exchange of 
information among universities and research 
organizations; the precursor to today’s 
Internet.


Advanced Trauma Life Support A train-
ing program for medical providers for the
management of acute trauma cases. ATLS 
is developed by the American College of 
Surgeons. 


Adverse drug events (ADEs) Undesired 
patient events, whether expected or unex- 
pected, that are attributed to administration 
of a drug. 


Aggregations In the context of information 
retrieval, collections of content from a variety 
of content types, including bibliographic, full- 
text, and annotated material. 


AHIMA See: American Health Information 
Management Association. 


Alert message A computer-generated warn-
ing issued when a record meets
pre-specified criteria, often referring to a
potentially dangerous situation that may 
require action; e.g., receipt of a new laboratory 
test result with an abnormal value. 
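
As a minimal sketch of this idea, the following Python fragment checks a new laboratory result against a pre-specified range and returns a warning when the value falls outside it. The `LabResult` record, the `check_alert` function, and the message format are hypothetical names introduced only for illustration; real clinical systems use far richer criteria.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LabResult:
    """A hypothetical, simplified laboratory result record."""
    patient_id: str
    test_name: str
    value: float
    low: float   # lower bound of an illustrative reference range
    high: float  # upper bound of an illustrative reference range


def check_alert(result: LabResult) -> Optional[str]:
    """Return an alert message if the value is outside its range, else None."""
    if result.value < result.low:
        return f"ALERT {result.patient_id}: {result.test_name} low ({result.value})"
    if result.value > result.high:
        return f"ALERT {result.patient_id}: {result.test_name} high ({result.value})"
    return None
```

A result inside its range produces no message, which is the simplest way such a rule avoids adding to the alert burden.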


Algorithmic process An algorithm is a well- 
defined procedure or sequence of steps for 
solving a problem. A process that follows 
prescribed steps is accordingly an algorithmic 
process. 


Alphanumeric Descriptor of data that are 
represented as a string of letters and numeric 
digits, without spaces or punctuation. 


Amazon Mechanical Turk Amazon’s crowd- 
sourcing website for businesses or researchers 
(known as Requesters) that allows hiring of 
remotely located “crowdworkers” to perform 
discrete on-demand tasks that computers are 
currently unable to do. 


Ambulatory medical record system (AMRS) A 
clinical information system designed to 
support all information requirements of 
an outpatient clinic, including registration, 
appointment scheduling, billing, order entry, 
results reporting, and clinical documenta-
tion.


American Health Information Management
Association (AHIMA) Professional association 
devoted to the discipline of health informa- 
tion management (HIM). 


American Heart Association A non-profit 
organization dedicated to improving heart 
health. 


American Immunization Registry Association
(AIRA) A membership organization that exists
to promote the development and implementa-
tion of immunization information systems (IIS)
as an important tool in preventing and con-
trolling vaccine-preventable diseases.
https://www.immregistries.org/about-aira


American Medical Informatics Association 
(AMIA) Professional association dedicated to 
biomedical and health informatics. 


American National Standards Institute 
[ANSI] A private organization that oversees 
voluntary consensus standards. 


American Public Health Association 
(APHA) Represents a broad array of health 
professionals and others who care about the 
health of all people and all communities. It is 
the leading not-for-profit public health orga-
nization in the U.S. and seeks to strengthen
the impact of public health professionals
and provide a science-based voice in policy
debates. APHA seeks to advance prevention,
reduce health disparities and promote well-
ness. http://www.apha.org/


American Recovery and Reinvestment Act of 
2009 Public Law 111-5, commonly referred
to as the Stimulus or Recovery Act. This legis-
lation was designed to create jobs quickly and
to invest in the nation’s infrastructure, educa- 
tion and healthcare capabilities. 


American Standard Code for Information 
Interchange (ASCII) A 7-bit code for rep- 
resenting alphanumeric characters and other 
symbols. 
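
A short Python sketch can make the 7-bit property concrete. The helper name `to_ascii_codes` is hypothetical; it relies only on the built-in `ord` and `format` functions, and it rejects any character whose code does not fit in seven bits.

```python
from typing import List


def to_ascii_codes(text: str) -> List[int]:
    """Return the ASCII code of each character; reject non-ASCII input."""
    codes = [ord(ch) for ch in text]
    if any(code > 127 for code in codes):
        raise ValueError("input contains characters outside the 7-bit ASCII range")
    return codes


codes = to_ascii_codes("EHR")
# Render each code as exactly seven binary digits.
bits = [format(code, "07b") for code in codes]
```

Because every valid code is at most 127, seven binary digits are always enough to represent it.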


AMIA See: American Medical Informatics 
Association.


AMRS See: Ambulatory medical record system.


Analog signal A signal that takes on a con- 
tinuous range of values. 


Analog-to-digital conversion (ADC) Conver- 
sion of sampled values from a continuous- 
valued signal to a discrete-valued digital 
representation. 
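
The conversion can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a model of any particular device: the function names, the voltage range of -1.0 to 1.0, the 3-bit resolution, and the 8 Hz sampling rate are all hypothetical choices.

```python
import math
from typing import Callable, List


def quantize(sample: float, v_min: float, v_max: float, n_bits: int) -> int:
    """Map a continuous value in [v_min, v_max] to one of 2**n_bits integer levels."""
    levels = 2 ** n_bits
    clipped = min(max(sample, v_min), v_max)       # out-of-range inputs saturate
    fraction = (clipped - v_min) / (v_max - v_min)
    return min(int(fraction * levels), levels - 1)  # keep the top value in range


def sample_and_convert(signal: Callable[[float], float], duration_s: float,
                       rate_hz: float, n_bits: int) -> List[int]:
    """Sample a continuous signal at a fixed rate and quantize each sample."""
    n_samples = int(duration_s * rate_hz)
    return [quantize(signal(i / rate_hz), -1.0, 1.0, n_bits) for i in range(n_samples)]


# One second of a 1 Hz sine wave, sampled 8 times with 3-bit resolution.
digital = sample_and_convert(lambda t: math.sin(2 * math.pi * t), 1.0, 8.0, 3)
```

Raising `n_bits` gives finer amplitude steps, and raising `rate_hz` gives finer time steps; both increase the amount of data produced.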


Anchoring and adjustment A heuristic used 
when estimating probability, in which a per- 
son first makes a rough approximation (the 
anchor), then adjusts this estimate to account 
for additional information. 


Annotated content In the context of informa- 
tion retrieval, content that has been annotated 
to describe its type, subject matter, and other 
attributes. 


Anonymize Applied to health data and infor- 
mation about a unique individual, the act of 
de-identifying or stripping away any and all 
data which could be used to identify that indi- 
vidual. 

ANSI See: American National Standards
Institute.


Antibiogram Pattern of sensitivity of a micro- 
organism to various antibiotics. 


APACHE III See: Acute Physiology and Chronic
Health Evaluation, Version III. 


Apache Open source Web server software 
that was significant in facilitating the initial 
growth of the World Wide Web. 


Applets Small computer programs that can 
be embedded in an HTML document and 
that will execute on the user’s computer when 
referenced. 


Application program A computer program
that automates routine operations that store
and organize data, perform analyses, facili-
tate the integration and communication of
information, perform bookkeeping functions,
monitor patient status, or aid in education.


Application programming interface (API) A 
specification that enables distinct software 
modules or components to communicate with 
each other. 


Applications (applied) research Systematic 
investigation or experimentation with the 
goal of applying knowledge to achieve practi- 
cal ends. 


Apps Software applications, especially ones 
downloaded to mobile devices. 


Archival storage In a hierarchical data- 
storage scheme, the devices used to store data 
for long-term backup, documentary, or legal
purposes. 


Arden Syntax for Medical Logic Module A 
coding scheme or language that provides a 
canonical means for writing rules that relate 
specific patient situations to appropriate 
actions for practitioners to follow. The Arden 
Syntax standard is maintained by HL7. 


Argument A word or phrase that helps com- 
plete the meaning of a predicate. 


ARPANET See: Advanced Research Projects
Agency Network. 


Artificial intelligence (AI) The branch of com-
puter science concerned with endowing com-
puters with the ability to simulate intelligent
human behavior. 


Artificial neural network A computer pro- 
gram that performs classification by taking as 
input a set of findings that describe a given 
situation, propagating calculated weights 
through a network of several layers of inter- 
connected nodes, and generating as output 
a set of numbers, where each output corre- 
sponds to the likelihood of a particular clas- 
sification that could explain the findings. 


ASCII See: American Standard Code for 
Information Interchange. 


Assembler A computer program that trans- 
lates assembly-language programs into 
machine-language instructions. 


Assembly language A low-level language for 
writing computer programs using symbolic 
names and addresses within the computer’s 
memory. 


Association of American Medical Colleges 
(AAMC) A non-profit organization that 
includes all US and Canadian medical col- 
leges and many teaching hospitals, and sup- 
ports them in their education and research 
mission. 


Asynchronous Transfer Mode (ATM) A net- 
work protocol designed for sending streams 
of small, fixed length cells of information over 
very high-speed, dedicated connections, often 
digital optical circuits. 


Audit trail A chronological record of all 
accesses and changes to data records, often 
used to promote accountability for use of, and 
access to, medical data. 


Augmented reality Imposition of synthetic 
three-dimensional and text information on 
top of a view of the real world seen through 
specialized glasses worn by the learner. 


Authenticated A process for positive and 
unique identification of users, implemented 
to control system access. 


Authorized Within a system, a process for 
limiting user activities only to actions defined 
as appropriate based on the user’s role. 


Automated indexing The most common 
method of full-text indexing; words in a 
document are stripped of common suffixes, 
entered as items in the index, then assigned 
weights based on their ability to discrimi- 
nate among documents (see vector-space 
model). 


Availability In decision making, a heuristic 
method by which a person estimates the prob- 
ability of an event based on the ease with 
which he can recall similar events. In secu- 
rity systems, a function that ensures delivery 
of accurate and up-to-date information to 
authorized users when needed. 


Averaging out at chance nodes The process 
by which each chance node of a decision tree 
is replaced in the tree by the expected value of 
the event that it represents. 
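
As a minimal sketch of averaging out, the Python function below replaces each chance node by the probability-weighted average of its branches. The tree encoding (a plain number for an outcome, a dict with a "chance" key for a chance node) and the surgery example are assumptions made for illustration, not taken from the text.

```python
def average_out(node):
    """Return the expected value of a decision-tree node.

    A leaf is a number (the value of an outcome). A chance node is a dict
    of the form {"chance": [(probability, child), ...]}. This encoding is
    hypothetical and chosen only for this sketch.
    """
    if isinstance(node, (int, float)):
        return float(node)
    branches = node["chance"]
    total_p = sum(p for p, _ in branches)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    # Expected value: sum of probability times the averaged-out value of each branch.
    return sum(p * average_out(child) for p, child in branches)


# Hypothetical example: 10% operative death (0 years), 90% survival (20 years).
surgery = {"chance": [(0.10, 0.0), (0.90, 20.0)]}
expected_years = average_out(surgery)
print(expected_years)
```

Because the function is recursive, nested chance nodes are averaged out from the leaves toward the root, which mirrors how a decision tree is evaluated by hand.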


Backbone links Sections of high-capacity 
trunk (backbone) network that interconnect 
regional and local networks. 


Backbone Network A high-speed commu- 
nication network that carries major traffic 
between smaller networks. 


Background question A question that asks 
for general information on a topic (see also: 
foreground question). 


Backward chaining Also known as goal- 
directed reasoning. A form of inference used 
in rule-based systems in which the inference 
engine determines whether the premise (left- 
hand side) of a given rule is true by invok- 
ing other rules that can conclude the values 
of variables that currently are unknown and 
that are referenced in the premise of the given 
rule. The process continues recursively until 
all rules that can supply the required values 
have been considered. 
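
The recursion described above can be sketched in a few lines of Python. This is a simplified illustration, assuming that variables are Boolean facts and that each rule is a (premises, conclusion) pair; the rule set and fact names are hypothetical and are not drawn from any real rule-based system.

```python
def backward_chain(goal, rules, facts, pending=frozenset()):
    """Return True if `goal` can be established from `facts` using `rules`.

    rules: list of (premises, conclusion) pairs, where premises is a list
    of names that must all be established for the rule to conclude.
    pending: goals already being pursued, used to stop circular reasoning.
    """
    if goal in facts:
        return True
    if goal in pending:
        return False
    pending = pending | {goal}
    for premises, conclusion in rules:
        # Only rules that can conclude the goal are invoked; each unknown
        # premise becomes a new subgoal and is pursued recursively.
        if conclusion == goal and all(
            backward_chain(p, rules, facts, pending) for p in premises
        ):
            return True
    return False


# Hypothetical rules for illustration only.
RULES = [
    (["fever", "productive_cough"], "suspect_pneumonia"),
    (["suspect_pneumonia", "infiltrate_on_xray"], "pneumonia"),
]
known = {"fever", "productive_cough", "infiltrate_on_xray"}
print(backward_chain("pneumonia", RULES, known))
```

Starting from the goal rather than from the data means that only rules relevant to the goal are ever examined, which is the defining property of goal-directed reasoning.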


Bag-of-words A language model where text is 
represented as a collection of words, indepen- 
dent of each other and disregarding word order. 
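
A short Python sketch makes the "disregarding word order" property concrete. The tokenization rule used here (lowercase runs of letters and digits) is an assumption chosen for brevity; real systems usually tokenize more carefully.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Represent text as word counts, ignoring word order."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


first = bag_of_words("Patient denies chest pain")
second = bag_of_words("chest pain denies patient")
print(first == second)
```

The two sentences produce identical bags even though their word order differs, which is both the strength of the model (simplicity) and its main limitation.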


Bandwidth The capacity for information 
transmission; the number of bits that can be 
transmitted per unit of time. 


Baseline rate Population: the prevalence of the 
condition under consideration in the population 
from which the subject was selected. Individual: 
the frequency, rate, or degree of a condition 
before an intervention or other perturbation. 


Basic Local Alignment Search Tool (BLAST) An 
algorithm for determining optimal genetic 
sequence alignments based on the obser- 
vations that sections of proteins are often 
conserved without gaps and that there are 
statistical analyses of the occurrence of small 
subsequences within larger sequences that 
can be used to prune the search for matching 
sequences in a large database. 


Basic research Systematic investigation or 
experimentation with the goal of discovering 
new knowledge, often by proposing new gen- 
eralizations from the results of several experi- 
ments. 


Basic science The enterprise of performing 
basic research. 


Bayes’ theorem An algebraic expression often 
used in clinical diagnosis for calculating the post-test 
probability of a condition (a disease, for 
example) if the pretest probability (prevalence) 
of the condition, as well as the sensitivity 
and specificity of the test, are known (also 
called Bayes’ rule). Bayes’ theorem also has 
broad applicability in other areas of biomedical 
informatics where probabilistic inference is 
pertinent, including the interpretation of data 
in bioinformatics. 
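
As a worked illustration, the Python sketch below applies the standard form of Bayes' theorem for a positive test result: post-test probability = (sensitivity × pretest) / (sensitivity × pretest + (1 − specificity) × (1 − pretest)). The function name and the numeric values are assumptions chosen for the example.

```python
def post_test_probability_positive(pretest, sensitivity, specificity):
    """Probability of disease given a positive test result (Bayes' theorem)."""
    for value in (pretest, sensitivity, specificity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    true_positive = sensitivity * pretest
    false_positive = (1.0 - specificity) * (1.0 - pretest)
    if true_positive + false_positive == 0.0:
        raise ValueError("a positive result is impossible with these inputs")
    return true_positive / (true_positive + false_positive)


# Hypothetical test: pretest probability 10%, sensitivity 90%, specificity 80%.
print(round(post_test_probability_positive(0.10, 0.90, 0.80), 3))
```

With these illustrative numbers, a positive result raises the probability of disease from 0.10 to about 0.33, which shows how strongly the pretest probability limits what a single test can establish.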


Bayesian diagnosis program A computer- 
based system that uses Bayes’ theorem to 
assist a user in developing and refining a dif- 
ferential diagnosis. 


Before-after study (aka Historically con- 
trolled study) A study in which the evaluator 
attempts to draw conclusions by comparing 
measures made during a baseline period prior 
to the information resource being available 
and measures made after it has been imple- 
mented. 


Behaviorism A social science framework for 
analyzing and modifying behavior. 


Belief network A diagrammatic representa- 
tion used to perform probabilistic inference; 
an influence diagram that has only chance 
nodes. 


Best of breed An information technology 
strategy that favors the selection of individual 
applications based on their specific function- 
ality rather than a single application that inte- 
grates a variety of functions. 


Best of cluster A variant of the “best of breed” 
strategy in which a single vendor is selected 
for a group of similar departmental systems, 
such as laboratory, pharmacy, and radiology. 


Bibliographic content In information 
retrieval, information abstracted from the 
original source. 


Bibliographic database A collection of cita- 
tions or pointers to the published literature. 


Binary The condition of having only two val- 
ues or alternatives. 


Biobank A repository for biological materi- 
als that collects, processes, stores, and distrib- 
utes biospecimens (usually human) for use in 
research. 


Biocomputation The field encompassing the 
modeling and simulation of tissue, cell, and 
genetic behavior; see biomedical computing. 


Bioinformatics The study of how information 
is represented and transmitted in biological sys- 
tems, starting at the molecular level. 


Biomarker A characteristic that is objectively 
measured and evaluated as an indicator of 
normal biological processes, pathogenic pro- 
cesses, or pharmacologic responses to a thera- 
peutic intervention. 


Biomed Central An independent publishing 
house specializing in the publication of electronic 
journals in biomedicine (see 
www.biomedcentral.com). 


Biomedical computing The use of computers 
in biology or medicine. 


Biomedical engineering An area of engineer- 
ing concerned primarily with the research and 
development of biomedical instrumentation 
and biomedical devices. 


Biomedical informatics The interdisciplinary 
field that studies and pursues the effective uses 
of biomedical data, information, and knowl- 
edge for scientific inquiry, problem solving, 
and decision making, driven by efforts to 
improve human health. 


Biomedical Information Science and 
Technology Initiative (BISTI) An initiative 
launched by the NIH in 2000 to make opti- 
mal use of computer science, mathematics, 
and technology to address problems in biol- 
ogy and medicine. It includes a consortium 
of senior-level representatives from each of 
the NIH institutes and centers plus represen- 
tatives of other Federal agencies concerned 
with biocomputing. See: http://www.bisti.nih.gov. 


Biomedical taxonomy A formal system for 
naming entities in biomedicine. 


Biomolecular imaging A discipline at the 
intersection of molecular biology and in vivo 
imaging, it enables the visualisation of cellular 
function and the follow-up of the molecular 
processes in living organisms without perturb- 
ing them. 


Biopsychosocial model A model of medi- 
cal care that emphasizes not only an under- 
standing of disease processes, but also the 
psychological and social conditions of the 
patient that affect both the disease and its 
therapy. 


Biosample Biological source material used in 
experimental assays. 


Biosurveillance A public health activity that 
monitors a population for occurrence of a 
rare disease or increased occurrence of a common 
one. Also see Public Health Surveillance 
and Surveillance. 


Bit The logical atomic element for all digital 
computers. 


Bit depth The number of bits that represent an 
individual pixel in an image; the more bits, the 
more intensities or colors can be represented. 


Bit rate The rate of information transfer; a 
function of the rate at which signals can be 
transmitted and the efficacy with which digi- 
tal information is encoded in the signal. 


BLAST See: Basic Local Alignment Search 
Tool. 


Blinding In the context of clinical research, 
blinding refers to the process of obfuscat- 
ing from the participant and/or investigator 
what study intervention a given participant is 
receiving. This is commonly done to reduce 
study biases. 


Blog A type of Web site that provides discus- 
sion or information on specific topics. 


Blue Button A feature of the Veterans 
Administration’s VistA system that exports 
a patient’s entire record in electronic form. 


Bluetooth A standard for the short-range 
wireless interconnection of mobile phones, 
computers, and other electronic devices. 


Body The portion of a simple electronic mail 
message that contains the free-text content of 
the message. 


Body of knowledge An information resource 
that encapsulates the knowledge of a field or 
discipline. 


Boolean operators The mathematical opera- 
tors and, or, and not, which are used to com- 
bine index terms in information retrieval 
searching. 


Boolean searching A search method in which 
search criteria are logically combined using 
and, or, and not operators. 


Bootstrap A small set of initial instruc- 
tions that is stored in read-only memory and 
executed each time a computer is turned on. 
Execution of the bootstrap is called booting 
the computer. By analogy, the process of 
starting larger computer systems. 


Bottom-up An algorithm for analyzing small 
pieces of a problem and building them up into 
larger components. 


Bound morpheme A morpheme that 
creates a different form of a word but 
must always occur with another morpheme 
(e.g., -ed, -s). 


B-pref A method for measuring retrieval per- 
formance in which documents without rel- 
evance judgments are excluded. 


Bridge A device that links or routes signals 
from one network to another. 


Broadband A data-transmission technique 
in which multiple signals may be transmit- 
ted simultaneously, each modulated within an 
assigned frequency range. 


Browsing Scanning a database, a list of files, 
or the Internet, either for a particular item or 
for anything that seems to be of interest. 


Bundled payments In the healthcare context, 
refers to the practice of reimbursing provid- 
ers based on the total expected costs of a par- 
ticular episode of care. Generally occupies a 
“middle ground” between fee-for-service and 
capitation mechanisms. 


Business logic layer A conceptual level of 
system architecture that insulates the appli- 
cations and processing components from the 
underlying data and the user interfaces that 
access the data. 


Buttons Graphic elements within a dialog 
box or user-selectable areas within an HTML 
document that, when activated, perform a 
specified function (such as invoking other 
HTML documents and services). 


C statistic The area under a receiver operating 
characteristic (ROC) curve. 
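
The area under the ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The Python sketch below computes the C statistic directly from that pairwise interpretation; the function name and the sample scores are assumptions for illustration, and the quadratic loop is meant for clarity rather than efficiency.

```python
def c_statistic(scores, labels):
    """Area under the ROC curve, computed by comparing every
    positive/negative pair of cases (labels are 1 or 0)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0   # positive case ranked above negative case
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(positives) * len(negatives))


# Hypothetical predicted risks and true outcomes.
print(c_statistic([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of positive from negative cases.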


CAD See: Computer-aided diagnosis. 


Cadaver An embalmed human body used for 
teaching anatomy through the process of dis- 
secting tissue. 


Canonical form A preferred string or name 
for a term or collection of names; the canoni- 
cal form may be determined by a set of rules 
(e.g., “all capital letters with words sorted in 
alphabetical order”) or may be simply chosen 
arbitrarily. 


Capitated payments System of health-care 
reimbursement in which providers are paid a 
fixed amount per patient to take care of all the 
health needs of a population of patients. 


Capitation Payments to providers, typically 
on an annual basis, in return for which the 
clinicians provide all necessary care for the 
patient and do not submit additional fee-for- 
service bills. 


Cardiac output A measure of blood volume 
pumped out of the left or right ventricle of the 
heart, expressed as liters per minute. 


Care coordinator See: Case Manager. 


Care plan A document that provides direction 
for individualized patient care. 


Cascading finite state automata (FSA) A tag- 
ging method in natural language processing 
in which a series of finite state automata are 
employed such that the output of one FSA 
becomes the input for another. 


Case Refers to the capitalization of letters in 
a word. 


Case manager A person in charge of coordi- 
nating all aspects of a patient’s care. 


CCD See: Continuity of Care Document. 


CCOW See: Clinical Context Object 
Workgroup. 


CDC See Centers for Disease Control and 
Prevention. 


CDE See Common Data Element. 


CDR See: Clinical data repository. 


CDS Hooks A technical approach designed to 
invoke external CDS services from within the 
EHR workflow based upon a triggering event. 
Services may be in the form of (a) information 
cards — provide text for the user to read; (b) 
suggestion cards — provide a specific sugges- 
tion for which the EHR renders a button that 
the user can click to accept, with subsequent 
population of the change into the EHR user 
interface; and (c) app link cards — provide a 
link to an app. 
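
To make the three card types concrete, the Python sketch below assembles a service response as plain dictionaries. The field names (cards, summary, indicator, source, suggestions, links) follow a reading of the published CDS Hooks specification and should be checked against the current version; the helper functions, card text, and URL are hypothetical, and a real suggestion would normally also carry the actions to apply.

```python
import json
import uuid


def information_card(summary, source_label):
    """Card that only presents text for the user to read."""
    return {
        "summary": summary,
        "indicator": "info",
        "source": {"label": source_label},
    }


def suggestion_card(summary, source_label, suggestion_label):
    """Card offering a suggestion that the EHR can render as a button."""
    card = information_card(summary, source_label)
    card["suggestions"] = [{"label": suggestion_label, "uuid": str(uuid.uuid4())}]
    return card


def app_link_card(summary, source_label, link_label, url):
    """Card that links out to an app."""
    card = information_card(summary, source_label)
    card["links"] = [{"label": link_label, "url": url, "type": "absolute"}]
    return card


# Hypothetical content for illustration only.
response = {
    "cards": [
        information_card("Patient is due for influenza vaccination", "Example CDS service"),
        suggestion_card("Consider a lower dose", "Example CDS service", "Change dose"),
        app_link_card("Review risk score", "Example CDS service",
                      "Open risk calculator", "https://example.org/app"),
    ]
}
print(json.dumps(response, indent=2))
```

Keeping the response as simple JSON-serializable data reflects the design idea in the definition: the external service decides what to say, and the EHR decides how to render it within the workflow.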


CDSS See: Clinical decision-support system. 


CDW See: Clinical data warehouse. 


Cellular imaging Imaging methods that visu- 
alize cells. 


Centers for Medicare & Medicaid Services 
(CMS) A federal agency within the United 
States Department of Health and Human 
Services that administers the Medicare pro- 
gram and works in partnership with state 
governments to administer Medicaid, the 
Children’s Health Insurance Program, and 
health insurance portability standards. In 
addition to these programs, CMS has other 
responsibilities, including the administra- 
tive simplification standards from the Health 
Insurance Portability and Accountability Act 
of 1996 (HIPAA). 


Centering theory A theory that attempts to 
explain what entities are indicated by referen- 
tial expressions (such as pronouns) by noting 
how the center (focus of attention) of each 
sentence changes across the text. 


Centers for Disease Control and Prevention 
(CDC) An agency within the US Department 
of Health and Human Services that provides 
the public with health information and pro- 
motes health through partnerships with state 
health departments and other organizations. 


Central computer system A single system that 
handles all computer applications in an 
institution using a common set of databases and 
interfaces. 


Central processing unit (CPU) The “brain” of 
the computer. The CPU executes a program 
stored in main memory by fetching and exe- 
cuting instructions in the program. 


Central Test Node (CTN) DICOM software 
to foster cooperative demonstrations by the 
medical imaging vendors. 


Certificate Coded authorization information 
that can be verified by a certification author- 
ity to grant system access. 


Challenge evaluation An evaluation of infor- 
mation systems, often in the field of informa- 
tion retrieval or related areas, that provides a 
public test collection or gold standard data 
collection for various researchers to compare 
and analyze results. 


Chance node A symbol that represents a 
chance event. By convention, a chance node is 
indicated in a decision tree by a circle. 


Character sets and encodings Tables of 
numeric values that correspond to sets of 
printable or displayable characters. ASCII is 
one example of such an encoding. 


Chart parsing A dynamic programming algo- 
rithm for structuring a sentence according to 
grammar by saving and reusing segments of 
the sentence that have been parsed. 


Chat A synchronous mode of text-based 
communication. 


Check tags In MeSH, terms that represent 
certain facets of medical studies, such as age, 
gender, human or nonhuman, and type of 
grant support; check tags provide additional 
indexing of bibliographic citations in data- 
bases such as Medline. 


CHI See: Consumer health informatics. 


CHIN See: Community Health Information 
Network. 


Chunking A natural language processing 
method for determining non-recursive 
phrases where each phrase corresponds to a 
specific part of speech. 


CINAHL (or CINHL) See: Cumulative Index to 
Nursing and Allied Health Literature. 


CINAHL Subject Headings A set of terms 
based on MeSH, with additional domain- 
specific terms added, used for indexing the 
Cumulative Index to Nursing and Allied 
Health Literature (CINAHL). 


CIS See: Clinical information system. 


Citation database A database of citations 
found in scientific articles, showing the linkages 
among articles in the scientific literature. 


Classification In image processing, the cat- 
egorization of segmented regions of an image 
based on the values of measured parameters, 
such as area and intensity. 


Classroom Technologies All technology used 
in a classroom setting including projection 
of two-dimensional slides or views of three- 
dimensional objects, electronic markup of a 
screen presentation, real time feedback sys- 
tems such as class polling, and digital record- 
ing of a class session. 


CLIA certification See: Clinical Laboratory 
Improvement Amendments of 1988 
Certification. 


Client-server Information processing inter- 
action that distributes application processing 
between a local computer (the client) and a 
remote computer resource (the server). 


Clinical and translational research A broad 
spectrum of research activities involving the 
translation of findings from initial laboratory-based 
studies into early-stage clinical 
studies, and subsequently, from the findings 
of those studies into clinical and/or population-level 
practice. This broad area incorporates 
multiple biomedical informatics sub-domains, 
including both translational bioinformatics 
and clinical research informatics. 


Clinical Context Object Workgroup (CCOW) A 
common protocol for single sign-on imple- 
mentations in health care. It allows multiple 
applications to be linked together, so the end 
user only logs in and selects a patient in one 
application, and those actions propagate to 
the other applications. 


Clinical data repository (CDR) Clinical data- 
base optimized for storage and retrieval 
for individual patients and used to support 
patient care and daily operations. 


Clinical data warehouse (CDW) A database of 
clinical data obtained from primary sources 
such as electronic health records, organized for 
re-use for secondary purposes. 


Clinical datum Replaces medical datum with 
same definition. 


Clinical decision support Any process that 
provides health-care workers and patients 
with situation-specific knowledge that can 
inform their decisions regarding health and 
health care. 


Clinical decision-support system (CDSS) A 
computer-based system that assists physicians 
in making decisions about patient care. 


Clinical Document Architecture An HL7 
standard for naming and structuring clinical 
documents, such as reports. 


Clinical expert system A computer program 
designed to provide decision support for 
diagnosis or therapy planning at a level of 
sophistication that an expert physician might 
provide. 


Clinical guidelines Systematically developed 
statements to assist practitioner and patient 
decisions about appropriate health care for 
specific clinical circumstances. 


Clinical informatics The application of 
biomedical informatics methods in the patient- 
care domain; a combination of computer 
science, information science, and clinical sci- 
ence designed to assist in the management 
and processing of clinical data, information, 
and knowledge to support clinical practice. 


Clinical information system (CIS) The com- 
ponents of a health-care information system 
designed to support the delivery of patient 
care, including order communications, results 
reporting, care planning, and clinical docu- 
mentation. 


Clinical judgment Decision making by cli- 
nicians that incorporates professional expe- 
rience and social, ethical, psychological, 
financial, and other factors in addition to the 
objective medical data. 


Clinical Laboratory Improvement Amendments 
of 1988 certification Certification under the Clinical 
Laboratory Improvement Amendments of 1988, 
which establish laboratory testing quality standards 
to ensure the accuracy, reliability, and 
timeliness of patient test results, regardless of 
where the test was performed. 


Clinical modifications A published set of 
changes to the International Classification of 
Diseases (ICD) that provides additional levels 
of detail necessary for statistical reporting in 
the United States. 


Clinical pathway Disease-specific plan that 
identifies clinical goals, interventions, and 
expected outcomes by time period. 


Clinical Quality Language An expression lan- 
guage standardized by HL7 that is used to 
characterize both quality measure logic and 
decision-support logic. 


Clinical research The range of studies and tri- 
als in human subjects that fall into the three 
sub-categories: (1) Patient-oriented research: 
Research conducted with human subjects 
(or on material of human origin such as tis- 
sues, specimens and cognitive phenomena) for 
which an investigator (or colleague) directly 
interacts with human subjects. 
Patient-oriented research includes: (a) mechanisms 
of human disease; (b) therapeutic interventions; 
(c) clinical trials; and (d) development 
of new technologies. (2) Epidemiologic and 
behavioral studies. (3) Outcomes research and 
health services research. 


Clinical research informatics (CRI) The appli- 
cation of biomedical informatics methods in 
the clinical research domain to support all 
aspects of clinical research, from hypothesis 
generation, through study design, study exe- 
cution and data collection, data analysis, and 
dissemination of results. 


Clinical Research Management System 
(CRMS) A clinical research management sys- 
tem is a technology platform that supports 
and enables the conduct of clinical research, 
including clinical trials, usually through a 
combination of functional modules target- 
ing the preparatory, enrollment, active, and 
dissemination phases of such research pro- 
grams. CRMS systems are often also referred 
to as Clinical Trials Management Systems 
(CTMS), particularly when they are used to 
manage only clinical trials rather than various 
types of clinical research. 


Clinical subgroup A subset of a population in 
which the members have similar characteris- 
tics and symptoms, and therefore similar like- 
lihood of disease. 


Clinical trials Research projects that involve 
the direct management of patients and are 
generally aimed at determining optimal modes 
of therapy, evaluation, or other interventions. 


Clinical-event monitors Systems that elec- 
tronically and automatically record the occur- 
rence or changes of specific clinical events, 
such as blood pressure, respiratory capability, 
or heart rhythms. 


Clinically relevant population The population 
of patients that is seen in actual practice. In 
the context of estimating the sensitivity and 
specificity of a diagnostic test, that group of 
patients in whom the test actually will be used. 


Closed loop Regulation of a physiological 
variable, such as blood pressure, by monitor- 
ing the value of the variable and altering ther- 
apy without human intervention. 


Closed loop medication management sys- 
tem A workflow process (typically supported 
electronically) through which medications are 
ordered electronically by a physician, filled by 
the pharmacy, delivered to the patient, admin- 
istered by a nurse, and subsequently moni- 
tored for effectiveness by the physician. 


Cloud technology or computing Cloud com- 
puting is using computing resources located 
in a remote location. Typically, cloud com- 
puting is provided by a separate business, and 
the user pays for it on a per-usage basis. There 
are variations such as private clouds, where 
the “cloud” is provided by the same business, 
but leverages methods that permit easier vir- 
tualization and expandability than traditional 
methods. Private clouds are popular with 
healthcare because of security concerns with 
public cloud computing. 


Clustering algorithms A method which 
assigns a set of objects into groups (called 
clusters) so that the objects in the same cluster 
are more similar (in some sense or another) to 
each other than to those in other clusters. 


CMS See: Centers for Medicare & Medicaid 
Services. 


Coaching system An intelligent tutoring sys- 
tem that monitors the session and intervenes 
only when the student requests help or makes 
serious mistakes. 


Cocke-Younger-Kasami (CYK) A dynamic 
programming method that uses bottom-up 
rules for parsing text with a context-free 
grammar; used only in conjunction with a 
grammar written in Chomsky normal form. 
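
The following Python sketch shows CYK recognition for a context-free grammar in Chomsky normal form, where every rule is either A -> word or A -> B C. The way the grammar is encoded (two dictionaries) and the toy grammar itself are assumptions made for illustration.

```python
def cyk_recognize(words, lexical, binary, start="S"):
    """Return True if `words` can be derived from `start`.

    lexical: dict mapping a word to the set of nonterminals A with A -> word.
    binary:  dict mapping a pair (B, C) to the set of nonterminals A with A -> B C.
    """
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(lexical.get(word, ()))
    # Build longer spans bottom-up from shorter ones that were already parsed.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for left in table[i][k]:
                    for right in table[k + 1][j]:
                        table[i][j] |= binary.get((left, right), set())
    return start in table[0][n - 1]


# Hypothetical toy grammar: S -> NP VP, VP -> V NP, NP -> patient | aspirin, V -> takes.
LEXICAL = {"patient": {"NP"}, "aspirin": {"NP"}, "takes": {"V"}}
BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}
print(cyk_recognize(["patient", "takes", "aspirin"], LEXICAL, BINARY))
```

The table is what makes this dynamic programming: each span is analyzed once and reused by every larger span that contains it, giving running time cubic in sentence length.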


Code As a verb, to write a program. As a 
noun, the program itself. 


Cognitive artifacts Human-made materials, 
devices, and systems that extend people’s 
abilities in perceiving objects, encoding and 
retrieving information from memory, and 
problem-solving. 


Cognitive engineering An interdisciplinary 
approach to the development of principles, 
methods and tools to assess and guide the 
design of computerized systems to support 
human performance. 


Cognitive heuristics Mental processes by 
which we learn, recall, or process information; 
rules of thumb. 


Cognitive Informatics (CI) An interdisciplinary 
field consisting of cognitive and information 
sciences, specifically focusing on human 
information processing, mechanisms and pro- 
cesses within the context of computing and 
computer applications. The focus of CI is on 
understanding work processes and activities 
within the context of human cognition and 
the design of interventional solutions (often 
engineering, computing and information 
technology solutions). 


Cognitive load An excess of information that 
competes for few cognitive resources, creating 
a burden on working memory. 


Cognitive science Area of research concerned 
with studying the processes by which people 
think and behave. 


Cognitive task analysis The analysis of both 
the information-processing demands of a task 
and the kinds of domain-specific knowledge 
required to perform it, used to study human 
performance. 


Cognitive walkthrough (CW) An analytic 
method for characterizing the cognitive pro- 
cesses of users performing a task. The method 
is performed by an analyst or group of ana- 
lysts “walking through” the sequence of 
actions necessary to achieve a goal, thereby 
seeking to identify potential usability prob- 
lems that may impede the successful comple- 
tion of a task or introduce complexity in a 
way that may frustrate users. 


Collaborative workspace A virtual environment 
in which multiple participants can interact, 
synchronously or asynchronously, to 
perform a collaborative task. 


Color resolution A measure of the ability to 
distinguish among different colors (indicated 
in a digital image by the number of bits per 
pixel). Three sets of multiple bits are required 
to specify the intensity of red, green, and blue 
components of each pixel color. 


Commodity internet A general-purpose con- 
nection to the Internet, not configured for any 
particular purpose. 


Common Data Elements (CDEs) Standards for 
data that stipulate the methods by which the 
data are collected and the controlled termi- 
nologies used to represent them. Many stan- 
dard sets of CDEs have been developed, often 
overlapping in nature. 


Communication Data transmission and infor- 
mation exchange between computers using 
accepted protocols via an exchange medium 
such as a telephone line or fiber optic cable. 


Community Health Information Network 
(CHIN) A computer network developed for 
exchange of sharable health information 
among independent participant organizations 
in a geographic area (or community). 


Comparative effectiveness research A form 
of clinical research that compares outcomes 
of two or more interventions to determine 
if one is statistically superior to another. 


Compiler A program that translates a pro- 
gram written in a high-level programming 
language to a machine-language program, 
which can then be executed. 


Comprehensibility and control Security func- 
tion that ensures that data owners and data 
stewards have effective control over informa- 
tion confidentiality and access. 


Computational biology The science of computer-based 
mathematical and statistical techniques 
to analyze biological systems. See also 
bioinformatics. 


Computed check A procedure applied to 
entered data that detects errors based on 
whether values have the correct mathematical 
relationship (e.g., white blood cell differential 
counts, reported as percentages, must sum to 
100). 
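
The white blood cell example can be sketched as a small Python check. The function name, the field names, and the rounding tolerance of 0.5 percentage points are assumptions for illustration; a production system would choose its own tolerance.

```python
def differential_sums_to_100(percentages, tolerance=0.5):
    """Computed check: differential percentages must total 100
    (within a small tolerance that allows for rounding)."""
    return abs(sum(percentages.values()) - 100.0) <= tolerance


# Hypothetical entered values.
entered = {
    "neutrophils": 60,
    "lymphocytes": 30,
    "monocytes": 6,
    "eosinophils": 3,
    "basophils": 1,
}
print(differential_sums_to_100(entered))
```

A failed check does not say which value is wrong, only that the set of values is mutually inconsistent, so the usual response is to prompt the user to re-enter or confirm the data.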


Computed tomography (CT) An imag- 
ing modality in which X rays are projected 
through the body from multiple angles and 
the resultant absorption values are analyzed 
by a computer to produce cross-sectional 
slices. 


Computer architecture The basic structure of 
a computer, including memory organization, 
a scheme for encoding data and instructions, 
and control mechanisms for performing com- 
puting operations. 


Computer memories Store programs and 
data that are being used actively by a CPU. 


Computer program A set of instructions that 
tells a computer which mathematical and logical 
operations to perform. 


Computer simulated patient See Virtual 
patient. 


Computer-aided diagnosis (CAD) Any form 
of diagnosis in which a computer program 
helps suggest or rank diagnostic consider- 
ations. 


Computer-based (or computerized) physician 
order entry (CPOE) A clinical information sys- 
tem that allows physicians and other clinicians 
to record patient-specific orders for commu- 
nication to other patient care team members 
and to other information systems (such as 
test orders to laboratory systems or medica- 
tion orders to pharmacy systems). Sometimes 
called provider order entry or practitioner 
order entry to emphasize such systems’ uses 
by clinicians other than physicians. 


Computer-based patient records (CPRs) An 
early name for electronic health records 
(EHRs) dating to the early 1990s. 


Concept A unit of thought made explicit 
through the representation of properties of an 
object or a set of common objects. An abstract 
idea generalized from specific instances of 
objects that occur in the world. 


Conceptual graph A formal notation in which 
knowledge is represented through explicit 
relationships between concepts. Graphs can 
be depicted with diagrams consisting of 
shapes and arrows, or in a text format.


Conceptual knowledge Knowledge about
concepts.


Concordant test results Test results that 
reflect the true patient state (true-positive and 
true-negative results).


Conditional probability The probability of 
an event, contingent on the occurrence of 
another event. 


Conditionally independent Two events, A 
and B, are conditionally independent if the 
occurrence of one does not influence the 
probability of the occurrence of the other, 
when both events are conditioned on a third 
event C. Thus, p[A | B,C] = p[A | C] and p[B 
| A,C] = p[B | C]. The conditional probability 
of two conditionally independent events both 
occurring is the product of the individual
conditional probabilities: p[A,B | C] = p[A 
| C] x p[B | C]. For example, two tests for a 
disease are conditionally independent when 
the probability of the result of the second test 
does not depend on the result of the first test, 
given the disease state. For the case in which 
disease is present, p[second test positive | first 
test positive and disease present] = p[second 
test positive | first test negative and disease 
present] = p[second test positive | disease 
present]. More succinctly, the tests are con- 
ditionally independent if the sensitivity and 
specificity of one test do not depend on the 
result of the other test (See independent). 
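
The product rule in this definition is what allows two test results to be combined in a single Bayesian update. The sketch below assumes two tests that are conditionally independent given the disease state; the function name and the numeric values are hypothetical.

```python
def posterior_after_two_positives(prior, sens1, spec1, sens2, spec2):
    """Probability of disease after two positive results.

    Assumes the tests are conditionally independent given disease state,
    so each joint likelihood is the product of the individual ones.
    """
    # p[both positive | disease] = p[T1+ | disease] x p[T2+ | disease]
    p_both_given_disease = sens1 * sens2
    # p[both positive | no disease] = (1 - spec1) x (1 - spec2)
    p_both_given_no_disease = (1 - spec1) * (1 - spec2)
    numerator = prior * p_both_given_disease
    return numerator / (numerator + (1 - prior) * p_both_given_no_disease)


# Illustrative numbers: 10% prior, both tests 90% sensitive and 80% specific.
print(posterior_after_two_positives(0.10, 0.9, 0.8, 0.9, 0.8))
```

If the tests were not conditionally independent, the two products above would no longer equal the joint probabilities and the result would be biased.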


Conditioned event A chance event, the prob- 
ability of which is affected by another chance 
event (the conditioning event). 


Conditioning event A chance event that 
affects the probability of occurrence of 
another chance event (the conditioned event). 


Confidentiality The ability of data owners 
and data stewards to control access to or 
release of private information. 


Consistency check A procedure applied to 
entered data that detects errors based on inter- 
nal inconsistencies; e.g., recognizing a problem 
with the recording of cancer of the prostate as 
the diagnosis for a female patient. 


Constructivism Argues that humans generate 
knowledge and meaning from an interaction 
between their experiences and their ideas. 


Constructivist Argues that humans generate 
knowledge and meaning from an interaction 
between their experiences and their ideas. 


Consumer health informatics (CHI) Applications
of medical informatics technologies
that focus on patients or healthy individuals 
as the primary users. 


Content In information retrieval, media 
developed to communicate information or 
knowledge. 


Content based image retrieval The application
of computer vision techniques to the image
retrieval problem, that is, the problem of
searching for digital images in large databases.
Also known as query by image content (QBIC)
and content-based visual information retrieval
(CBVIR).


Context free grammar A mathematical 
model of a set of strings whose members are 
defined as capable of being generated from 
a starting symbol, using rules in which a 
single symbol is expanded into one or more 
symbols.


Contingency table A 2 x 2 table that shows
the relative frequencies of true-positive, true- 
negative, false-positive, and false-negative 
results. 
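
The four cells of such a table are enough to derive the usual test characteristics. The following is a minimal sketch; the function name and the counts are made up for illustration.

```python
def summarize_table(tp, fp, fn, tn):
    """Derive test characteristics from the four cells of a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),          # true-positive rate
        "specificity": tn / (tn + fp),          # true-negative rate
        "false_negative_rate": fn / (tp + fn),  # 1 - sensitivity
        "false_positive_rate": fp / (tn + fp),  # 1 - specificity
    }


# Hypothetical counts: 100 diseased and 100 non-diseased patients.
summary = summarize_table(tp=90, fp=20, fn=10, tn=80)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```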


Continuity of care The coordination of care 
received by a patient over time and across 
multiple healthcare providers. 


Continuity of Care Document (CCD) An HL7 
standard that enables specification of the patient 
data that relate to one or more encounters with 
the healthcare system. The CCD is used for 
interchange of patient information (e.g., within 
Health Information Exchanges). The format 
enables all the electronic information about a 
patient to be aggregated within a standardized 
data structure that can be parsed and inter- 
preted by a variety of information systems. 


Continuous glucose monitor (CGM) A device 
that automatically tracks a diabetic patient’s 
blood glucose levels throughout the day and 
night using a tiny sensor inserted under the 
skin. 


Continuum of care The full spectrum of 
health services provided to patients, including 
health maintenance, primary care, acute care, 
critical care, rehabilitation, home care, skilled 
nursing care, and hospice care. 


Contract-management system A computer 
system used to support managed-care con- 
tracting by estimating the costs and payments 
associated with potential contract terms and 
by comparing actual with expected payments 
based on contract terms. 


Contrast The difference in light intensity 
between dark and light areas of an image. 


Contrast resolution A metric for how well an 
imaging modality can distinguish small differ- 
ences in signal intensity in different regions of 
the image. 


Control intervention In the context of clini- 
cal research, a control intervention repre- 
sents the intervention (e.g. placebo, standard 
care, etc.) given to the group of study participants
assigned to the control or comparator
arm of a study. Depending on the study
type, the goal is to generate data as the basis 
of comparison with the experimental inter- 
vention of interest in order to determine the 
safety, efficacy, or benefits of an experimen- 
tal intervention. 


Controlled terminology A finite, enumerated 
set of terms intended to convey information 
unambiguously. 


Copyright law Protection of written materials 
and intellectual property from being copied 
verbatim. 


Coreference chains Provide a compact 
representation for encoding the words and 
phrases in a text that all refer to the same 
entity. 


Coreference resolution In natural language 
processing, the assignment of specific mean- 
ing to some indirect reference. 


Correctional Telehealth The application of 
telehealth to the care of prison inmates, where 
physical delivery of the patient to the practi- 
tioner is impractical. 


Covered entities Under the HIPAA Privacy 
Rule, a covered entity is an organization or 
individual that handles personal health infor- 
mation. Covered entities include providers, 
health plans, and clearinghouses. 


COVID-19 A disease that was identified in late 
2019 and was declared a global pandemic on 
March 11, 2020. COVID-19 became an inter- 
national public health emergency, affecting 
essentially all countries on the planet. It is 
characterized by contagion before symptoms, 
high rate of transmission between human 
beings, variable severity among affected indi- 
viduals, and relatively high mortality rate. 


CPOE See: Computer-based (or Compu- 
terized) Physician (or Provider) Order Entry. 


CPR (or CPRS) See: Computer-based patient 
records.


CPU See: Central processing unit.
CRI See: Clinical research informatics.


CRMS (or CRDMS) See: Clinical Research
Management System.


Cryptographic encoding Scheme for pro- 
tecting data through authentication and 
authorization controls based on use of keys 
for encrypting and decrypting information. 


CT (or CAT) See: Computed tomography. 


Cumulative Index to Nursing and Allied Health 
Literature (CINAHL) A non-NLM bibliographic
database that covers nursing and allied health
literature, including physical therapy, occupa- 
tional therapy, laboratory technology, health 
education, physician assistants, and medical 
records. 


Curly Braces Problem The situation that 
arises in Arden Syntax where the code used to 
enumerate the variables required by a medi- 
cal logic module (MLM) cannot describe 
how the variables actually derive their values 
from data in the EHR database. Each variable 
definition in an MLM has {curly braces} that 
enclose words in natural language that indi- 
cate the meaning of the corresponding vari- 
able. The particular database query required 
to supply a value for the variable must be 
specified by the local implementer, however. 
The curly braces problem makes it impossible 
for an MLM developed at one institution to 
operate at another without local modification. 


Cursor A blinking region of a display moni- 
tor, or a symbol such as an arrow, that indi- 
cates the currently active position on the 
screen. 


Cybersecurity Measures that seek to protect
against the criminal or unauthorized use of 
electronic data. 


CYK See: Cocke-Younger-Kasami. 


Dashboard A user-interface element that displays
data produced by several computer programs
simultaneously and that allows users to
interact with those programs in standardized
ways.


Data buses An electronic pathway for trans- 
ferring data—for instance, between a CPU 
and memory. 


Data capture The process of collecting data 
to be stored in an information system; it 
includes entry by a person using a keyboard 
and collection of data from sensors. 


Data Encryption Standard (DES) A widely used
method for securing data that
uses a private (secret) key for encryption and 
requires the same key for decryption (see also, 
public key cryptography). 


Data independence The insulation of appli- 
cations programs from changes in data- 
storage structures and data-access strategies. 


Data layer A conceptual level of system architecture
that isolates the data collected and
stored in the enterprise from the applications 
and user interfaces used to access those data. 


Data Recording The documentation of infor- 
mation for archival or future use through 
mechanisms such as handwritten text, draw- 
ings, machine-generated traces, or photo- 
graphic images. 


Data science The field of study that uses 
analytic, quantitative, and domain exper- 
tise for knowledge discovery, typically using 
“big data” which could be structured and/or 
unstructured. 


Database A collection of stored data—typi- 
cally organized into fields, records, and files— 
and an associated description (schema). 


Database management system (DBMS) An 
integrated set of programs that manages 
access to databases. 


Data-interchange standards Adopted for- 
mats and protocols for exchange of data 
between independent computer systems.


Datum Any single observation of fact. A
medical datum generally can be regarded as 
the value of a specific parameter (for example, 
red-blood-cell count) for a particular object 
(for example, a patient) at a given point in 
time. 


DBMS See: Database Management System.
DCMI See: Dublin Core Metadata Initiative. 


Debugger A system program that provides 
traces, memory dumps, and other tools to 
assist programmers in locating and eliminat- 
ing errors in their programs. 


Decision analysis A methodology for mak- 
ing decisions by identifying alternatives and 
assessing them with regard to both the likeli- 
hood of possible outcomes and the costs and 
benefits of those outcomes. 


Decision node A symbol that represents a 
choice among actions. By convention, a deci- 
sion node is represented in a decision tree by 
a square. 


Decision support The process of assisting 
humans in making decisions, such as inter- 
preting clinical information or choosing a 
diagnostic or therapeutic action. See: Clinical 
Decision Support. 


Decision tree A diagrammatic representa- 
tion of the outcomes associated with chance 
events and voluntary actions. 


Deductive reasoning A process of reaching
specific conclusions (e.g., a diagnosis)
from a hypothesis or a set of hypotheses.
Deductive logic helps in building up the
consequences of each hypothesis, and this
kind of reasoning is customarily regarded
as a common way of evaluating diagnostic
hypotheses.


De-duplicate/Deduplication The process that 
matches, links, and/or merges data to eliminate
redundancies.


De-identified aggregate data Data reports
that are summarized or altered slightly in a 
way that makes the discernment of the iden- 
tity of any of the individuals whose data was 
used for the report impossible or so difficult 
as to be extremely improbable. The process of 
de-identifying aggregate data is known as sta- 
tistical disclosure control. 


Delta check A procedure applied to entered 
data that detects large and unlikely differences 
between the values of a new result and of the 
previous observations; e.g., a recorded weight 
that changes by 100 lb in 2 weeks.
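
A delta check reduces to comparing the new value with the most recent previous one. In the sketch below, the function name and the 20 lb threshold are hypothetical; a real system would choose limits per measurement type and time interval.

```python
def passes_delta_check(previous, current, max_change):
    """Return True when the new value is plausibly close to the previous one."""
    return abs(current - previous) <= max_change


# Hypothetical threshold: flag weight changes above 20 lb between visits.
MAX_WEIGHT_CHANGE_LB = 20

print(passes_delta_check(180, 184, MAX_WEIGHT_CHANGE_LB))  # small change
print(passes_delta_check(180, 280, MAX_WEIGHT_CHANGE_LB))  # 100 lb jump
```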


Demonstration study Study that establishes 
a relation—which may be associational or cau- 
sal—between a set of measured variables. 


Dental informatics The application of bio- 
medical informatics methods and techniques 
to problems derived from the field of dentistry. 
Viewed as a subarea of clinical informatics. 


Deoxyribonucleic acid (DNA) The genetic 
material that is the basis for heredity. DNA 
is a long polymer chemical made of four 
basic subunits. The sequence in which these 
subunits occur in the polymer distinguishes 
one DNA molecule from another and in turn 
directs a cell’s production of proteins and all 
other basic cellular processes. 


Department of Health and Human Services 
(DHHS) A federal agency that provides the public with health
information and promotes health through 
partnerships with state health departments 
and other organizations. It is the federal agency 
charged with protecting the health and safety 
of U.S. citizens, both at home and abroad. It 
also oversees the development and application 
of programs for disease prevention and con- 
trol, environmental health, and health promo- 
tion and education. » http://www.cdc.gov/. 


Departmental system A system that focuses on
a specific niche area in the healthcare setting, 
such as a laboratory, pharmacy, radiology 
department, etc.


Dependency grammar A linguistic theory of
syntax that is based on dependency relations 
between words, where one word in the sentence 
is independent and other words are dependent 
on that word. Generally, the verb of a sentence 
is independent and other words are directly or 
indirectly dependent on the verb. 


Dependent variable (also called outcome 
variable) In a correlational or experimental 
study, the main variable of interest or out- 
come variable, which is thought to be affected 
by or associated with the independent vari- 
ables (q.v.). 


Derivational morphemes A morpheme that
changes the meaning or part of speech of
a word (e.g., -ful as in painful, converting a
noun to an adjective).


DES See: Data Encryption Standard. 


Descriptive study One-group study that seeks 
to measure the value of a variable in a sample 
of subjects. Study with no independent vari- 
able. 


Design validation A study conducted to 
inform the design of an information resource, 
e.g., a user survey. 


DHHS See: Department of Health and 
Human Services. 


Diagnosis The process of analyzing avail- 
able data to determine the pathophysiologic 
explanation for a patient’s symptoms. 


Diagnosis-based reimbursement Payments 
to providers (typically hospitals) based on the 
diagnosis made by a physician at the time of 
admission. 


Diagnosis-related group (DRG) One of 
almost 500 categories based on major diag- 
nosis, length of stay, secondary diagnosis, 
surgical procedure, age, and types of ser- 
vices required. Used to determine the fixed 
payment per case that Medicare will reim- 
burse hospitals for providing care to elderly 
patients. 


Diagnostic decision-support system A computer-based
system that assists physicians
in rendering diagnoses; a subset of clinical 
decision-support systems. See clinical decision 
support system. 


Diagnostic process The activity of deciding 
which questions to ask, which tests to order, 
or which procedures to perform, and deter- 
mining the value of the results relative to 
associated risks or financial costs. 


DICOM See: Digital Image Communications 
in Medicine. 


Dictionary A set of terms representing the 
system of concepts of a particular subject 
field. 


Differential diagnosis The set of active 
hypotheses (possible diagnoses) that a physi- 
cian develops when determining the source of 
a patient’s problem. 


Digital computer A computer that processes 
discrete values based on the binary digit or 
bit. Essentially all modern computers are 
digital, but analog computers also existed in 
the past. 


Digital divide Term referring to disparity 
in economic access to technology between 
“haves” and “have-nots”. 


Digital image An image that is stored as a 
grid of numbers, where each picture element 
(pixel) in the grid represents the intensity, and 
possibly color, of a small area. 


Digital Image Communications in Medicine 
(DICOM) A standard for electronically exchanging
digital health images, such as x-rays and
CT scans. 


Digital library Organized collections of elec- 
tronic content, intended for specific commu- 
nities or domains. 


Digital object identifier (DOI) A system for 
providing unique identifiers for published 
digital objects, consisting of a prefix that is
assigned by the International DOI Foundation
to the publishing entity and a suffix that is 
assigned and maintained by the entity. 


Digital radiography (DR) The process of pro- 
ducing X-ray images that are stored in digi- 
tal form in computer memory, rather than on 
film. 


Digital signal A signal that takes on discrete 
values from a specified range of values. 


Digital signal processing (DSP) An integrated 
circuit designed for high-speed data manipu- 
lation and used in audio communications, 
image manipulation, and other data acquisi- 
tion and control applications. 


Digital subscriber line (DSL) A digital tele- 
phone service that allows high-speed network 
communication using conventional (twisted 
pair) telephone wiring. 


Digital subtraction angiography (DSA) A 
radiologic technique for imaging blood ves- 
sels in which a digital image acquired before 
injection of contrast material is subtracted 
pixel by pixel from an image acquired after 
injection. The resulting image shows only the 
differences in the two images, highlighting 
those areas where the contrast material has 
accumulated. 
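
The pixel-by-pixel subtraction can be sketched with plain nested lists standing in for images. The function name and the tiny 2 x 3 "images" are illustrative assumptions; real systems operate on full-resolution arrays and also handle registration and noise.

```python
def subtract_images(pre_contrast, post_contrast):
    """Subtract the pre-injection image from the post-injection image, pixel by pixel."""
    return [
        [post - pre for pre, post in zip(pre_row, post_row)]
        for pre_row, post_row in zip(pre_contrast, post_contrast)
    ]


# Toy images: the middle column brightens after contrast injection (a "vessel").
pre = [[10, 10, 10],
       [10, 10, 10]]
post = [[10, 50, 10],
        [10, 55, 10]]

difference = subtract_images(pre, post)
print(difference)  # only the pixels that changed are non-zero
```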


Direct entry The entry of data into a com- 
puter system by the individual who personally 
made the observations. 


Discharge Plan A plan that supports the 
transition of a patient from one care facility 
to home or another care facility and includes 
evaluation of the patient by qualified per- 
sonnel, discussion with the patient or his 
representative, planning for homecoming or 
transfer to another care facility, determining 
whether caregiver training or other support 
is needed, referrals to a home care agency 
and/or appropriate support organizations in 
the community, and arranging for follow-up 
appointments or tests.


Discourse Large portions of text forming a
narrative, such as paragraphs and documents. 


Discrete event simulation model A modeling 
approach that assesses interactions between 
people, typically composed of patients 
that have attributes and that experience events. 


Discussion board An on-line environment for 
exchanging public messages among partici- 
pants. 


Discussion lists and messaging boards Online 
tools for asynchronous text conversation. 


Disease Any condition in an organism that is 
other than the healthy state. 


Dissemination phase During the dissemina- 
tion phase of a clinical research study, inves- 
tigators analyze and report upon the data 
generated during the active phase. 


Distributed cognition A view of cognition 
that considers groups, material artifacts, and 
cultures and that emphasizes the inherently 
social and collaborative nature of cognition. 


Distributed computer systems A collection 
of independent computers that share data, 
programs, and other resources. 


DNA See: Deoxyribonucleic Acid. 
DNS See: Domain name system. 


Document structure The organization of text 
into sections. 


DOI See: Digital object identifier. 


Domain name system (DNS) A hierarchical 
name-management system used to translate 
computer names to Internet protocol (IP) 
addresses. 


Doppler shift A perceived change in fre- 
quency of a signal as the signal source moves 
toward or away from a signal receiver.


Double blind A clinical study methodology in
which neither the researchers nor the subjects 
know to which study group a subject has been 
assigned. 


Double-blinded study In the context of 
clinical research, a double-blinded study is
a study in which both the investigator and
participant are blinded to the assignment
of an intervention. In this scenario, a trusted 
third party must maintain records of such 
study arm assignments to inform later data 
analyses. 


Draft standard for trial use A proposal for 
a standard developed by HL7 that is suffi- 
ciently well defined that early adopters can 
use the specification in the development of 
HIT. Ultimately, the draft standard may be 
refined and put to a ballot for endorsement by 
the members of the organization, thus creat- 
ing an official standard. 


DRG See: Diagnosis-related group.


Drug repurposing Identifying existing drugs 
that may be useful for indications other 
than those for which they were initially 
approved. 


Drug screening robots A scientific instrument 
that can perform assays with potential drugs 
in a highly parallel and high throughput man- 
ner. 


Drug-genome interaction A relationship 
between a drug and a gene in which the gene 
product affects the activity of the drug or 
the drug influences the transcription of the 
gene. 


DSA See: Digital subtraction angiography. 
DSL See: Digital subscriber line. 

DSP See: Digital signal processing. 

Dublin Core Metadata Initiative (DCMI) A
standard metadata model for indexing published
documents.


Dynamic A simulation program that mod- 
els changes in patient state over time and in 
response to students’ therapeutic decisions. 


Dynamic programming A computationally 
intensive computer-science technique used, 
for example, to determine optimal sequence 
alignments in many computational biology 
applications. 
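
As one illustration of the technique, the sketch below fills a score table for global alignment of two sequences, in the style usually attributed to Needleman and Wunsch. The scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions for the example; it returns only the optimal score, not the alignment itself.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of sequences a and b via dynamic programming."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell reuses the already-solved subproblems to its left, top, and diagonal.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


print(alignment_score("ACGT", "ACT"))
```

The table has (len(a) + 1) x (len(b) + 1) cells, which is why the approach is computationally intensive for long sequences.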


Dynamic transmission model A model that
divides a population into compartments (for
example, uninfected, infected, recovered,
dead), and in which transitions between
compartments are governed by differential or
difference equations.
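
A minimal sketch of such a model, using difference equations and only three compartments (susceptible, infected, recovered), is shown below. The function name, the parameter names beta and gamma, and all numeric values are assumptions for illustration; a model with a "dead" compartment would add one more transition.

```python
def simulate_sir(susceptible, infected, recovered, beta, gamma, steps):
    """Step a three-compartment model forward with difference equations.

    beta:  hypothetical transmission rate per time step
    gamma: hypothetical recovery rate per time step
    """
    population = susceptible + infected + recovered
    history = [(susceptible, infected, recovered)]
    for _ in range(steps):
        new_infections = beta * susceptible * infected / population
        new_recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        recovered += new_recoveries
        history.append((susceptible, infected, recovered))
    return history


history = simulate_sir(990.0, 10.0, 0.0, beta=0.3, gamma=0.1, steps=100)
peak_infected = max(state[1] for state in history)
print(f"peak infected: {peak_infected:.1f}")
```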


Dynamical systems models Models that 
describe and predict the interactions over time 
between multiple components of a phenom- 
enon that are viewed as a system. Dynamical 
systems models are often used to construct 
“controllers,” algorithms that adjust function- 
ing of the system (an airplane, artificial pan- 
creas, etc.) to maximize a set of optimization 
criteria. 


Earley parsing A dynamic programming 
method for parsing context-free grammar. 


EBM See Evidence-Based Medicine. 


EBM database For Evidence-Based Medicine 
database, a highly organized collection of 
clinical evidence to support medical decisions 
based on the results of controlled clinical 
trials. 


Ecological momentary assessment (EMA) A 
range of methods for collecting ecologically- 
valid self-report by enabling study partici- 
pants and patients to report their experiences 
in real-time, in real-world settings, over time 
and across contexts. 


eCRF See: Electronic Case Report Form. 
EDC See: Electronic Data Capture. 


EDI See: Electronic data interchange.
EEG See: Electroencephalography.
EHR See: Electronic health record. 
EIW See: Enterprise information warehouse. 


Electroencephalography (EEG) A method for 
measuring the electromagnetic fields generated 
by the electrical activity of the neurons using 
a large array of scalp sensors, the output of
which is processed to localize the source of
the electrical activity inside the brain. 


Electronic Case Report Form (eCRF) A com- 
putational representation of paper case report 
forms (CRFs) used to enable EDC. 


Electronic Data Capture (EDC) EDC is the 
process of capturing study-related data ele- 
ments via computational mechanisms. 


Electronic Data interchange (EDI) Electronic 
exchange of standard data transactions, such 
as claims submission and electronic funds 
transfer. 


Electronic Health Record (EHR) A repository 
of electronically maintained information 
about an individual’s lifetime health status 
and health care, stored such that it can serve 
the multiple legitimate users of the record. See 
also EMR and CPR. 


Electronic health record system An elec- 
tronic health record and the tools used 
to manage the information; also referred 
to as a computer-based patient-record 
system and often shortened to electronic 
health record. 


Electronic Medical Record (EMR) The elec- 
tronic record documenting a patient’s care in 
a provider organization such as a hospital or 
a physician’s office. Often used interchange- 
ably with Electronic Health Record (EHR), 
although EHRs refer more typically to an 
individual’s lifetime health status and care 
rather than the set of particular organizationally
based experiences.


Electronic Medical Records and Genomics
(eMERGE) network A network of academic 
institutions that is exploring the capabilities 
of EHRs for genomic discovery and imple- 
mentation. 


Electronic-long, paper-short (ELPS) A publi- 
cation method which provides on the Web site 
supplemental material that did not appear in 
the print version of the journal. 


ELPS See: Electronic-long, paper-short. 


EMBASE A commercial biomedical and pharmacological
database from Excerpta Medica,
which provides information about medical 
and drug-related subjects. 


Emergent design Study where the design or 
plan of research can and does change as the 
study progresses. Characteristic of subjectiv- 
ist studies. 


Emotion detection A natural language tech- 
nique for determining the mental state of the 
author of a text document. 


EMPI See: Enterprise master patient index.


EMR (or EMRS) See: Electronic Medical
Record.


EMTREE A hierarchically structured, con- 
trolled vocabulary used for subject indexing, 
used to index EMBASE. 


Enabling technology Any technology that 
improves organizational processes through 
its use rather than on its own. Computers, 
for example, are useless unless “enabled” by 
operating systems and applications or implemented
in support of work flows that might
not otherwise be possible. 


Encryption The process of transforming 
information such that its meaning is hidden, 
with the intent of keeping it secret, such that 
only those who know how to decrypt it can 
read it; see decryption.


Endophenotypes An observable characteris-
tic that is tightly linked to underlying genetics 
and less dependent on environmental expo- 
sures or chance. 


Enrichment analysis A statistical method to 
determine whether an a priori defined set of 
concepts shows statistically significant over- 
representation in descriptions of a set of items 
(such as genes) compared to what one would 
expect based on their frequency in a reference 
distribution. 
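
One common way to carry out such an analysis is a one-sided hypergeometric test, sketched below. The definition does not name a specific test, so this choice, the function name, and the example counts are assumptions.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, overlap):
    """P(X >= overlap) for X drawn from a hypergeometric distribution.

    population: number of items in the reference set (e.g., all genes)
    annotated:  reference items carrying the concept of interest
    sample:     size of the item set under study
    overlap:    items in the study set that carry the concept
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    favorable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / total


# Hypothetical counts: 40 of 1000 reference genes carry the annotation,
# and 8 of the 50 genes in the study set do.
print(enrichment_p_value(1000, 40, 50, 8))
```

A small value suggests over-representation relative to the reference distribution; in practice the result would also be corrected for testing many concepts at once.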


Enrollment During the enrollment phase of a clinical
research study, potential participants are identified
and research staff determine their eligibility
for participation in a study, based upon
the eligibility criteria described in the study
protocol. If a participant is deemed eligible
to participate, they are then officially “registered”
for the study. It is during this phase that
in some trial designs, a process of randomization
and assignment to study arms occurs.


Enterprise information warehouse (EIW) A 
database in which data from clinical, financial,
and other operational sources are collected
in order to be compared and contrasted
across the enterprise. 


Enterprise master patient index (EMPI) An 
architectural component that serves as the 
name authority in a health-care informa- 
tion system composed of multiple indepen- 
dent systems; the EMPI provides an index 
of patient names and identification num- 
bers used by the connected information 
systems. 


Entrez A search engine from the National 
Center for Biotechnology Information 
(NCBI), at the National Library of 
Medicine; Entrez can be used to search a 
variety of life sciences databases, including 
PubMed. 


Entry term A synonym form for a subject 
heading in the Medical Subject Headings 
(MeSH) controlled, hierarchical vocabulary. 


Epidemiologic Related to the field of epide- 
miology. 


Epidemiology The study of the patterns, 
causes, and effects of health and disease con- 
ditions in defined populations. 


Epigenetics Heritable phenotypes that are 
not encoded in DNA sequence. 


Epigenomics The study of heritable pheno- 
types that are not encoded in the organism’s
DNA. 


e-prescribing The electronic process of gen- 
erating, transmitting and filling a medical pre- 
scription. 


Error analysis In natural language processing, 
a process for determining the reasons for 
false-positive and false-negative errors. 


Escrow Use of a trusted third party to hold 
cryptographic keys, computer source code, or 
other valuable information to protect against 
loss or inappropriate access. 


Ethernet A network standard that uses a bus 
or star topology and regulates communica- 
tion traffic using the Carrier Sense Multiple 
Access with Collision Detection (CSMA/CD) 
approach. 


Ethnography Set of research methodologies 
derived primarily from social anthropology. 
The basis of much of the subjectivist, qualita- 
tive evaluation approaches. 


ETL See: Extract, Transform, and Load. 


Evaluation contract A document describing 
the aims of a study, the methods to be used 
and resources made available, usually agreed 
between the evaluator and key stakeholders 
before the study begins. 


Event-Condition-Action (ECA) rule A rule that 
requires some event (such as the availability 
of a new data value in the database) to cause 
the condition (premise, or left-hand side) of 
the rule to be evaluated. If the condition is 
determined to be true, then some action is 
performed. Such rules are commonly found in 
active database systems and form the basis of 
medical logic modules. 
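
The event, condition, and action parts of such a rule can be sketched as follows. This is a generic illustration, not Arden Syntax; the class, the event name, and the potassium threshold of 3.0 are hypothetical.

```python
class EcaRule:
    """A rule with an evoking event, a condition to test, and an action to run."""

    def __init__(self, event, condition, action):
        self.event = event
        self.condition = condition
        self.action = action


def dispatch(rules, event, data):
    """Evaluate every rule evoked by `event`; run the action when its condition holds."""
    results = []
    for rule in rules:
        if rule.event == event and rule.condition(data):
            results.append(rule.action(data))
    return results


# Hypothetical rule: alert when a newly stored potassium value is below 3.0.
low_potassium = EcaRule(
    event="new_lab_result",
    condition=lambda d: d["test"] == "potassium" and d["value"] < 3.0,
    action=lambda d: f"ALERT: low potassium ({d['value']} mmol/L)",
)

print(dispatch([low_potassium], "new_lab_result", {"test": "potassium", "value": 2.7}))
```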


Evidence-based medicine (EBM) An approach
to medical practice whereby the best possible
evidence from the medical literature is
incorporated in decision making. Generally
such evidence is derived from controlled clinical
trials.


Exabyte 10^18 bytes.


Exome The entire sequence of all genes 
within a genome, approximately 1-3% of the 
entire genome. 


Expected value The value that is expected on 
average for a specified chance event or deci- 
sion. 
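
For a chance event with a finite set of outcomes, the expected value is the probability-weighted sum of the outcome values. The sketch below assumes the probabilities sum to 1; the function name and the numbers are invented for illustration.

```python
def expected_value(outcomes):
    """Probability-weighted average of (probability, value) pairs."""
    return sum(probability * value for probability, value in outcomes)


# Hypothetical decision: 90% chance of 20 further life-years, 10% chance of none.
treatment = [(0.9, 20.0), (0.1, 0.0)]
print(expected_value(treatment))
```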


Experimental intervention In the context of 
clinical research, an experimental interven- 
tion represents the treatment or other inter- 
vention delivered to a participant assigned to 
the experimental arm of the study in order to 
determine the safety, efficacy, or benefits of 
that intervention. 


Experimental science Systematic study 
characterized by posing hypotheses, design- 
ing experiments, performing analyses, and 
interpreting results to validate or disprove 
hypotheses and to suggest new hypotheses 
for study. 


Extensible markup language (XML) A sub- 
set of the Standard Generalized Markup 
Language (SGML) from the World Wide 
Web Consortium (W3C), designed espe- 
cially for Web documents. It allows design- 
ers to create their own custom-tailored tags, 
enabling the definition, transmission, vali- 
dation, and interpretation of data between 
applications and between organizations. 
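
A short sketch of custom-tailored tags being interpreted by a receiving application, using Python's standard xml.etree.ElementTree module. The tag and attribute names (labResult, test, value, units) are invented for this example and do not come from any published schema. Note that parsing checks only that the document is well formed; validation against a schema would be a separate step.

```python
import xml.etree.ElementTree as ET

# Hypothetical custom tags, made up for this example.
DOCUMENT = (
    '<labResult patient="P1">'
    '<test code="K">Potassium</test>'
    '<value units="mmol/L">4.2</value>'
    '</labResult>'
)

root = ET.fromstring(DOCUMENT)          # raises ParseError if not well formed
test_name = root.find("test").text      # text content of the <test> element
value = float(root.find("value").text)  # numeric content of the <value> element
units = root.find("value").get("units") # attribute carried on the custom tag
```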


External router A computer that resides on 
multiple networks and that can forward and 
translate message packets sent from a local 
or enterprise network to a regional network 
beyond the bounds of the organization. 


External validity In the context of clinical 
research, external validity refers to the abil-
ity to generalize study results into clinical
care. 


Extract, Transform, and Load (ETL) ETL is the 
process by which source data is collected and 
manipulated so as to adhere to the structure 
and semantics of a receiving data construct, 
such as a data warehouse. 
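
The three steps can be sketched with the Python standard library alone. Everything specific here is an assumption for illustration: the source columns, the target table named patient, and the choice of an in-memory SQLite database standing in for a data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source extract: sex is inconsistently cased, weight is in pounds.
SOURCE_CSV = "patient_id,sex,weight_lb\n1,M,154\n2,f,132\n"


def extract(text):
    """Extract: read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: coerce types, normalize codes, convert pounds to kilograms."""
    return [
        (int(r["patient_id"]), r["sex"].upper(),
         round(float(r["weight_lb"]) * 0.45359237, 1))
        for r in rows
    ]


def load(rows, conn):
    """Load: write the conformed rows into the receiving table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS patient "
        "(patient_id INTEGER PRIMARY KEY, sex TEXT, weight_kg REAL)"
    )
    conn.executemany("INSERT INTO patient VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
loaded = conn.execute(
    "SELECT patient_id, sex, weight_kg FROM patient ORDER BY patient_id"
).fetchall()
```

The transform step is where both structure (types, column names) and semantics (code values, units) are made to match the receiving construct.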


Extrinsic evaluation An evaluation of a com- 
ponent of a system based on an evaluation of 
the performance of the entire system. 


F measure A measure of overall accuracy 
that is a combination of precision and recall. 
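
One common way to combine the two is the weighted harmonic mean of precision and recall, often written F-beta. The sketch below assumes that formulation; the parameter name beta and the choice to return 0.0 when both inputs are zero are conventions adopted here, not details given in this glossary.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (beta=1 weights them equally)."""
    if precision == 0 and recall == 0:
        return 0.0  # convention chosen here to avoid division by zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical search result: half of the returned items are relevant (precision 0.5)
# and every relevant item was returned (recall 1.0).
f1 = f_measure(0.5, 1.0)
```

Because the harmonic mean is dominated by the smaller value, a system cannot score well by maximizing recall alone or precision alone.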


Factual knowledge Knowledge of facts with- 
out necessarily having any in-depth under- 
standing of their origin or implications. 


False negative A negative result that occurs 
in a true situation. Examples include a desired 
entity that is missed by a search routine or a 
test result that appears normal when it should 
be abnormal. 


False positive A positive result that occurs in 
a false situation. Examples include an inap- 
propriate entity that is returned by a search 
routine or a test result that appears abnormal 
when it should be normal. 


False-negative rate (FNR) The probability of a 
negative result, given that the condition under 
consideration is true—for example, the proba- 
bility of a negative test result in a patient who 
has the disease under consideration. 


False-negative result (FN) A negative result 
when the condition under consideration is 
true—for example, a negative test result in a 
patient who has the disease under consider- 
ation. 


False-positive rate (FPR) The probability of 
a positive result, given that the condition 
under consideration is false—for example, 
the probability of a positive test result in a 
patient who does not have the disease under 
consideration. 


False-positive result (FP) A positive result
when the condition under consideration is 
false—for example, a positive test result in a 
patient who does not have the disease under 
consideration. 
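
The false-negative and false-positive rates can be estimated from the four cells of a two-by-two table of test results against true condition. The counts below are hypothetical and the function name is an assumption for illustration.

```python
def error_rates(tp, fp, tn, fn):
    """Return (FNR, FPR) from counts of true/false positives and negatives."""
    fnr = fn / (fn + tp)   # negative results among those who have the condition
    fpr = fp / (fp + tn)   # positive results among those who do not
    return fnr, fpr


# Hypothetical study: 100 patients with the disease (90 test positive, 10 negative)
# and 200 without it (20 test positive, 180 negative).
fnr, fpr = error_rates(tp=90, fp=20, tn=180, fn=10)
```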


Fast Healthcare Interoperability Resources
(FHIR) An HL7 standard for information 
exchange using a well-defined and limited set 
of resources. 


FDDI See: Fiber Distributed Data Interface. 


Feedback In a computer-based education 
program, system-generated responses, such as 
explanations, summaries, and references, pro- 
vided to further a student’s progress in learn- 
ing. 


Fee-for-service Unrestricted system of health 
care reimbursement in which payers pay 
providers for those services the providers have
deemed necessary.


Fiber Distributed Data Interface (FDDI) A
transmission standard for local area networks 
operating on fiberoptic cable, providing a 
transmission rate of 100 Mbit/s. 


Fiberoptic cable A communication medium 
that uses light waves to transmit information 
signals. 


Fiducial An object used in the field of view of 
an imaging system which appears in the image 
produced, for use as a point of reference or a 
measure. 


Field In science, the setting, which may be 
multiple physical locations, where the work 
under study is carried out. In database design, 
the smallest named unit of data in a database. 
Fields are grouped together to form records. 


Field function study Study of an informa- 
tion resource where the system is used in the 
context of ongoing health care. Study of a 
deployed system (cf. Laboratory study). 


Field user effect study A study of the actual 
actions or decisions of the users of the 
resource. 


File In a database, a collection of similar 
records. 


File format Representation of data within 
a file; can refer to the method for individual 
characters and values (for example, ASCII or 
binary) or their organization within the file 
(for example, XML or text). 


File server A computer that is dedicated to 
storing shared or private data files. 


File system An organization of files within a 
database or on a mass storage device. 


Filtering algorithms A defined procedure 
applied to input data to reduce the effect of 
noise. 


Finite state automaton An abstract, 
computer-based representation of the state of 
some entity together with a set of actions that 
can transform the state. Collections of finite 
state automata can be used to model complex 
systems. 
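
A minimal sketch of a finite state automaton as a transition table: each (state, action) pair maps to a new state. The entity modeled (a laboratory order) and its state and action names are hypothetical.

```python
# (current state, action) -> next state
TRANSITIONS = {
    ("ordered", "collect"): "collected",
    ("collected", "analyze"): "resulted",
    ("resulted", "verify"): "verified",
}


def run(state, actions):
    """Apply each action in turn; reject any action not allowed in the current state."""
    for action in actions:
        key = (state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"action {action!r} not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state


final_state = run("ordered", ["collect", "analyze", "verify"])
```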


Firewall A security system intended to pro-
tect an organization’s network against exter- 
nal threats by preventing computers in the 
organization’s network from communicating 
directly with computers external to the net- 
work, and vice versa. 


Flash memory card A portable electronic 
storage medium that uses a semiconductor 
chip with a standard physical interface; a 
convenient method for moving data between 
computers. 


Flexnerian A model of medical education built
on science-based acquisition
of medically relevant knowledge, followed
by on-the-job apprentice-style acquisition of 
experience, and accompanied by evolution 
and expansion of the curriculum to add new 
fields of knowledge. 


Floppy disk An inexpensive magnetic disk
that can be removed from the disk-drive 
unit and thereby used to transfer or archive 
files. 


FM See: Frequency modulation. 


fMRI See: Functional magnetic resonance 
imaging. 


FN See: False-negative result. 


Force feedback A user interface feature in 
which physical sensations are transmitted to 
the user to provide a tactile sensation as part of 
a simulated activity. See also Haptic feedback. 


Foreground question Question that asks 
for general information related to a specific 
patient (see also background question). 


Form factor Typically refers to the physi- 
cal dimensions of a product. With comput- 
ing devices, refers to the physical size of the 
device, often with specific reference to the 
display. For example, we would observe that 
the form factor of a desktop monitor is sig- 
nificantly larger than that of a tablet or smart 
phone, and therefore able to display more 
characters and larger graphics on the screen. 


Formative evaluation An assessment of 
a system’s behavior and capabilities con- 
ducted during the development process and 
used to guide future development of the sys- 
tem. 


Forward chaining Also known as data-driven 
reasoning. A form of inference used in rule- 
based systems in which the inference engine 
uses newly acquired (or concluded) values of 
variables to invoke all rules that may reference 
one or more of those variables in their prem- 
ises (left-hand side), thereby concluding new 
values for variables in the conclusions (right- 
hand side) of those rules. The process contin- 
ues recursively until all rules whose premises 
may reference the variables whose values 
become known have been considered. 
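
The following is a deliberately naive sketch of data-driven inference over simple true/false facts. It rescans all rules until nothing new can be concluded, rather than indexing rules by the variables they reference as a production inference engine might. The rule contents are hypothetical.

```python
def forward_chain(facts, rules):
    """rules: list of (premises, conclusion), where premises is a set of fact names."""
    known = set(facts)
    changed = True
    while changed:                      # repeat until no rule adds a new fact
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)   # newly concluded fact may fire other rules
                changed = True
    return known


# Hypothetical rules: premises (left-hand side) -> conclusion (right-hand side).
RULES = [
    ({"fever", "cough"}, "suspect_respiratory_infection"),
    ({"suspect_respiratory_infection", "infiltrate_on_xray"}, "suspect_pneumonia"),
]

conclusions = forward_chain({"fever", "cough", "infiltrate_on_xray"}, RULES)
```

The second rule fires only because the first rule's conclusion became known, which is the chaining behavior the definition describes.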


FP See: False-positive result. 


FPR See: False-positive rate.


Frame An abstract representation of a con- 
cept or entity that consists of a set of attri- 
butes, called slots, each of which can have one 
or more values to represent knowledge about 
the entity or concept. 
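
A frame can be sketched as a named object holding slots, each with one or more values. The class and the example slots below are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    name: str
    slots: dict = field(default_factory=dict)   # slot name -> list of values

    def add_value(self, slot, value):
        """Add a value to a slot, creating the slot if needed."""
        self.slots.setdefault(slot, []).append(value)

    def get(self, slot):
        """Return all values for a slot (empty list if the slot is unfilled)."""
        return self.slots.get(slot, [])


# Hypothetical frame for a disease concept.
pneumonia = Frame("Pneumonia")
pneumonia.add_value("finding", "fever")
pneumonia.add_value("finding", "cough")
pneumonia.add_value("body_site", "lung")
```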


Frame Relay A high-speed network protocol 
designed for sending digital information over 
shared wide-area networks using variable 
length packets of information. 


Free morpheme A morpheme that is a word 
and that does not contain another morpheme 
(e.g., arm, pain). 


Frequency modulation (FM) A signal repre-
sentation in which signal values are repre- 
sented as changes in frequency rather than 
amplitude. 


Front-end application A computer program 
that interacts with a database-management 
system to retrieve and save data and to accom- 
plish user-level tasks. 


Full-text content The complete textual infor- 
mation contained in a bibliographic source. 


Functional magnetic resonance imaging 
(fMRI) A magnetic resonance imaging method 
that reveals changes in blood oxygenation that 
occur following neural activity. 


Functional mapping An imaging method that 
relates specific sites on images to particular 
physiologic functions. 


Gateway A computer that resides on multiple 
networks and that can forward and translate 
message packets sent between nodes in net- 
works running different protocols. 

Gbps See: Gigabits per second. 

GEM See: Guideline Element Model. 


GenBank A centralized repository of protein, 
RNA, and DNA sequences in all species, cur-
rently maintained by the National Institutes
of Health. 


Gene expression microarray A technology
used to study the expression of large numbers
of genes in relation to one another and to
create multiple variations on a genetic theme
to explore the implications of changes in
genome function on human disease.


Gene Expression Omnibus (GEO) A central- 
ized database of gene expression microarray 
datasets. 


Gene Ontology (GO) A structured controlled
vocabulary used for annotating genes and pro- 
teins with molecular function. The vocabulary 
contains three distinct ontologies, Molecular 
Function, Biological Process and Cellular 
Component. 


Genes Units encoded in DNA that are
transcribed into ribonucleic acid (RNA).


Genetic data An overarching term used to 
label various collections of facts about the 
genomes of individuals, groups or species. 


Genetic risk score (GRS) A calculation of the 
likelihood of a particular phenotype being 
present based on a weighted score of one or
more genetic variants; also referred to as a 
polygenic risk score (PRS). 
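
One common form of such a score is a weighted sum: each variant's weight multiplied by the number of risk alleles the person carries. The sketch below assumes that form; the variant identifiers and weights are invented, and the result is a raw score that would still need calibration before being read as a likelihood.

```python
def genetic_risk_score(allele_counts, weights):
    """Sum of weight * risk-allele count (0, 1, or 2) over the scored variants."""
    return sum(w * allele_counts.get(variant, 0) for variant, w in weights.items())


# Hypothetical per-variant weights and one person's risk-allele counts.
weights = {"variant_A": 0.2, "variant_B": 0.5, "variant_C": 0.1}
counts = {"variant_A": 2, "variant_B": 1, "variant_C": 0}

score = genetic_risk_score(counts, weights)   # 0.2*2 + 0.5*1 + 0.1*0
```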


Genome-Wide Association Studies (GWAS) An 
examination of many common genetic vari- 
ants in different individuals to see if any 
variant is associated with a given trait, e.g., a 
disease. 


Genomic medicine (also known as stratified
medicine) The management of groups of
patients with shared biological characteris- 
tics, determined through molecular diagnostic 
testing, to select the best therapy in order to 
achieve the best possible outcome for a given 
group. 


Genomics The study of all of the nucleo- 
tide sequences, including structural genes, 
regulatory sequences, and noncoding DNA
segments, in the chromosomes of an organ-
ism. 


Genomics database An organized collec- 
tion of information from gene sequencing, 
protein characterization, and other genomic 
research. 


Genotype The genetic makeup, as distin- 
guished from the physical appearance, of an 
organism or a group of organisms. 


Genotypic Refers to the genetic makeup of 
an organism. 


GEO See: Gene Expression Omnibus. 


Geographic Information System (GIS) A sys- 
tem designed to capture, store, manipulate, 
analyze, manage, and visually present all types
of location-specific data. 


Gigabits per second (Gbps) A common unit 
of measure for data transmission over high- 
speed networks. 


Gigabyte 2^30 or 1,073,741,824 bytes.


GIS See: Geographic Information System.


Global processing Computations on the 
entire image, without regard to specific 
regional content. 


GO See: Gene Ontology. 


Gold-standard test The test or procedure 
whose result is used to determine the true state 
of the subject—for example, a pathology test 
such as a biopsy used to determine a patient’s 
true disease state. 


Google A commercial search engine that 
provides free searching of documents on the 
World Wide Web. 


GPS A system for calculating precise geo-
graphical location by triangulating informa- 
tion obtained from satellites and/or cell towers. 


GPU See: Graphics processing unit. 


Grammar A mathematical model of a poten- 
tially infinite set of strings. 


Graph In computer science, a set of nodes or 
circles connected by a set of edges or lines. 


Graphical user interface (GUI) A type of envi- 
ronment that represents programs, files, and 
options by means of icons, menus, and dialog 
boxes on the screen. 


Graphics processing unit (GPU) A computer 
hardware component that performs graphic 
displays and other highly parallel computa- 
tions. 


Gray scale A scheme for representing inten- 
sity in a black-and-white image. Multiple bits 
per pixel are used to represent intermediate 
levels of gray. 
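
The relationship between bits per pixel and the number of gray levels is a power of two, as this small sketch shows. The mapping of level 0 to black and the highest level to white is a convention assumed here.

```python
def gray_levels(bits_per_pixel):
    """Number of distinct intensity levels representable at the given bit depth."""
    if bits_per_pixel < 1:
        raise ValueError("bits_per_pixel must be at least 1")
    return 2 ** bits_per_pixel


def to_intensity(level, bits_per_pixel):
    """Map a stored level to an intensity from 0.0 (black) to 1.0 (white)."""
    return level / (gray_levels(bits_per_pixel) - 1)


levels_8bit = gray_levels(8)        # one byte per pixel
mid_gray = to_intensity(128, 8)     # roughly halfway between black and white
```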


Guardian Angel Proposal A proposed struc- 
ture for a lifetime, patient-centered health 
information system. 


GUI See: Graphical user interface. 


Guidance In a computer-based education 
program, proactive feedback, help facilities, 
and other tools designed to assist a student in 
learning the covered material. 


Guideline Element Model (GEM) An XML 
specification for marking up textual docu- 
ments that describe clinical practice guide- 
lines. The guideline-related XML tags make it 
possible for information systems to determine 
the nature of the text that has been marked up 
and its role in the guideline specification. 


GWAS See: Genome-Wide Association Studies.


Haptic feedback A user interface feature in 
which physical sensations are transmitted to 
the user to provide a tactile sensation as part
of a simulated activity. 


Haptic sensation The sensation of touch or 
feel. It can be applied to a simulation of such 
sensation as presented within a virtual or aug- 
mented reality scenario. 


Hard disk A magnetic disk used for data 
storage and typically fixed in the disk-drive 
unit. 


Hardware The physical equipment of a com- 
puter system, including the central processing 
unit, memory, data-storage devices, worksta- 
tions, terminals, and printers. 


Harmonic mean An average computed as the
reciprocal of the arithmetic mean of the recip-
rocals of a set of values. In its weighted form,
the weights reflect the relative importance of
each contribution to the average.


HCI See: Human-computer interaction.


HCO See: Healthcare organization.


Head word The key word in a multi-word 
phrase that conveys the central meaning of 
the phrase. For example, in a phrase containing
adjectives and a noun, the noun is typically 
the head word. 


Header (of email) The portion of a simple 
electronic mail message that contains informa- 
tion about the date and time of the message, 
the address of the sender, the addresses of 
the recipients, the subject, and other optional 
information. 


Health Evaluation and Logical Processing
(HELP) One of the first electronic health record
systems, developed at LDS Hospital in Salt
Lake City, Utah. Still in use today, it was 
innovative for its introduction of automated 
alerts. 


Health informatics Used by some as a syn- 
onym for biomedical informatics, this term 
is increasingly used solely to refer to applied 
research and practice in clinical and public 
health informatics. 


Health information and communication tech-
nology (HICT) The broad spectrum of hard-
ware and software used to capture, store and 
transmit health information. 


Health Information Exchange (HIE) The
process of moving health information elec- 
tronically among disparate health care orga- 
nizations for clinical care and other purposes; 
or an organization that is dedicated to provid- 
ing health information exchange. 


Health Information Infrastructure (HII) The 
set of public and private resources, including 
networks, databases, and policies, for collect- 
ing, storing, and transmitting health informa- 
tion. 


Health Information Technology (HIT) The use of
computers and communications technology 
in healthcare and public health settings. 


Health Information Technology for Economic 
and Clinical Health (HITECH) Also referred 
to as the HITECH Act. Passed by Congress
as Title IV of the American Recovery and
Reinvestment Act of 2009 (ARRA), it
established four major goals that promote
the use of health information technology: 
(1) Develop standards for the nationwide 
electronic exchange and use of health infor- 
mation; (2) Invest $20B in incentives to 
encourage doctors and hospitals to use HIT 
to electronically exchange patients’ health 
information; (3) Generate $10B in savings 
through improvements in quality of care 
and care coordination, and reductions in 
medical errors and duplicative care and (4) 
Strengthen Federal privacy and security law 
to protect identifiable health information 
from misuse. Also codified the Office of the 
National Coordinator for Health Information 
Technology (ONC) within the Department of 
Health and Human Services. 


Health Insurance Portability and Accoun- 
tability Act (HIPAA) A law enacted in 1996 to 
protect health insurance coverage for workers 
and their families when they change or lose 
their jobs. An “administrative simplification” 
provision requires the Department of Health 
and Human Services to establish national
standards for electronic healthcare transac-
tions and national identifiers for providers, 
health plans, and employers. It also addresses 
the security and privacy of health data. 


Health Level Seven (HL7) An ad hoc stan- 
dards group formed to develop standards for 
exchange of health care data between inde- 
pendent computer applications; more specifi- 
cally, the health care data messaging standard 
developed and adopted by the HL7 standards
group.


Health literacy A constellation of skills, 
including the ability to perform basic reading, 
math, and everyday health tasks like compre- 
hending prescription bottles and appointment 
slips, required to function in the health care 
environment. 


Health Maintenance Organization (HMO) A 
group practice or affiliation of independent 
practitioners that contracts with patients to 
provide comprehensive health care for a
fixed periodic payment specified in advance. 


Health on the Net (HON) A private organiza-
tion establishing ethical standards for health 
information published on the World Wide 
Web. 


Health Record Bank (HRB) An independent 
organization that provides a secure elec- 
tronic repository for storing and maintaining 
an individual’s lifetime health and medical 
records from multiple sources and assuring 
that the individual always has complete con- 
trol over who accesses their information. 


Healthcare Effectiveness Data and Information 
Set (HEDIS) Employers and individuals use 
HEDIS to measure the quality of health 
plans. HEDIS measures how well health plans 
give service to their members. HEDIS is one 
of health care’s most widely used performance 
improvement tools. It is developed and main- 
tained by the National Committee for Quality 
Assurance. 


Healthcare organization (HCO) Any 
health-related organization that is involved in
direct patient care. 


Healthcare team A coordinated group of
health professionals including physicians, 
nurses, case managers, dieticians, pharma- 
cists, therapists, and other practitioners who 
collaborate in caring for a patient. 


HEDIS See: Healthcare Effectiveness Data
and Information Set.


HELP See: Health Evaluation and Logical
Processing. 


HELP sector A decision rule encoded in the 
HELP system, a clinical information system 
that was developed by researchers at LDS 
Hospital in Salt Lake City. 


Helper (plug-in) An application that is
launched by a Web browser when the browser 
downloads a file that the browser is not able 
to process itself. 


Heuristic A mental “trick” or rule of thumb; 
a cognitive process used in learning or prob- 
lem solving. 


Heuristic evaluation (HE) A usability inspec- 
tion method, in which the system is evaluated 
on the basis of a small set of well-tested design 
principles such as visibility of system status, 
user control and freedom, consistency and 
standards, flexibility and efficiency of use. 


HICT See: Health information and commu-
nication technology.


HIE See: Health Information Exchange.


Hierarchical An arrangement between enti-
ties that conveys some superior-inferior rela-
tionship, such as parent-child, whole-part, etc.


Hierarchical Task Analysis Task analytic
approach that involves the breaking down of
a task into sub-tasks and smaller constituent
parts (e.g., sub-sub-tasks). The tasks are orga-
nized according to specific goals.


High-bandwidth An information channel 
that is capable of delivering data at a
relatively high rate. 


Higher-level process A complex process com- 
prising multiple lower-level processes. 


HII See: Health Information Infrastructure. 


Hindsight bias The tendency to over-estimate 
the prior predictability of an event, once the 
event has already taken place. For example,
if event A occurs before event B, there may be 
an assumption that A predicted B. 


HIPAA See: Health Insurance Portability and 
Accountability Act. 


HIS See: Hospital information system. 


Historical control In the context of clinical
research, historical controls are subjects who
represent the targeted population of interest
for a study. Typically, their data are derived
from existing resources in a retrospective
manner and represent targeted outcomes
in a non-interventional state (often resulting
from standard of care practices), so as to
provide the basis for comparison to data sets
derived from participants who have received
an experimental intervention under study.


Historically controlled study See: before-after 
study. 


HIT See: Health Information Technology. 


HITECH See: Health Information Technology 
for Economic and Clinical Health. 


HITECH regulations The components of 
the Health Information Technology for 
Economic and Clinical Health Act, passed by
the Congress in 2009, which authorized finan- 
cial incentives to be paid to eligible physicians 
and hospitals for the adoption of “meaning- 
ful use” of EHRs in the United States. The 
law also called for the certification of EHR 
technology and for educational programs to 
enhance its dissemination and adoption. 


HIV See: Human immunodeficiency virus.


HL7 See: Health Level Seven.


HMO See: Health Maintenance Organization.


Home Telehealth The extension of tele-
health services into the home setting to sup-
port activities such as home nursing care and 
chronic disease management. 


HON See: Health on the Net. 


Hospital information system (HIS) Computer 
system designed to support the comprehen- 
sive information requirements of hospitals 
and medical centers, including patient, clin- 
ical, ancillary, and financial management. 


Hot fail over A secondary computer system 
that is kept in constant synchronization with 
the primary system and that can take over as 
soon as the primary fails for any reason. 


Hounsfield number The numeric information 
contained in each pixel of a CT image. It is 
related to the composition and nature of the 
tissue imaged and is used to represent the den- 
sity of tissue. 

HRB See: Health Record Bank. 

HTML See: HyperText Markup Language.

HTTP See: HyperText Transfer Protocol. 


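Human factors The scientific discipline con-
cerned with the understanding of interactions
among humans and other elements of a sys-
tem, and the profession that applies theory,
principles, data, and other methods to design
in order to optimize human well-being and
overall system performance.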


Human Genome Project An international 
undertaking, the goal of which is to deter-
mine the complete sequence of human deoxy-
ribonucleic acid (DNA), as it is encoded in 
each of the 23 chromosomes. 


Human immunodeficiency virus (HIV) A ret- 
rovirus that invades and inactivates helper T 
cells of the immune system and is a cause of 
AIDS and AIDS-related complex. 


Human-computer interaction (HCI) Formal 
methods for addressing the ways in which 
human beings and computer programs 
exchange information. 


HyperText Markup Language (HTML) The
document specification language used for doc- 
uments on the World Wide Web. 


Hypertext Text linked together in a 
nonsequential web of associations. Users
can traverse highlighted portions of text to 
retrieve additional related information. 


HyperText Transfer Protocol (HTTP) The cli- 
ent-server protocol used to access informa- 
tion on the World Wide Web. 


Hypothesis generation The process of pro- 
posing a hypothesis, usually driven by some 
unexplained phenomenon and the derivation 
of a suspected underlying mechanism. 


Hypothetico-deductive approach A method 
of reasoning made up of four stages (cue 
acquisition, hypothesis generation, cue inter- 
pretation, and hypothesis evaluation) which is 
used to generate and test hypotheses. In clini- 
cal medicine, an iterative approach to diag- 
nosis in which physicians perform sequential, 
staged data collection, data interpretation, 
and hypothesis generation to determine and 
refine a differential diagnosis. 


Hypothetico-deductive reasoning Reasoning 
by first generating and then testing a set of 
hypotheses to account for clinical data (i.e., 
reasoning from hypothesis to data). 


ICANN See: Internet Corporation for
Assigned Names and Numbers.


ICD-9-CM See: International Classification of
Diseases, 9th Edition, Clinical Modifications. 


ICMP See: Internet Control Message Protocol. 


Icon In a graphical interface, a pictorial rep- 
resentation of an object or function. 


ICT See: Information and communications 
technology. 


IDF See: Inverse document frequency.


IDN See: Integrated delivery network.


Image acquisition The process of generat- 
ing images from the modality and converting 
them to digital form if they are not intrinsi- 
cally digital. 


Image compression A mathematical pro- 
cess for removing redundant or relatively 
unimportant information from an electronic 
image such that the resulting file appears the 
same (lossless compression) or similar (lossy 
compression) when compared to the origi- 
nal. 
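
The difference between lossless and lossy compression can be sketched on synthetic pixel data. The zlib module is a general-purpose lossless compressor from the Python standard library, used here as a stand-in for an image codec; the quantize function is a crude, made-up lossy step and is not how any particular image format works.

```python
import zlib

# A synthetic 1000-pixel, 8-bit "image": a dark half followed by a bright half.
pixels = bytes([0] * 500 + [255] * 500)

compressed = zlib.compress(pixels)
restored = zlib.decompress(compressed)   # lossless: identical to the original


def quantize(data, step=16):
    """Crude lossy step: keep only multiples of `step`, discarding fine detail."""
    return bytes((p // step) * step for p in data)


gradient = bytes(range(256))
approximate = quantize(gradient)         # similar to, but not the same as, the input
max_error = max(abs(a - b) for a, b in zip(gradient, approximate))
```

The redundant image shrinks substantially with no loss, while the quantized gradient differs from its source by a bounded amount per pixel.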


Image content representation Makes the
information in images accessible to machines
for processing. 


Image database An organized collection 
of clinical image files, such as x-rays, photo- 
graphs, and microscopic images. 


Image enhancement The use of global pro- 
cessing to improve the appearance of the 
image either for human use or for subsequent 
processing by computer. 


Image interpretation/computer reasoning The 
process by which the individual viewing the 
image renders an impression of the medi-
cal significance of the results of an imaging
study, potentially aided by computer methods. 


Image management/storage The application
of methods for storing, transmitting, displaying,
retrieving, and organizing images.


Image metadata Data about images, such as 
the type of image (e.g., modality), patient that 
was imaged, date of imaging, image features 
(quantitative or qualitative), and other infor- 
mation pertaining to the image and its con- 
tents. 


Image processing The transformation of 
one or more input images, either into one 
or more output images, or into an abstract 
representation of the contents of the input 
images. 


Image quantitation The process of extracting 
useful numerical parameters or deriving cal- 
culations from the image or from ROIs in the 
image. 


Image reasoning Computerized methods 
that use images to formulate conclusions or 
answer questions that require knowledge and 
logical inference. 


Image rendering/visualization A variety 
of techniques for creating image displays, 
diagrams, or animations to display images 
in a different perspective from the raw
images. 


Imaging informatics A subdiscipline of medi- 
cal informatics concerned with the common 
issues that arise in all image modalities and 
applications once the images are converted to 
digital form. 


IMIA See: International Medical Informatics 
Association. 


Immersive and virtual environments A com- 
puter-based set of sensory inputs and outputs 
that can give the illusion of being in a differ- 
ent physical environment. 


Immersive environment A computer-based 
set of sensory inputs and outputs that can 
give the illusion of being in a different physi-
cal environment; see: Virtual Reality.


Immersive simulated environment A teach- 
ing environment in which a student manipu- 
lates tools to control simulated instruments, 
producing visual, pressure, and other feed- 
back to the tool controls and instruments. 


Immunization Information System (IIS) Con-
fidential, population-based, computerized
databases that record all immunization 
doses administered by participating provid- 
ers to persons residing within a given geo- 
political area. Also known as Immunization 
Registries. 


Immunization Registry Confidential, popu-
lation-based, computerized databases that
record all immunization doses administered 
by participating providers to persons residing 
within a given geopolitical area. Also known 
as Immunization Information Systems. 


Implementation science Implementation 
science refers to the study of socio-cultural, 
operational, and behavioral norms and pro- 
cesses surrounding the dissemination and 
adoption of new systems, approaches and/or 
knowledge. 


Inaccessibility A property of paper records 
that describes the inability to access the record 
by more than one person or in more than one 
place at a time. 


Incrementalist An approach to evaluation 
that tolerates ambiguity and uncertainty and 
allows changes from day-to-day. 


Independent Two events, A and B, are con- 
sidered independent if the occurrence of one 
does not influence the probability of the occur- 
rence of the other. Thus, p[A | B] = p[A]. The 
probability of two independent events A and 
B both occurring is given by the product of the 
individual probabilities: p[A,B] = p[A] × p[B].
(See conditional independence.)
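
Both statements of independence can be checked by enumeration on a small example. The events below (outcomes of two fair dice) are chosen purely for illustration; exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event, given as a predicate over outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def A(o):
    return o[0] % 2 == 0     # first die shows an even number


def B(o):
    return o[1] > 4          # second die shows 5 or 6


p_a = prob(A)
p_b = prob(B)
p_ab = prob(lambda o: A(o) and B(o))
p_a_given_b = p_ab / p_b     # conditional probability p[A | B]
```

Here p[A | B] equals p[A] and p[A,B] equals p[A] × p[B], so knowing the second die tells us nothing about the first.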


Independent variable In a correlational or 
experimental study, a variable thought to 
determine or be associated with the value of 
the dependent variable (q.v.). 


Index In information retrieval, a shorthand 
guide to the content that allows users to find 
relevant content quickly. 


Index Medicus The printed index used to 
catalog the medical literature. Journal articles 
are indexed by author name and subject head- 
ing, then aggregated in bound volumes. The 
Medline database was originally constructed
as an online version of the Index Medicus. 


Index test The diagnostic test whose perfor- 
mance is being measured. 


Indexing In information retrieval, the assign- 
ment to each document of specific terms that 
indicate the subject matter of the document 
and that are used in searching. 


Indirect-care Activities of health profession- 
als that are not directly related to patient care, 
such as teaching and supervising students, 
continuing education, and attending staff 
meetings. 


Inductive reasoning Involves an inferential 
process from the observed data to account 
for the unobserved. It is a process of gener- 
ating possible conclusions based on available 
data. For example, the fact that a patient who 
recently had major surgery has not had any 
fever for the last 3 days may lead us to con- 
clude that he will not have fever tomorrow or 
in the immediate days that follow. The power 
of inductive reasoning lies in its ability to 
allow us to go beyond the limitations of our 
current evidence or knowledge to novel con- 
clusions about the unknown. 


Inference engine A computer program that 
reasons about a knowledge base. In the case 
of rule-based systems, the inference engine 
may perform forward chaining or backward 
chaining to enable the rules to infer new infor- 
mation about the current situation. 
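
As a rough illustration of forward chaining, the sketch below fires rules repeatedly until no new facts can be inferred. The rule format (a set of premises paired with one conclusion), the function name `forward_chain`, and the toy rules are assumptions for this example; they are not clinical guidance and do not reflect any particular rule-based system.

```python
def forward_chain(facts, rules):
    """Derive all facts reachable from the starting facts.

    rules is a list of (premises, conclusion) pairs, where premises is a set
    of facts that must all be known before the conclusion can be added.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True  # a new fact may enable other rules
    return known


# Hypothetical toy rules, for illustration only.
RULES = [
    (frozenset({"fever", "cough"}), "suspect_respiratory_infection"),
    (
        frozenset({"suspect_respiratory_infection", "abnormal_chest_xray"}),
        "suspect_pneumonia",
    ),
]

inferred = forward_chain({"fever", "cough", "abnormal_chest_xray"}, RULES)
print(sorted(inferred))
```

Backward chaining would instead start from a goal such as `suspect_pneumonia` and work back through the rules to find which facts need to be established.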


Inflectional morpheme A morpheme that creates
a different form of a word without changing
the meaning of the word or the part of
speech (e.g., -ed, -s, -ing as in activated,
activates, activating).


Influence diagram A belief network in which 
explicit decision and utility nodes are also 
incorporated. 


Infobutton A context-specific link from a
health care application to some information
resource that anticipates users’ needs and pro- 
vides targeted information. 


Infobutton manager Middleware that pro- 
vides a standard software interface between 
infobuttons in an EHR and the documents 
and other information resources that the 
infobuttons may display for the user. 


infoRAD The information technology and 
computing oriented component of the very 
large exhibition hall at the annual meet- 
ing of the Radiological Society of North 
America. 


Information Organized data from which 
knowledge can be derived and that accord- 
ingly provide a basis for decision making. 


Information and communications technology 
(ICT) The use of computers and commu- 
nications devices to accept, store, transmit, 
and manipulate data; the term is roughly a 
synonym for information technology, but it 
is used more often outside the United States. 


Information blocking A practice or position 
that interferes with exchange or accessibility 
of patient data or electronic health informa- 
tion. This was defined by the 21st Century 
Cures Act. 


Information extraction Methods that process 
text to capture and organize specific informa- 
tion in the text and also to capture and orga- 
nize specific relations between the pieces of 
information. 


Information model A representation of
concepts, relationships, constraints, rules, and
operations to specify data semantics for a chosen
domain of discourse. It can provide sharable, 
stable, and organized structure of information 
requirements for the domain context. 


Information need In information retrieval, the 
searchers’ expression, in their own language, 
of the information that they desire. 


Information resource Generic term for a 
computer-based system that seeks to enhance 
health care by providing patient-specific infor- 
mation directly to care providers (often used 
equivalently with “system”). 


Information retrieval (IR) Methods that effi- 
ciently and effectively search and obtain data, 
particularly text, from very large collections 
or databases. It is also the science and practice 
of identification and efficient use of recorded 
media. See also Search. 


Information science The field of study con- 
cerned with issues related to the management 
of both paper-based and electronically stored 
information. 


Information theory The theory and math- 
ematics underlying the processes of commu- 
nication. 


Information visualization The use of
computer-supported, interactive, visual representations
of abstract data to amplify cognition. 


Ink-jet printer Output device that uses a 
moveable head to spray liquid ink on paper; 
the head moves back and forth for each line 
of pixels. 


Input and Output Devices, such as keyboards, 
pointing devices, video displays, and laser 
printers, that facilitate user interaction and 
storage or just 


Input The data that represent state informa- 
tion, to be stored and processed to produce 
results (output). 


Inspection method Class of usability
evaluation methods in which experts appraise a
system, playing the role of a user, to identify
potential usability and interaction issues with
the system.


Institute of Medicine The health arm of the
National Academy of Sciences, which provides
unbiased, authoritative advice to decision
makers and the public. Renamed the
National Academy of Medicine in 2015.


Institutional Review Board (IRB) A commit- 
tee responsible for reviewing an institution’s 
research projects involving human subjects in 
order to protect their safety, rights, and wel- 
fare. 


Integrated circuit A circuit of transistors, 
resistors, and capacitors constructed on a 
single chip and interconnected to perform a 
specific function. 


Integrated delivery network (IDN) A large 
conglomerate health-care organization devel- 
oped to provide and manage comprehensive 
health-care services. 


Integrated Service Digital Network (ISDN) A 
digital telephone service that allows high- 
speed network communications using conven- 
tional (twisted pair) telephone wiring. 


Integrative model Model for understanding 
a phenomenon that draws from multiple dis- 
ciplines and is not necessarily based on first 
principles. 


Intellectual property Software programs, 
knowledge bases, Internet pages, and other 
creative assets that require protection against 
copying and other unauthorized use. 


Intelligent system See: knowledge-based sys- 
tem. 


Intelligent Tutor A tutoring system that mon- 
itors the learning session and intervenes only 
when the student requests help or makes seri- 
ous mistakes. 


Interactome The set of all molecular interac- 
tions in a cell. 


Interface engine Software that mediates 
the exchange of information among two or 
more systems. Typically, each system must 
know how to communicate with the interface 
engine but need not know the information
format of the other systems. 


Intermediate effect The process of continually
learning, re-learning, and exercising new
knowledge, punctuated by periods of apparent
decrease in mastery and declines in performance,
which may be necessary for learning
to take place. People at intermediate levels of
expertise may perform more poorly than those
at lower levels of expertise on some tasks, due to
the challenges of assimilating new knowledge or
skills over the course of the learning process.


Internal validity In the context of clinical 
research, internal validity refers to the mini- 
mization of potential biases during the design 
and execution of the trial. 


International Classification of Diseases, 9th 
Edition, Clinical Modifications A US exten- 
sion of the World Health Organization’s 
International Classification of Diseases, 9th 
Edition. 


International Medical Informatics Association 
(IMIA) An international organization dedicated 
to advancing biomedical and health informatics;
an “organization of organizations”, its
members are national informatics societies and
organizations, such as AMIA. 


International Organization for Standards 
(ISO) The international body for information 
and other standards. 


Internet A worldwide collection of gateways 
and networks that communicate with each 
other using the TCP/IP protocol, collectively 
providing a range of services including elec- 
tronic mail and World Wide Web access. 


Internet address See: Internet Protocol
address.


Internet Control Message Protocol (ICMP) A
network-level Internet protocol that provides
error reporting and other information
relevant to processing data packets.


Internet Corporation for Assigned Names and 
Numbers (ICANN) The organization respon- 
sible for managing Internet domain name and 
IP address assignments. 


Internet of Things (IoT) A system of
interconnected computing devices that can transfer
data and be controlled over a network. In the 
consumer space, IoT technologies are most 
commonly found in the built environment 
where devices and appliances (such as lighting 
fixtures, security systems or thermostats) can 
be controlled via smartphones or smart speak- 
ers, creating “smart” homes or offices. 


Internet protocol The protocol within TCP/ 
IP that governs the creation and routing of 
data packets and their reassembly into data 
messages. 


Internet Protocol address A 32-bit number 
that uniquely identifies a computer connected 
to the Internet. Also called “Internet address” 
or “IP address”. 
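
The 32-bit description applies to IPv4 addresses; the dotted form is a readable rendering of that one number. The sketch below uses Python's standard `ipaddress` module to convert in both directions. The helper names `ipv4_to_int` and `int_to_ipv4` are made up for this example, and `192.0.2.1` is an address from a range reserved for documentation.

```python
import ipaddress


def ipv4_to_int(dotted):
    """Convert dotted-quad notation to the underlying 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


def int_to_ipv4(number):
    """Convert a 32-bit integer back to dotted-quad notation."""
    return str(ipaddress.IPv4Address(number))


value = ipv4_to_int("192.0.2.1")
print(value)                # the address as a single integer
print(int_to_ipv4(value))   # back to dotted-quad form
print(value.bit_length() <= 32)
```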


Internet service provider (ISP) A commercial 
communications company that supplies fee- 
for-service Internet connectivity to individu- 
als and organizations. 


Internet standards The set of conventions 
and protocols all Internet participants use to 
enable effective data communications. 


Internet Support Group (ISG) An on-line 
forum for people with similar problems, 
challenges or conditions to share supportive 
resources. 


Interoperability The 21st Century Cures Act 
defines interoperability as health informa- 
tion technology that—(A) enables the secure 
exchange of electronic health information 
with, and use of electronic health informa- 
tion from, other health information technology
without special effort on the part of the
user; (B) allows for complete access, exchange,
and use of all electronically accessible health 
information for authorized use under appli- 
cable State or Federal law; and (C) does not 
constitute information blocking. 


Interpreter A program that converts each 
statement in a high-level program to a 
machine-language representation and then
executes the binary instruction(s). 


Interventional radiology A subspecialty of 
radiology that uses imaging to guide invasive 
diagnostic or therapeutic procedures. 


Intrinsic evaluation An evaluation of a com- 
ponent of a system that focuses only on the 
performance of the component. See also 
Extrinsic Evaluation. 


Intuitionist-pluralist or de-constructivist A 
philosophical position that holds that there is 
no truth and that there are as many legitimate 
interpretations of observed phenomena as 
there are observers. 


Inverse document frequency (IDF) A measure 
of how infrequently a term occurs in a docu- 
ment collection. 


$$\mathrm{IDF}_i = \log\left(\frac{\text{number of documents}}{\text{number of documents with term } i}\right)$$

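
A minimal sketch of the calculation follows, assuming the common form in which IDF is the logarithm of the ratio of total documents to documents containing the term. The natural logarithm, the representation of each document as a set of tokens, and the function name are assumptions for this example; some formulations use a different base or add a constant.

```python
import math


def inverse_document_frequency(term, documents):
    """IDF of a term over a collection, where each document is a set of tokens."""
    n_docs = len(documents)
    n_with_term = sum(1 for doc in documents if term in doc)
    if n_with_term == 0:
        raise ValueError(f"term {term!r} does not occur in the collection")
    return math.log(n_docs / n_with_term)


# Hypothetical four-document collection.
DOCS = [
    {"aspirin", "fever"},
    {"fever"},
    {"fever", "cough"},
    {"cough"},
]

# A rare term gets a higher weight than a common one.
print(inverse_document_frequency("aspirin", DOCS))
print(inverse_document_frequency("fever", DOCS))
```
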
IOM See: Institute of Medicine. 

IP address See: Internet Protocol Address. 

IR See: Information retrieval. 

IRB See: Institutional Review Board. 

ISDN See: Integrated Service Digital Network. 
ISG See: Internet Support Group. 


ISO See: International Organization for
Standards.


Iso-semantic mapping A relationship between
an entity in one dataset or model and an
entity in another dataset or model where the
meaning of the two entities is identical, even
if the syntax or lexical form is different.


ISP See: Internet service provider. 


Job A set of tasks submitted by a user for 
processing by a computer system. 


Joint Commission (JC) An independent,
not-for-profit organization that accredits
and certifies more than 19,000
health care organizations and programs in
the United States. Joint Commission
accreditation and certification is recognized
nationwide as a symbol of quality that reflects
an organization’s commitment to meeting 
certain performance standards. The Joint 
Commission was formerly known as JCAHO 
(the Joint Commission for the Accreditation 
of Healthcare Organizations). 


Just-in-time adaptive interventions 
(JITAIs) An intervention design that aims to
provide the type of support that is most likely 
to be helpful in a particular context at times 
when users are most likely to be receptive to 
that support, by adapting intervention provi- 
sion to an individual’s changing internal and 
contextual state. 


Just-in-time learning An approach to pro- 
viding necessary information to a user at the 
moment it is needed, usually through antici- 
pation of the need. 


Kernel The core of the operating system 
that resides in memory and runs in the back- 
ground to supervise and control the execution 
of all other programs and direct operation of 
the hardware. 


Key field A field in the record of a file that 
uniquely identifies the record within the file. 


Key Performance Indicator (KPI) A metric 
defined to be an important factor in the
success of an organization. Typically, several key
performance indicators are displayed on a
dashboard.


Keyboard A data-input device used to enter 
alphanumeric characters through typing. 


Keyword A word or phrase that conveys
special meaning or refers to information that is
relevant to such a meaning (as in an index).


Kilobyte 2^10 or 1024 bytes.


Knowledge Relationships, facts, assump- 
tions, heuristics, and models derived through 
the formal or informal analysis (or interpreta- 
tion) of observations and resulting informa- 
tion. 


Knowledge acquisition The information- 
elicitation and modeling process by which 
developers interact with subject-matter 
experts to create electronic knowledge bases. 


Knowledge base A collection of stored facts, 
heuristics, and models that can be used for 
problem solving. 


Knowledge graph A kind of knowledge rep- 
resentation in which entities are encoded as 
nodes in a graph and relationships among 
entities are encoded as links between the 
nodes. 


Knowledge-based information Information 
derived and organized from observational or 
experimental research. 


Knowledge-based system A program that 
symbolically encodes, in a knowledge base, 
facts, heuristics, and models derived from 
experts in a field and uses that knowledge to 
provide problem analysis or advice that the 
expert might have provided if asked the same 
question. 


KPI See: Key Performance Indicator. 


Laboratory function study Study that 
explores important properties of an informa- 
tion resource in isolation from the clinical set- 
ting. 


Laboratory user effect study An evaluation
technique in which a user is observed when 
given a simulated task to perform. 


LAN See: Local-area network. 


Laser printer Output device that uses an 
electromechanically controlled laser beam to 
generate an image on a xerographic surface, 
which then is used to produce paper copies. 


Latency The time required for a signal to 
travel between two points in a network. 


Latent failures Enduring systemic problems 
that make errors possible but are less visible 
or not evident for some time. 


Law of proximity Principle from Gestalt psy- 
chology that states that visual entities that are 
close together are perceptually grouped. 


Law of symmetry Principle from Gestalt psy- 
chology that states that symmetric objects are 
more readily perceived. 


LCD See: Liquid crystal display. 


Lean A management strategy that focuses
only on those processes that are able to
contribute specific and measurable value for the end
customer. The Lean concept originated with
Toyota’s focus on efficient manufacturing
processes.


Learning Content Management System A 
software platform that allows educational 
content creators to host, manage, and track 
changes in content. 


Learning health system A proposed model 
for health care in which outcomes from past 
and current patient care are systematically
collected, analyzed, and then fed back
into decision making about best practices for 
future patient care. 


Learning healthcare system See: Learning 
health system. 


Learning Management System An LMS is
a repository of educational content, an
interface for delivering courses and content
to learners, and a vehicle for faculty to track
learner usage and performance.


LED See: Light-emitting diode. 


Lexemes A minimal lexical unit in a language 
that represents different forms of the same 
word. 


Lexical-statistical retrieval Retrieval based 
on a combination of word matching and rel- 
evance ranking. 


Lexicon A catalogue of the words in a lan- 
guage, usually containing syntactic informa- 
tion such as parts of speech, pluralization 
rules, etc. 


Light-emitting diode (LED) A semiconductor 
device that emits a particular frequency of 
light when a current is passed through it;
typically used for indicator lights and computer
screens because of its low power requirement,
minimal heat generation, and durability.


Likelihood ratio (LR) A measure of the dis- 
criminatory power of a test. The LR is the 
ratio of the probability of a result when 
the condition under consideration is true to 
the probability of a result when the condition 
under consideration is false (for example, the 
probability of a result in a diseased patient to 
the probability of a result in a non-diseased 
patient). The LR for a positive test is the ratio 
of true-positive rate (TPR) to false-positive 
rate (FPR). 
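
The ratio can be computed directly from a two-by-two table of test results against true disease status. The sketch below is an illustration under stated assumptions: the function name, argument names, and counts are hypothetical, and the negative likelihood ratio (false-negative rate divided by true-negative rate) is included as the standard counterpart even though the entry above defines only the positive case.

```python
def likelihood_ratios(tp, fp, fn, tn):
    """Return (LR+, LR-) from the four cells of a 2x2 diagnostic table."""
    tpr = tp / (tp + fn)  # true-positive rate (sensitivity)
    fpr = fp / (fp + tn)  # false-positive rate (1 - specificity)
    if fpr == 0 or fpr == 1:
        raise ValueError("likelihood ratio is undefined when FPR is 0 or 1")
    lr_positive = tpr / fpr
    lr_negative = (1 - tpr) / (1 - fpr)
    return lr_positive, lr_negative


# Hypothetical counts: 100 diseased and 100 non-diseased patients.
lr_pos, lr_neg = likelihood_ratios(tp=90, fp=20, fn=10, tn=80)
print(lr_pos, lr_neg)
```

An LR for a positive test well above 1 means a positive result is much more likely in diseased than in non-diseased patients; an LR near 1 means the result carries little diagnostic information.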


Link-based An indexing approach that gives 
relevance weight to web pages based on how 
often they are cited by other pages. 


Linux An open source operating system based 
on principles of Unix and first developed by 
Linus Torvalds in 1991. 


Liquid crystal display (LCD) A display
technology that uses rod-shaped molecules to bend
light and alter contrast and viewing angle to
produce images.


Listserver A distribution list for electronic 
mail messages. 

Literature reference database See: biblio- 
graphic database. 


Local-area network (LAN) A network for data 
communication that connects multiple nodes, 
all typically owned by a single institution and
located within a small geographic area. 


Logical Observations, Identifiers, Names and 
Codes (LOINC) A controlled terminology 
created for providing coded terms for obser- 
vational procedures. Originally focused on 
laboratory tests, it has expanded to include 
many other diagnostic procedures. 


Logical positivist A philosophical position 
that holds that there is a single truth that can 
be inferred from the right combination of 
studies. 


Logic-based A knowledge representation 
method based on the use of predicates. 


LOINC See: Logical Observations, Identifiers, 
Names and Codes. 


Longitudinal Care Plan A holistic, dynamic, 
and integrated plan that documents impor- 
tant disease prevention and treatment goals 
and plans. A longitudinal plan is patient- 
centered, reflecting a patient’s values and pref- 
erences, and is dependent upon bidirectional 
communications. 


Long-term memory The part of memory that 
acquires information from short-term mem- 
ory and retains it for long periods of time. 


Long-term storage A medium for storing 
information that can persist over long periods 
without the need for a power supply to
maintain data integrity.


Lossless compression A mathematical
technique for reducing the number of bits needed
to store data while still allowing for the
re-creation of the original data.
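
The defining property, exact re-creation of the original, can be demonstrated with a round trip through Python's standard `zlib` module. This is a sketch using one general-purpose lossless method chosen for the example; the helper name `roundtrip` and the sample data are assumptions.

```python
import zlib


def roundtrip(data):
    """Compress bytes, decompress them again, and return both results."""
    compressed = zlib.compress(data)
    restored = zlib.decompress(compressed)
    return compressed, restored


# Highly repetitive data compresses well.
original = b"ABAB" * 1000
compressed, restored = roundtrip(original)

print(len(original), len(compressed))
print(restored == original)  # every bit of the original is recovered
```

A lossy method, by contrast, would not be able to guarantee that `restored == original`.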


Lossy compression A mathematical tech- 
nique for reducing the number of bits needed 
to store data but that results in loss of infor- 
mation. 


Low-level processes An elementary process 
that has its basis in the physical world of 
chemistry or physics. 


LR See: Likelihood ratio. 


Machine code The set of primitive instruc- 
tions to a computer represented in binary 
code (machine language). 


Machine language The set of primitive
instructions represented in binary code
(machine code).


Machine learning A computing technique in 
which information learned from data is used 
to improve system performance. 


Machine translation Automatic mapping of 
text written in one natural language into text 
of another language. 


Macros A reusable set of computer instruc- 
tions, generally for a repetitive task. 


Magnetic disk A round, flat plate of mate- 
rial that can accept and store magnetic 
charge. Data are encoded on magnetic 
disk as sequences of charges on concentric 
tracks. 


Magnetic resonance imaging (MRI) A modality
that produces images by evaluating the
differential response of atomic nuclei in the
body when the patient is placed in an intense 
magnetic field. 


Magnetic resonance spectroscopy A nonin- 
vasive technique that is similar to magnetic 
resonance imaging but uses a stronger field 
and is used to monitor body chemistry (as in 
metabolism or blood flow) rather than ana- 
tomical structures. 


Magnetic tape A long ribbon of material that
can accept and store magnetic charge. Data 
are encoded on magnetic tape as sequences of 
charges along longitudinal tracks. 


Magnetoencephalography (MEG) A method 
for measuring the electromagnetic fields gen- 
erated by the electrical activity of the neurons 
using large arrays of scalp sensors, the output
of which is processed in a similar way to
CT in order to localize the source of the elec- 
tromagnetic and metabolic shifts occurring in 
the brain during trauma. 


Mailing list A set of mailing addresses used 
for bulk distribution of electronic or physical 
mail. 


Mainframe computer system A large, expen- 
sive, multi-user computer, typically operated 
and maintained by professional computing 
personnel. Often referred to as a “mainframe” 
for short. 


Malpractice Class of litigation in health 
care based on negligence theory; failure of 
a health professional to render proper
services in keeping with the standards of the
community. 


Malware Software that is specifically designed to
cause harm to computer systems by disrupt- 
ing other programs, damaging the machine, or 
gaining unauthorized access to the system or 
the data that it contains. 


Management The process of treating a 
patient (or allowing the condition to resolve 
on its own) once the medical diagnosis has 
been determined. 


Mannequin A life size plastic human body 
with some or many human-like functions. 


Manual indexing The process by which 
human indexers, usually using standardized 
terminology, assign indexing terms and attri- 
butes to documents, often following a specific 
protocol. 


Markov cycle The period of time specified
for a transition probability within a Markov 
model. 


Markov model A mathematical model of a set 
of strings in which the probability of a given 
symbol occurring depends on the identity of 
the immediately preceding symbol or the two 
immediately preceding symbols. Processes 
modeled in this way are often called Markov 
processes. 


Markov process A mathematical model of 
a set of strings in which the probability of a 
given symbol occurring depends on the iden- 
tity of the immediately preceding symbol or 
the two immediately preceding symbols. 
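
As an illustration of the first-order case, where the probability of a symbol depends only on the immediately preceding symbol, the sketch below estimates transition probabilities from a single string by counting adjacent pairs. The function name `fit_first_order` and the sample string are assumptions for this example; a second-order model would condition on the two preceding symbols instead.

```python
from collections import Counter, defaultdict


def fit_first_order(sequence):
    """Estimate p(next symbol | previous symbol) from adjacent pairs in a sequence."""
    counts = defaultdict(Counter)
    for previous, following in zip(sequence, sequence[1:]):
        counts[previous][following] += 1
    model = {}
    for previous, followers in counts.items():
        total = sum(followers.values())
        model[previous] = {
            symbol: count / total for symbol, count in followers.items()
        }
    return model


# Hypothetical training string with adjacent pairs AA, AB, BA, AB.
model = fit_first_order("AABAB")
print(model["A"])  # distribution over symbols that follow "A"
print(model["B"])  # distribution over symbols that follow "B"
```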


Markup language A document specification 
language that identifies and labels the compo- 
nents of the document’s contents. 


Massive Open Online Course (MOOC) In a
traditional MOOC, the teacher’s content is 
digitally recorded and made available online, 
freely, as a sequence of lectures with support- 
ing learning material. 


Master patient index (MPI) A database that 
is used across a healthcare organization to 
maintain consistent, accurate, and current 
demographic and essential clinical data on the 
patients seen and managed within its various 
departments. 


Mean average precision (MAP) A method for 
measuring overall retrieval precision in which 
precision is measured at every point at which a 
relevant document is obtained, and the MAP 
measure is found by averaging these points for 
the whole query. 
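
The sketch below follows the description above: precision is recorded at each rank where a relevant document appears, and those values are averaged for the query. It then averages across queries to give the mean. The function names, document identifiers, and relevance judgments are hypothetical; note also that some evaluation setups divide by the total number of relevant documents rather than by the number retrieved, which this sketch does not do.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of the precision values at each rank where a relevant document appears."""
    relevant = set(relevant_ids)
    hits = 0
    precisions = []
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precisions.append(hits / rank)
    if not precisions:
        return 0.0
    return sum(precisions) / len(precisions)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical results for two queries.
RUNS = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # relevant at ranks 1 and 3
    (["d2", "d1"], {"d1"}),                    # relevant at rank 2
]
print(mean_average_precision(RUNS))
```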


Mean time between failures (MTBF) The 
average predicted time interval between
anticipated operational malfunctions of a system,
based on long-term observations. 


Meaningful use The set of standards defined
by the Centers for Medicare & Medicaid
Services (CMS) Incentive Programs that
governs the use of electronic health records 
and allows eligible providers and hospitals 
to earn incentive payments by meeting spe- 
cific criteria. The term refers to the belief 
that health care providers using electronic 
health records in a meaningful, or effec- 
tive, way will be able to improve health care 
quality and efficiency. 


Measurement study Study to determine the 
extent and nature of the errors with which a 
measurement is made using a specific instru- 
ment (cf. Demonstration study). 


Measures of concordance Measures of agree- 
ment in test performance: the true-positive 
and true-negative rates. 


MedBiquitous A healthcare-specific stan- 
dards consortium led by Johns Hopkins 
Medicine. 


Medical computer science The subdivision of 
computer science that applies the methods of 
computing to medical topics. 


Medical computing The application of meth- 
ods of computing to medical topics (see medi- 
cal computer science). 


Medical entities dictionary (MED) A com- 
pendium of terms found in electronic medi- 
cal record systems. Among the best known 
MEDs is that developed and maintained by 
the Columbia University Irving Medical 
Center and Columbia University. Contains in 
excess of 100,000 terms. 


Medical errors Errors or mistakes, committed 
by health professionals, that hold the poten- 
tial to result in harm to the patient. 


Medical home A primary care practice that 
will maintain a comprehensive problem list to 
make fully informed decisions in coordinating 
their care. 


Medical informatics An earlier term for the 
biomedical informatics discipline, medical
informatics is now viewed as the subfield of
clinical informatics that deals with the
management of disease and the role of physicians.


Medical Information Bus (MIB) A data- 
communication system that supports data 
acquisition from a variety of independent 
devices. 


Medical information science The field of 
study concerned with issues related to the 
management and use of biomedical informa- 
tion (see also biomedical informatics). 


Medical Literature Analysis and Retrieval 
System (MEDLARS) The initial electronic 
version of Index Medicus developed by the 
National Library of Medicine. 


Medical Logic Module (MLM) A single chunk 
of medical reasoning or decision rule, typi- 
cally encoded using the Arden Syntax. 


Medical record committees An institutional 
panel charged with ensuring appropriate use 
of medical records within the organization. 


Medical Subject Headings (MeSH) Some 
18,000 terms used to identify the subject con- 
tent of the biomedical literature. The National 
Library of Medicine’s MeSH vocabulary has 
emerged as the de facto standard for biomedi- 
cal indexing. 


Medication A substance used for medical 
treatment, typically a medicine or drug. 


MEDLARS Online (MEDLINE) The National 
Library of Medicine’s electronic catalog of 
the biomedical literature, which includes 
information abstracted from journal articles, 
including author names, article title, journal 
source, publication date, abstract, and medi- 
cal subject headings. 


Medline Plus An online resource from the 
National Library of Medicine that con- 
tains health topics, drug information, 
medical dictionaries, directories, and other 
resources, organized for use by health care 
consumers.


Megabits per second (Mbps) A common unit
of measure for specifying a rate of data trans- 
mission. 


Megabyte 2^20 or 1,048,576 bytes.


Member checking In subjectivist research, 
the process of reflecting preliminary findings 
back to individuals in the setting under study, 
one way of confirming that the findings are 
truthful. 


Memorandum of understanding A docu- 
ment describing a bilateral or multilateral 
agreement between two or more parties. It 
expresses a convergence of will between the 
parties, indicating an intended common line 
of action. 


Memory Areas that are used to store pro- 
grams and data. The computer’s working 
memory comprises read-only memory (ROM) 
and random access memory (RAM). 


Memory sticks A portable device that typi- 
cally plugs into a computer’s USB port and is 
capable of storing data. Also called a “thumb 
drive” or a “USB drive”. 


Mendelian randomization (MR) A technique 
used to provide evidence for the causality of a 
biomarker on a disease state in conditions in 
which randomized controlled trials are diffi- 
cult or too expensive to pursue. The technique 
uses genetic variants that are known to
associate with the biomarker as instrumental
variables.


Mental images A form of internal represen- 
tation that captures perceptual information 
recovered from the environment. 


Mental models A construct for describing 
how individuals form internal models of sys- 
tems. They are designed to answer questions 
such as “how does it work?” or “what will 
happen if I take the following action?”. 


Mental representations Internal cognitive
states that have a certain correspondence with 
the external world. 


Menu In a user interface, a displayed list of 
valid commands or options from which a user 
may choose. 


Merck Medicus An aggregated set of 
resources, including Harrison’s Online,
MD Consult, and DXplain. 


Meta-analysis A summary study that com- 
bines quantitatively the estimates from indi- 
vidual studies. 


Metabolomic Pertaining to the study of 
small-molecule metabolites created as the end- 
products of specific cellular processes. 


Metadata Literally, data about data, describ- 
ing the format and meaning of a set of data. 


Metagenomics Using DNA sequencing 
technology to characterize complex samples 
derived from an environmental sample, e.g., 
microbial populations. For example, the gut 
“microbiome” can be characterized by apply- 
ing next generation sequencing of stool sam- 
ples. 


Metathesaurus One component of the 
Unified Medical Language System, the 
Metathesaurus contains linkages between 
terms in Medical Subject Headings (MeSH) 
and in dozens of controlled vocabularies. 


MIB See Medical Information Bus. 


Microarray chips A microchip that holds 
DNA probes that can recognize DNA from 
samples being tested. 


Microbiome The microorganisms in a par- 
ticular environment (including the body or a 
part of the body) or the combined genomes 
of those organisms. 


Microprocessor An integrated circuit that
contains all the functions of a central process- 
ing unit of a computer. 


Microsimulation models Individual-level 
health state transition models that provide a 
means to model very complex events flexibly 
over time. 


MIMIC II Database See: Multiparameter
Intelligent Monitoring in Intensive Care.


Minicomputers A class of computers that 
were introduced in the 1960s as a smaller 
alternative to mainframe computers. 
Minicomputers enabled smaller companies 
and departments within organizations (like 
HCOs) to implement software applications 
at significantly less cost than was required by 
mainframe computers. 


Mistake Occurs when an inappropriate 
course of action reflects erroneous judgment 
or inference. 


Mixed-initiative dialog A mode of interac- 
tion with a computer system in which the 
computer may pose questions for the user to 
answer, and vice versa. 


Mixed-initiative systems An educational pro- 
gram in which user and program share con- 
trol of the interaction. Usually, the program 
guides the interaction, but the student can 
assume control and digress when new ques- 
tions arise during a study session. 


Mobile health (mHealth) The practice of 
medicine and public health supported by 
mobile devices. Also referred to as mHealth 
or m-health. 


Model organism database Organized refer- 
ence databases that combine bibliographic 
databases, full text, and databases of 
sequences, structure, and function for organ- 
isms whose genomic data has been highly 
characterized, such as the mouse, fruit fly, and 
Saccharomyces yeast. 


Model organism databases Organized reference 
databases that combine bibliographic databases, 
full text, and databases of sequences, structure, 
and function for organisms whose genomic data 
has been highly characterized, such as the 
mouse, fruit fly, and Saccharomyces yeast. 


Modem A device used to modulate and 
demodulate digital signals for transmission to 
a remote computer over telephone lines; con- 
verts digital data to audible analog signals, 
and vice versa. 


Modifiers of interest In natural language pro- 
cessing, a term that is used to describe or oth- 
erwise modify a named-entity that has been 
recognized. 


Molecular imaging A technique for capturing 
images at the cellular and subcellular level by 
marking particular chemicals in ways that can 
be detected with imaging or radiodetection. 


Monitoring tool The application of logi- 
cal rules and conditions (e.g., range-check- 
ing, enforcement of data completion, etc.) 
to ensure the completeness and quality of 
research-related data. 


Monotonic Describes a function that consis- 
tently increases or decreases, rather than oscil- 
lates. 


Morpheme The smallest unit in the grammar 
of a language which has a meaning or a lin- 
guistic function; it can be a root of a word 
(e.g., arm), a prefix (e.g., re-), or a suffix 
(e.g., -itis). 


Morphology The study of meaningful units 
in language and how they combine to form 
words. 


Morphometrics The quantitative study 
of growth and development, a research 
area that depends on the use of imaging 
methods. 


Mosaic The first graphical web browser credited 
with popularizing the World Wide Web 
and developed at the National Center for 
Supercomputing Applications (NCSA) at the 
University of Illinois. 


Motion artifact Visual interference caused by 
the difference between the frame rate of an 
imaging device and the motion of the object 
being imaged. 


Mouse A small boxlike device that is moved 
on a flat surface to position a cursor on the 
screen of a display monitor. A user can select 
and mark data for entry by depressing buttons 
on the mouse. 


Multi-axial A terminology system composed 
of several distinct, mutually exclusive term 
subsets that are combined to support 
post-coordination. 


Multimodal interface A design concept which 
allows users to interact with computers using 
multiple modes of communication or tools, 
including speaking, clicking, or touchscreen 
input. 


Multiparameter Intelligent Monitoring in 
Intensive Care (MIMIC-II) A publicly and 
freely available research database that encom- 
passes a diverse and very large population of 
ICU patients. It contains high temporal reso- 
lution data including lab results, electronic 
documentation, and bedside monitor trends 
and waveforms. 


Multiprocessing The use of multiple proces- 
sors in a single computer system to increase the 
power of the system (see parallel processing). 


Multiprogramming A scheme by which mul- 
tiple programs simultaneously reside in the 
main memory of a single central processing 
unit. 


Multiprotocol label switching (MPLS) A mecha- 
nism in high-performance telecommunications 
networks that directs data from one network 
node to the next based on short path labels 
rather than long network addresses, avoiding 
complex lookups in a routing table. 


Multiuser system A computer system 
that shares its resources among multiple 
simultaneous users. 


Mutually exclusive State in which one, and 
only one, of the possible conditions is true; 
for example, either A or not A is true, and one 
of the statements is false. When using Bayes’ 
theorem to perform medical diagnosis, we 
generally assume that diseases are mutually 
exclusive, meaning that the patient has exactly 
one of the diseases under consideration and 
not more. 


Myocardial ischemia Reversible damage to 
cardiac muscle caused by decreased blood 
flow and resulting poor oxygenation. Such 
ischemia may cause chest pain or other symp- 
toms. 


Naive Bayesian model The use of Bayes' 
theorem in a way that assumes conditional 
independence of variables that may in fact be 
linked statistically. 
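
The independence assumption can be sketched in a few lines of Python. This is an illustration only: the disease names, prior probabilities, and conditional probabilities below are invented, and the function name is hypothetical. The sketch also assumes that the candidate diseases are mutually exclusive and exhaustive.

```python
def naive_bayes_posterior(priors, likelihoods, findings):
    """Posterior probability of each disease given present/absent findings.

    priors:      {disease: P(disease)}
    likelihoods: {disease: {finding: P(finding present | disease)}}
    findings:    {finding: True if present, False if absent}
    """
    scores = {}
    for disease, prior in priors.items():
        score = prior
        # Conditional independence: multiply the per-finding probabilities.
        for finding, present in findings.items():
            p_present = likelihoods[disease][finding]
            score *= p_present if present else (1.0 - p_present)
        scores[disease] = score
    total = sum(scores.values())
    # Normalize so the posteriors sum to 1 across the candidate diseases.
    return {disease: score / total for disease, score in scores.items()}


# Invented numbers for illustration only.
priors = {"flu": 0.1, "cold": 0.9}
likelihoods = {
    "flu": {"fever": 0.9, "cough": 0.8},
    "cold": {"fever": 0.1, "cough": 0.6},
}
posterior = naive_bayes_posterior(priors, likelihoods, {"fever": True, "cough": True})
print(posterior)
```

If fever and cough are in fact correlated within a disease, multiplying their probabilities overstates the combined evidence, which is the weakness the definition alludes to.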


NAM See: National Academy of Medicine. 


Name Designation of an object by a linguis- 
tic expression. 


Name authority An entity or mechanism for 
controlling the identification and formula- 
tion of unique identifiers for names. In the 
Internet, a name authority is required to asso- 
ciate common domain names with their IP 
addresses. 


Named-entity normalization The natural 
language processing method, after finding a 
named entity in a document, for linking (nor- 
malizing) that mention to appropriate data- 
base identifiers. 


Named-entity recognition In language pro- 
cessing, a subtask of information extraction 
that seeks to locate and classify atomic ele- 
ments in text into predefined categories. 


Name-server In networked environments 
such as the Internet, a computer that converts 
a host name into an IP address before the 
message is placed on the network. 


National Academies The collective name 
for the National Academy of Engineering, 
National Academy of Medicine and National 
Academy of Sciences which are private, non- 
profit institutions that provide expert advice on 
some of the most pressing challenges facing the 
nation and the world. The work of the National 
Academies helps shape sound policies, inform 
public opinion, and advance the pursuit of sci- 
ence, engineering, and medicine. 


National Academy of Medicine (NAM) An 
independent organization of eminent profes- 
sionals from diverse fields including health 
and medicine; natural, social, and behavioral 
sciences and more. Established in 1970 as the 
Institute of Medicine (IOM), and in 2016 the 
name was changed to the National Academy 
of Medicine (NAM). 


National Center for Biotechnology Information 
(NCBI) Established in 1988 as a national 
resource for molecular biology information, 
the NCBI is a component of the National 
Library of Medicine that creates public databases, 
conducts research in computational 
biology, develops software tools for analyzing 
genome data, and disseminates biomedical 
information. 


National Committee for Quality Assurance 
(NCQA) An independent 501(c)(3) nonprofit 
organization in the United States that works 
to improve health care quality through the 
administration of evidence-based standards, 
measures, programs, and accreditation. 


National Guideline Clearinghouse A public 
resource, coordinated by the Agency for 
Healthcare Research and Quality, that collects 
and distributes evidence-based clinical practice 
guidelines (see www.guideline.gov). 


National Health Information Infrastructure 
(NHII) A comprehensive knowledge-based 
network of interoperable systems of clinical, 
public health, and personal health information 
that is intended to improve decision-making 
by making health information available when 
and where it is needed. 


National Health Information Network 
(NHIN) A set of standards, services, and policies 
that have been shepherded by the Office of the 
National Coordinator for Health Information 
Technology to enable secure health informa- 
tion exchange over the Internet. 


National Information Standards Organization 
(NISO) A non-profit association accredited by 
the American National Standards Institute 
(ANSI), that identifies, develops, maintains, 
and publishes technical standards to manage 
information (see www.niso.org). 


National Institute of Standards and 
Technology (NIST) A non-regulatory fed- 
eral agency within the U.S. Commerce 
Department’s Technology Administration; its 
mission is to develop and promote measure- 
ment, standards, and technology to enhance 
productivity, facilitate trade, and improve the 
quality of life (see www.nist.gov). 


National Library of Medicine (NLM) The gov- 
ernment-maintained library of biomedicine 
that is part of the US National Institutes of 
Health. 


National Quality Forum A not-for-profit 
organization that develops and implements 
national strategies for health care quality 
measurement and reporting. 


Nationwide Health Information Network 
(NWHIN) A set of standards, services, and 
policies that have been shepherded by the 
Office of the National Coordinator for Health 
Information Technology to enable secure health 
information exchange over the Internet. 


Natural language Unfettered spoken or written 
language. Free text. 


Natural language processing (NLP) Facilitates 
tasks by enabling use of automated methods 
that represent the relevant information in the 
text with high validity and reliability. 


Natural language query A question expre- 
ssed in unconstrained text, from which 
meaning must somehow be extracted or 
inferred so that a suitable response can be 
generated. 


Naturalistic Describes a study in which little 
if anything is done by the evaluator to alter 
the setting in which the study is carried out. 


NCBI Entrez global query A search interface 
that allows searching over all data and infor- 
mation resources maintained by NCBI. 


NCI Thesaurus A large ontology developed by 
the National Cancer Institute that describes 
entities related to cancer biology, clinical 
oncology, and cancer epidemiology. 


NCQA See National Committee for Quality 
Assurance. 


Needs assessment A study carried out to help 
understand the users, their context and their 
needs and skills, to inform the design of the 
information resource. 


Negative dictionary A list of stop words used 
in information retrieval. 
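
A brief sketch of how such a list might be applied during indexing follows. The stop-word set and the function name are hypothetical; real retrieval systems typically use longer lists and more careful tokenization.

```python
# Hypothetical negative dictionary (stop-word list) for illustration.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "with"}


def index_terms(text):
    """Split text on whitespace and drop words found in the negative dictionary."""
    words = text.lower().split()
    return [word for word in words if word not in STOP_WORDS]


terms = index_terms("Treatment of pneumonia in the elderly")
print(terms)
```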


Negative predictive value (PV-) The probabil- 
ity that the condition of interest is absent if 
the result is negative; for example, the probability 
that a specific disease is absent given a 
negative test result. 
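
The calculation can be sketched as follows, assuming that the test's sensitivity and specificity and the prevalence of the condition are known. The function name and the example values are illustrative assumptions, not figures from the text.

```python
def negative_predictive_value(sensitivity, specificity, prevalence):
    """Probability that the condition is absent given a negative test result."""
    # Fraction of the whole population that is disease-free and tests negative.
    true_negatives = specificity * (1.0 - prevalence)
    # Fraction of the whole population that has the disease but tests negative.
    false_negatives = (1.0 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)


# Illustrative values: sensitivity 0.90, specificity 0.80, prevalence 0.10.
npv = negative_predictive_value(0.90, 0.80, 0.10)
print(round(npv, 4))
```

Because prevalence enters the formula, the same test yields a lower negative predictive value in a population where the condition is more common.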


Negligence theory A concept from tort law 
that states that providers of goods and ser- 
vices are expected to uphold the standards of 
the community, thereby facing claims of neg- 
ligence if individuals are harmed by substan- 
dard goods or services. 


Nested structures In natural language processing, 
a phrase or phrases that are used in 
place of simple words within other phrases. 


Net reclassification improvement (NRI) In 
classification methods, a measure of the 
net fraction of reclassifications made in the 
correct direction, using one method over 
another method without the designated 
improvement. 
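
One widely used categorical form of this measure adds the net proportion of events reclassified upward to the net proportion of non-events reclassified downward. The sketch below assumes that form; the function name and the counts are hypothetical.

```python
def net_reclassification_improvement(
    up_events, down_events, n_events,
    up_nonevents, down_nonevents, n_nonevents,
):
    """Categorical NRI comparing a new classification method with an old one.

    "up" and "down" count subjects moved to a higher or lower risk category
    by the new method relative to the old one.
    """
    # For subjects who had the event, moving up is the correct direction.
    event_term = (up_events - down_events) / n_events
    # For subjects without the event, moving down is the correct direction.
    nonevent_term = (down_nonevents - up_nonevents) / n_nonevents
    return event_term + nonevent_term


# Hypothetical counts: 100 events and 200 non-events.
nri = net_reclassification_improvement(20, 5, 100, 10, 30, 200)
print(round(nri, 2))
```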


Network access provider A company that 
builds and maintains high speed networks to 
which customers can connect, generally to 
access the Internet (see also Internet service 
provider). 


Network Operations Center (NOC) A central- 
ized monitoring facility for physically distrib- 
uted computer and/or telecommunications 
facilities that allows continuous real-time 
reporting of the status of the connected com- 
ponents. 


Network protocol The set of rules or conven- 
tions that specifies how data are prepared and 
transmitted over a network and that governs 
data communication among the nodes of a 
network. 


Network stack The method within a single 
machine by which the responsibilities for net- 
work communications are divided into differ- 
ent levels, with clear interfaces between the 
levels, thereby making network software more 
modular. 


Neuroinformatics An emerging subarea of 
applied biomedical informatics in which the 
discipline’s methods are applied to the man- 
agement of neurological data sets and the 
modeling of neural structures and function. 


Next Generation Internet Initiative A feder- 
ally funded research program in the late 1990s 
and early in the current decade that sought 
to provide technical enhancements to the 
Internet to support future applications that 
currently are infeasible or are incapable of 
scaling for routine use. 


Next generation sequencing methods Technologies 
for performing high throughput sequencing of 
large quantities 
of DNA or RNA. Typically, these technolo- 
gies determine the sequences of many mil- 
lions of short segments of DNA that need 
to be reassembled and interpreted using 
bioinformatics. 


NHIN Connect A software solution that 
facilitates the exchange of healthcare infor- 
mation at both the local and national level. 
CONNECT leverages eHealth Exchange 
standards and governance and Direct Project 
specifications to help drive interoperability 
across health information exchanges through- 
out the country. Initially developed by federal 
agencies to support specific healthcare-related 
missions, CONNECT is now available to all 
organizations as downloadable open source 
software. 


NHIN Direct A set of standards and services 
to enable the simple, direct, and secure trans- 
port of health information between pairs of 
health care providers; it is a component of the 
Nationwide Health Information Network and 
it complements the Network’s more sophisti- 
cated components. 


NHIN See: National Health Information 
Network. 


Noise The component of acquired data that 
is attributable to factors other than the under- 
lying phenomenon being measured (for exam- 
ple, electromagnetic interference, inaccuracy 
in sensors, or poor contact between sensor 
and source). 


Nomenclature A system of terms used in a 
scientific discipline to denote classifications 
and relationships among objects and pro- 
cesses. 


Nosocomial hospital-acquired infection An 
infection acquired by a patient after admission 
to a hospital for a different reason. 


NQF See: National Quality Forum. 


Nuclear magnetic resonance (NMR) spectros- 
copy A spectral technique used in chemis- 
try to characterize chemical compounds by 
measuring magnetic characteristics of their 
atomic nuclei. 


Nuclear medicine imaging A modality for 
producing images by measuring the radiation 
emitted by a radioactive isotope that has been 
attached to a biologically active compound 
and injected into the body. 


Nursing informatics The application of bio- 
medical informatics methods and techniques 
to problems derived from the field of nursing. 
Viewed as a subarea of clinical informatics. 


NwHIN Direct A set of standards and services 
to enable the simple, direct, and secure trans- 
port of health information between pairs of 
health care providers; it is a component of the 
Nationwide Health Information Network and 
it complements the Network’s more sophisti- 
cated components. 


Nyquist frequency The minimum sampling 
rate necessary to achieve reasonable signal 
quality. In general, it is twice the frequency of 
the highest-frequency component of interest 
in a signal. 
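
The "twice the highest frequency" rule, and what happens when it is violated, can be illustrated with a short sketch. The function names are hypothetical, and the aliasing calculation assumes an idealized pure tone sampled at a uniform rate.

```python
def minimum_sampling_rate(highest_frequency_hz):
    """Sampling rate implied by the rule: twice the highest frequency of interest."""
    return 2.0 * highest_frequency_hz


def apparent_frequency(signal_hz, sampling_rate_hz):
    """Frequency a pure tone appears to have after uniform sampling (aliasing)."""
    # Sampling cannot distinguish frequencies that differ by a multiple of the
    # sampling rate, and folds the remainder about half the sampling rate.
    folded = signal_hz % sampling_rate_hz
    return min(folded, sampling_rate_hz - folded)


print(minimum_sampling_rate(40))    # a 40 Hz component needs at least 80 samples/s
print(apparent_frequency(60, 150))  # adequately sampled: still appears as 60 Hz
print(apparent_frequency(60, 100))  # undersampled: appears as 40 Hz
```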


Object Any part of the perceivable or 
conceivable world. 


Object Constraint Language (OCL) A textual 
language for describing rules that apply to 
the elements of a model created in the Unified 
Modeling Language (UML). OCL specifies 
constraints on allowable values in the model. 
OCL also supports queries of UML models 
(and of models constructed in similar languages). 
OCL is a standard of the Object 
Management Group (OMG), and forms the basis 
of the GELLO query language that may be 
used in conjunction with the Arden Syntax. 


Objectivist approaches Class of evaluation 
approaches that make use of experimental 
designs and statistical analyses of quantita- 
tive data. 


Object-oriented database A database that 
is structured around individual objects 
(concepts) that generally include relation- 
ships among those objects and, in some 
cases, executable code that is relevant to the 
management and/or understanding of that 
object. 


Odds-ratio form An algebraic expression for 
calculating the posttest odds of a disease, or 
other condition of interest, if the pretest odds 
and likelihood ratio are known (an alternative 
formulation of Bayes’ theorem, also called the 
odds-likelihood form). 
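
In this form, posttest odds equal pretest odds multiplied by the likelihood ratio. A minimal sketch follows; the function name and the example values are illustrative assumptions.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Apply the odds-ratio form of Bayes' theorem and return a probability."""
    # Convert probability to odds: p / (1 - p).
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    # Posttest odds = pretest odds x likelihood ratio.
    posttest_odds = pretest_odds * likelihood_ratio
    # Convert odds back to probability: odds / (1 + odds).
    return posttest_odds / (1.0 + posttest_odds)


# Illustrative values: pretest probability 0.20, likelihood ratio 9 for a positive result.
p_post = posttest_probability(0.20, 9.0)
print(round(p_post, 4))
```

A likelihood ratio of 1 leaves the probability unchanged, which is a convenient check on any implementation.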


Office of the National Coordinator for Health 
Information Technology (ONC) An agency 
within the US Department of Health and 
Human Services that is charged with sup- 
porting the adoption of health information 
technology and promoting nationwide health 
information exchange to improve health care. 


Omics A set of areas of study in biology that 
use the suffix “-ome”, used to connote breadth 
or completeness of the objects being studied, 
for example genomics or proteomics. 


-omics technologies High throughput experi- 
mentation that exhaustively queries a certain 
biochemical aspect of the state of an organ- 
ism. Such technologies include proteomics 
(protein), genomics (gene expression), metab- 
olomics (metabolites), etc. 


On line analytic processing (OLAP) A sys- 
tem that focuses on querying across multiple 
patients simultaneously, typically by few users 
for infrequent, but very complex queries, 
often research. 


On line transaction processing (OLTP) A sys- 
tem designed for use by thousands of simulta- 
neous users doing repetitive queries. 


Ontology A description (like a formal speci- 
fication of a program) of the concepts and 
relationships that can exist for an agent or a 
community of agents. In biomedicine, such 
ontologies typically specify the meanings and 
hierarchical relationships among terms and 
concepts in a domain. 


Open access publishing (OA) An approach 
to publishing where the author or research 
funder pays the cost of publication and 
the article is made freely available on the 
Internet. 


Open consent model A legal mechanism by 
which an individual can disclose their own 
private health information or genetic informa- 
tion for research use. This mechanism is used 
by the Personal Genome Project to enable 
release of entire genomes of identified indi- 
viduals. 


Open source An approach to software devel- 
opment in which programmers can read, 
redistribute, and modify the source code for 
a piece of software, resulting in community 
development of a shared product. 


Open standards development policy In standards 
groups, a policy that allows anyone to 
become involved in discussing and defining 
the standard. 


OpenNotes An international movement that 
urges doctors, nurses, therapists, and other 
clinicians to invite patients to read notes that 
clinicians write to describe a visit. OpenNotes 
provides free tools and resources to help clini- 
cians and healthcare systems share notes with 
patients. 


Operating system (OS) A program that allo- 
cates computer hardware resources to user 
programs and that supervises and controls the 
execution of all other programs. 


Optical Character Recognition (OCR) The con- 
version of typed text within scanned docu- 
ments to computer understandable text. 


Optical coherence tomography (OCT) An 
optical signal acquisition and processing 
method. It captures micrometer-resolution, 
three-dimensional images from within optical 
scattering media (e.g., biological tissue). 


Optical disk A round, flat plate of plastic or 
metal that is used to store information. Data 
are encoded through the use of a laser that 
marks the surface of the disk. 


Order entry The use of a computer system for 
entering treatments, requests for lab tests or 
radiologic studies, or other interventions that 
the attending clinician wishes to have per- 
formed for the benefit of a patient. 


Orienting issues/questions The initial ques- 
tions or issues that evaluators seek to answer 
in a subjectivist study, the answers to which 
often in turn prompt further questions. 


Outcome data Formal information regarding 
the results of interventions. 


Outcome measurements Using metrics that 
assess the end result of an intervention rather 
than an intervening process. For example, 
remembering to check a patient’s hemoglobin 
A1c is a process measure, whereas reducing 
the complications of diabetes is an outcome 
measure. 


Outcome variable Similar to “dependent 
variable,” a variable that captures the end 
result of a health care or educational process; 
for example, long-term operative complica- 
tion rate or mastery of a subject area. 


Outpatient A patient seen in a clinic rather 
than in the hospital setting. 


Output The results produced when a process 
is applied to input. Some forms of output are 
hardcopy documents, images displayed on 
video display terminals, and calculated values 
of variables. 


P4 medicine A term coined by 
Dr. Leroy Hood for healthcare that strives 
to be personalized, predictive, preventive and 
participatory. 


Packets In networking, a variable-length 
message containing data plus the network 
addresses of the sending and receiving nodes, 
and other control information. 


Page A partitioned component of a computer 
user’s programs and data that can be 
kept in temporary storage and brought into 
main memory by the operating system as 
needed. 


Pager One of the first mobile devices for elec- 
tronic communication between a base station 
(typically a telephone, but later a computer) 
and an individual person. Initially restricted 
to receiving only numeric data (e.g., a tele- 
phone number), pagers later incorporated the 
ability to transmit a response (referred to as 
“two way pagers”) as well as alpha characters 
so that a message of limited length could be 
transmitted from a small keyboard. Pagers 
have been gradually replaced by cellular 
phones because of their greater flexibility and 
broader geographical coverage. 


PageRank (PR) algorithm In indexing for 
information retrieval on the Internet, an algo- 
rithmic scheme for giving more weight to a 
Web page when a large number of other pages 
link to it. 
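
The idea can be sketched with a small power-iteration routine. This is a simplified illustration, not the production algorithm of any search engine: the four-page link graph, the damping factor of 0.85, and the fixed iteration count are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank scores by repeated redistribution of rank.

    links maps each page to the list of pages it links to; every linked
    page must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: pages A, B, and D all link to C; C links back to A.
ranks = pagerank({"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]})
print(max(ranks, key=ranks.get))
```

Page C ends up with the highest score because three pages link to it, and page A benefits in turn from being linked by the highly ranked C.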


Parallel processing The use of multiple pro- 
cessing units running in parallel to solve a 
single problem (see multiprocessing). 


Parse tree The representation of structural 
relationships that results when using a gram- 
mar (usually context free) to analyze a given 
sentence. 


Partial parsing The analysis of structural rela- 
tionships that results when using a grammar 
to analyze a segment of a given sentence. 


Partial-match searching An approach 
to information retrieval that recognizes 
the inexact nature of both indexing and 
retrieval, and attempts to return the user 
content ranked by how close it comes to the 
user’s query. 


Participant calendaring Participant calendar- 
ing refers to the capability of a CRMS to sup- 
port the tracking of participant compliance with 
a study schema, usually represented as a calen- 
dar of temporal events. 


Participant screening and registration Participant 
screening and registration refers to the 
capability of a CTMS to support the enroll- 
ment phase of a clinical study. 


Participants The people or organizations who 
provide data for the study. According to the 
role of the information resource, these may 
include patients, friends and family, formal 
and informal carers, the general public, health 
professionals, system developers, guideline 
developers, students, health service managers, 
etc. 


Part-of-speech tags Assignment of syntac- 
tic classes to a given sequence of words, e.g., 
determiner, adjective, noun and verb. 


Parts of speech The categories to which words 
in a sentence are assigned in accordance with 
their syntactic function. 


Patent A specific legal approach for protect- 
ing methods used in implementing or instanti- 
ating ideas (see intellectual property). 


Pathognomonic Distinctively characteristic, 
and thus, uniquely identifying a condition or 
object (100% specific). 


Patient centered care Clinical care that is 
based on personal characteristics of the 
patient in addition to his or her disease. Such 
characteristics include cultural traditions, 
preferences and values, family situations and 
lifestyles. 


Patient centered medical home A team-based 
health care delivery model led by a physician, 
physician’s assistant, or nurse practitioner that 
provides comprehensive, coordinated, and 
continuous medical care to patients with the 
goal of obtaining maximized health outcomes. 


Patient engagement Participation of a 
patient as an active collaborator in his or her 
health care process. 


Patient generated health data Health-related 
data that are recorded or collected directly by 
patients. 


Patient portal An online application that 
allows individuals to view health information 
and otherwise interact with their physicians 
and hospitals. 


Patient record The collection of informa- 
tion traditionally kept by a health care pro- 
vider or organization about an individual’s 
health status and health care; also referred 
to as the patient’s chart, medical record, or 
health record, and originally called the “unit 
record”. 


Patient safety The reduction in the risk of 
unnecessary harm associated with health care 
to an acceptable minimum; also the name of a 
movement and specific research area. 


Patient triage The process of allocating 
patients to different levels or urgency of care 
depending upon the complaints or symptoms 
displayed. 


Patient-specific information Information 
derived and organized from a specific patient. 


Patient-tracking applications Monitor pat- 
ient movement in multistep processes. 


Pattern check A procedure applied to entered 
data to verify that the entered data have a 
required pattern; e.g., the three digits, hyphen, 
and four digits of a local telephone number. 
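
The telephone-number example can be expressed as a regular expression. The sketch below uses Python's standard `re` module; the constant and function names are hypothetical.

```python
import re

# Three digits, a hyphen, then four digits, with nothing before or after.
LOCAL_PHONE_PATTERN = re.compile(r"\d{3}-\d{4}")


def passes_pattern_check(value):
    """Return True only if the entire entered value matches the required pattern."""
    return LOCAL_PHONE_PATTERN.fullmatch(value) is not None


print(passes_pattern_check("555-1234"))   # True
print(passes_pattern_check("5551234"))    # False: hyphen missing
print(passes_pattern_check("555-12345"))  # False: too many digits
```

A pattern check verifies only the form of the entry; it cannot tell whether a well-formed number is the correct one.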


Pay for performance Payments to providers 
that are based on meeting pre-defined expec- 
tations for quality. 


Per diem Payments to providers (typically 
hospitals) based on a single day of care. 


Perimeter definition Specification of the 
boundaries of trusted access to an informa- 
tion system, both physically and logically. 


Personal clinical electronic communica- 
tion Web-based messaging solutions that 
avoid the limitations of email by keeping all 
interactions within a secure, online environ- 
ment. 


Personal computers A small, relatively inexpensive, 
single-user computer. 


Personal Digital Assistants (PDA) A small, 
mobile, handheld device that provides com- 
puting and information storage and retrieval 
capabilities for personal or business use. PDAs 
can typically run third-party applications. 


Personal grid architecture A security meth- 
odology that prevents large-scale data loss 
from a central repository by separately storing 
and encrypting each person’s records. While 
searching across records must be sequential, 
reasonable response times can be achieved by 
massive parallelization of the search process 
in the cloud. 


Personal health application Software for 
computers, tablet computers, or smart phones 
that are intended to allow individual patients 
to monitor their own health or to stimulate 
their own personal health activities. 


Personal health informatics The area of bio- 
medical informatics based on patient-centered 
care, in which people are able to access care 
that is coordinated and collaborative. 


Personal health record (PHR) A collection of 
information about an individual’s health status 
and health care that is maintained by the 
individual (rather than by a health care pro- 
vider); the data may be entered directly by the 
patient, captured from a sensing device, or 
transferred from a laboratory or health care 
provider. It may include medical information 
from several independent provider organi- 
zations, and may also have health and well- 
being information. 


Personal Internetworked Notary and Guardian 
(PING) An early personally controlled health 
record, later known as Indivo. 


Personalized medicine Also often called 
individualized medicine, refers to a medical model 
in which decisions are custom-tailored to the 
patient based on that individual’s genomic 
data, preferences, or other considerations. 
Such decisions may involve diagnosis, treatment, 
or assessments of prognosis. Also 
known as precision medicine. 


Personally controlled health record 
(PCHR) Similar to a PHR, the PCHR differs 
in the nature of the control offered to the 
patient, with such features as semantic tags on 
data elements that can be used to determine 
the subsets of information that can be shared 
with specific providers. 


Petabyte A unit of information equal to 1000 
terabytes or 10^15 bytes. 


Pharmacodynamics program (PD) The study 
of how a drug works, its mechanism of action 
and pathway of achieving its effect, or “what 
the drug does to the body”. 


Pharmacogenetics The study of drug-gene rela- 
tionships that are dominated by a single gene. 


Pharmacogenomic variant A particular 
genetic variant that affects a drug-genome 
interaction. 


Pharmacokinetic program Pharmacokinetics 
or PK is the study of how a drug is absorbed, 
distributed, metabolized and excreted by the 
body, or “what the body does to the drug”. 


Pharmacovigilance The pharmacological sci- 
ence relating to the collection, detection, assess- 
ment, monitoring, and prevention of adverse 
effects with pharmaceutical products. 


Phase In the context of clinical research, study 
phases are used to indicate the scientific aim of 
a given clinical trial. There are 4 phases (Phase 
I, Phase II, Phase III, and Phase IV). 


Phase I (clinical trial) Investigators evaluate 
a novel therapy in a small group of partici- 
pants in order to assess overall safety. This 
safety assessment includes dosing levels in 
the case of non-interventional therapeutic 
trials, and potential side effects or adverse 
effects of the therapy. Often, Phase I trials of 
non-interventional therapies involve the use 
of normal volunteers who do not have the 
disease state targeted by the novel therapy. 


Phase II (clinical trial) Investigators evaluate a 
novel therapy in a larger group of participants 
in order to assess the efficacy of the treatment 
in the targeted disease state. During this phase, 
assessment of overall safety is continued. 


Phase III (clinical trial) Investigators evaluate 
a novel therapy in an even larger group of 
participants and compare its performance to 
a reference standard which is usually the cur- 
rent standard of care for the targeted disease 
state. This phase typically employs a randomized 
controlled design, and often a multi-center 
RCT given the number and variation of 
subjects that must be recruited to adequately 
test the hypothesis. In general, this is the final 
study phase to be performed before seeking 
regulatory approval for the novel therapy 
and broader use in standard-of-care environments. 


Phase IV (clinical trial) Investigators study the 
performance and safety of a novel therapy 
after it has been approved and marketed. This 
type of study is performed in order to detect 
long-term outcomes and effects of the ther- 
apy. It is often called “post-market surveil- 
lance” and is, in fact, not an RCT at all, but a 
less formal, observational study. 


PheKB.org A web site that houses EHR-based 
algorithms for determining phenotypes. 


Phenome characterization Identification of 
the individual traits of an organism that char- 
acterize its phenotype. 


Phenome-wide association scan A study that 
derives case and control populations using 
the EMR to define clinical phenotypes and 
then examines the association of those phe- 
notypes with specific genotypes. 


Phenome-wide association study (PheWAS) A 
study that tests for association between a par- 
ticular genetic variant and a large number of 
phenotypic characteristics. 


Phenotype The observable physical charac- 
teristics of an organism, produced by the inter- 
action of genotype with environment. 


Phenotype definition The process of determining 
the set of observable descriptors that 
characterize an organism’s phenotype. 


Phenotype risk score (PheRS) A calculation 
of the likelihood of a particular genetic variant 
being present based on a weighted score of 
one or more phenotypic characteristics. 


Phenotypic Refers to the physical character- 
istics or appearance of an organism. 


Picture Archive and Communication Systems 
(PACS) An integrated computer system that 
acquires, stores, retrieves, and displays digital 
images. 


Pixel One of the small picture elements that 
make up a digital image. The number of pixels 
per square inch determines the spatial resolution. 
Pixels can be associated with a single 
bit to indicate black and white or with mul- 
tiple bits to indicate color or gray scale. 


Placebo In the context of clinical research, 
a placebo is a false intervention (e.g., a mock 
intervention given to a participant that resem- 
bles the intervention experienced by individu- 
als receiving the experimental intervention, 
except that it has no anticipated impact on the 
individual’s health or other indicated status), 
usually used in the context of a control group 
or intervention. 


Plain old telephone service (POTS) The stan- 
dard low speed, analog telephone service 
that is still used by many homes and busi- 
nesses. 


Plastination A method of embalming a part 
of a human body using plastic to suffuse 
human tissue. 


Plug-in A software component that is added 
to web browsers or other programs to allow 
them a special functionality, such as an ability 
to deal with certain kinds of media (e.g., video 
or audio). 


Pointing device A manual device, such as a 
mouse, light pen, or joystick, that can be used 
to specify an area of interest on a computer 
screen. 

Polygenic risk score (PRS) See Genetic risk 
score. 


Population Health is not universally defined 
but is a commonly used term to organize 
activities performed by private or public enti- 
ties for assessing, managing, and improving 
the well-being and health outcomes of a 
defined group of individuals. Population may 
be defined by a specific geographic community 
or region; enrollees of a health plan; persons 
residing in a health system's catchment area; 
or an aggregation of individuals with specific 
conditions. Population health is based on the 
underlying assumption that multiple common 
factors impact the health and well-being of 
specific populations, and that focused inter- 
ventions early in the causal chain of disease 
may save resources, and prevent morbidity 
and mortality. 


Population management Health care prac- 
tices that assist with a large group of people, 
including preventive medicine and immuni- 
zation, screening for disease, and prioritiza- 
tion of interventions based on community 
needs. 


Positive predictive value (PV+) The probabil- 
ity that the condition of interest is true if the 
result is positive—for example, the probability 
that a disease is present given a positive test 
result. 
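
A minimal sketch of how the positive predictive value can be computed from a test's sensitivity and specificity and the prevalence of the condition, using Bayes' theorem. The function name and the numbers in the usage lines are illustrative assumptions, not values from the text.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Return P(condition present | positive result) via Bayes' theorem."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical test: 90% sensitive, 95% specific, condition prevalence of 1%.
ppv = positive_predictive_value(0.90, 0.95, 0.01)
print(f"PV+ = {ppv:.3f}")
```

With a low prevalence, most positive results are false positives even for a fairly accurate test, which is why the predictive value depends on the population being tested.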


Positron emission tomography A tomo- 
graphic imaging method that measures the 
uptake of various metabolic products (generally 
a combination of a positron-emitting 
tracer with a chemical such as glucose), e.g., 
by the functioning brain, heart, or lung. 


Postcoordination The combination of two 
or more terms from one or more terminolo- 
gies to create a phrase used for coding data; 
for example, “Acute Inflammation” and 
“Appendix” combined to code a patient with 
appendicitis. See also, precoordination. 


Posterior probability The updated probability 
that the condition of interest is present after 
additional information has been acquired. 


Postgenomic database A database that 
combines molecular and genetic information 
with data of clinical importance or relevance. 
Online Mendelian Inheritance in Man 
(OMIM) is a frequently cited example of such 
a database. 


Post-test probability The updated probabil- 
ity that the disease or other condition under 
consideration is present after the test result is 
known (more generally, the posterior prob- 
ability). 
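
One common way to obtain a post-test probability is the odds form of Bayes' theorem: convert the pretest probability to odds, multiply by the likelihood ratio of the observed result, and convert back. The sketch below assumes that form; the function name and example values are hypothetical.

```python
def post_test_probability(pretest_probability: float,
                          likelihood_ratio: float) -> float:
    """Update a pretest probability with the likelihood ratio of a test result."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical case: pretest probability 0.20, positive result with LR+ = 9.
p = post_test_probability(0.20, 9.0)
print(f"post-test probability = {p:.3f}")
```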


Practice management system The software 
used by physicians for scheduling, registra- 
tion, billing, and receivables management in 
their offices. May increasingly be linked to an 
EHR. 


Pragmatics The study of how contextual 
information affects the interpretation of the 
underlying meaning of the language. 


Precision The degree of accuracy with which 
the value of a sampled observation matches 
the value of the underlying condition, or the 
exactness with which an operation is per- 
formed. In information retrieval, a measure 
of a system’s performance in retrieving rele- 
vant information (expressed as the fraction of 
relevant records among total records retrieved 
in a search). 
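
For the information retrieval sense of precision, a small sketch under the assumption that retrieved and relevant records are available as sets of identifiers (the identifiers below are made up):

```python
def precision(retrieved: set, relevant: set) -> float:
    """Fraction of retrieved records that are relevant."""
    if not retrieved:
        return 0.0  # convention chosen here for an empty result set
    return len(retrieved & relevant) / len(retrieved)


retrieved_ids = {"d1", "d2", "d3", "d4"}
relevant_ids = {"d2", "d4", "d7"}
# Two of the four retrieved records are relevant.
print(precision(retrieved_ids, relevant_ids))
```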


Precision Medicine The application of spe- 
cific diagnostic and therapeutic methods 
matched to an individual based on highly 
unique information about the individual, such 
as their genetic profile or properties of their 
tumor. 


Precoordination A complex phrase in a ter- 
minology that can be constructed from mul- 
tiple terms but is, itself, assigned a unique 
identifier within the terminology; for example, 
“Acute Inflammation of the Appendix.” See 
also, postcoordination. 


Predatory journal A name given to journals 
that publish under the open-access (OA) model and 
perform little or no peer review of submitted papers. 


Predicate The part of a sentence or clause 
containing a verb and stating something 
about the subject. 


Predicate logic In mathematical logic, the 
generic term for symbolic formal systems like 
first-order logic, second-order logic, etc. 


Predictive value (PV) The posttest probability 
that a condition is present based on the results 
of a test (see positive predictive value and neg- 
ative predictive value). 


Preparatory phase In the preparatory phase 
of a clinical research study, investigators 
are involved in the initial design and docu- 
mentation of a study (developing a protocol 
document), prior to the identification and 
enrollment of study participants. 


President's Emergency Plan for AIDS Relief 
(PEPFAR) The United States government’s 
response to the global HIV/AIDS epidemic, 
which represents the largest commitment by any 
nation to address a single disease in history. 
PEPFAR is intended to save and improve 
millions of lives, accelerating progress toward 
controlling and ultimately ending the AIDS epi- 
demic as a public health threat. PEPFAR col- 
lects and uses data in the most granular manner 
(disaggregated by sex, age, and at the site level) 
to do the right things, in the right places, and 
right now within the highest HIV-burdened 
populations and geographic locations. 


Pretest probability The probability that the 
disease or other condition under consideration 
is present before the test result is known 
(more generally, the prior probability). 


Prevalence The frequency of the condition 
under consideration in the population. For 
example, we calculate the prevalence of dis- 
ease by dividing the number of diseased indi- 
viduals by the number of individuals in the 
population. Prevalence is the prior probability 
of a specific condition (or diagnosis), before 
any other information is available. 


Primary knowledge-based information The 
original source of knowledge, generally in a 
peer reviewed journal article that reports on a 
research project’s results. 


Prior probability The probability that the 
condition of interest is present before addi- 
tional information has been acquired. In a 
population, the prior probability also is called 
the prevalence. 


Privacy A concept that applies to people, 
rather than documents, in which there is a 
presumed right to protect that individual from 
unauthorized divulging of personal data of 
any kind. 


Probabilistic context free grammar A context 
free grammar in which the possible ways to 
expand a given symbol have varying prob- 
abilities rather than equal weight. 
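
As an illustration, the probability of one derivation under such a grammar is the product of the probabilities of the rules it uses. The toy grammar below is invented for this sketch and is not drawn from any real parser.

```python
from math import prod

# Hypothetical toy grammar: each symbol maps to its possible expansions and
# their probabilities, which sum to 1 for every symbol.
GRAMMAR = {
    "NP": {("Det", "N"): 0.7, ("N",): 0.3},
    "Det": {("the",): 0.6, ("a",): 0.4},
    "N": {("patient",): 0.5, ("fever",): 0.5},
}


def derivation_probability(rules):
    """Multiply the probabilities of the rules used in one derivation."""
    return prod(GRAMMAR[lhs][rhs] for lhs, rhs in rules)


# Derivation of "the patient": NP -> Det N, Det -> the, N -> patient
rules_used = [("NP", ("Det", "N")), ("Det", ("the",)), ("N", ("patient",))]
p = derivation_probability(rules_used)
print(f"P(derivation) = {p:.2f}")
```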


Probabilistic relationship Exists when the 
occurrence of one chance event affects the 
probability of the occurrence of another 
chance event. 


Probabilistic sensitivity analysis An approach 
for understanding how the uncertainty in 
all (or a large number of) model param- 
eters affects the conclusion of a decision 
analysis. 


Probability Informally, a means of expressing 
belief in the likelihood of an event. Probability 
is more precisely defined mathematically in 
terms of its essential properties. 


Probabilistic causal network Also known as a 
Bayesian network, a statistical model structured 
as a directed acyclic graph in which nodes 
(variables) are connected through relationships 
(edges). The strength of each of the relationships 
is defined through conditional probabilities. 


Probes Genetic markers used in genetic assays 
to determine the presence or absence of a 
particular variant. 


Problem impact study A study carried out in 
the field with real users as participants and 
real tasks to assess the impact of the informa- 
tion resource on the original problem it was 
designed to resolve. 


Problem space The range of possible solu- 
tions to a problem. 


Problem-based learning Small groups of 
students, supported by a facilitator, learn 
through discussion of individual case scenarios. 


Procedural knowledge Knowledge of how to 
perform a task (as opposed to factual knowl- 
edge about the world). 


Procedure An action or intervention under- 
taken during the management of a patient 
(e.g., starting an IV line, performing surgery). 
Procedures may also be cognitive. 


Procedure trainer (Also Part-task trainer). 
An on-screen simulation of a surgical or other 
procedure that is controlled using physical 
tools such as an endoscope. It allows repeated 
practice of a specific skill. 


Process integration An organizational analy- 
sis methodology in which a series of tasks 
are reviewed in terms of their impact on each 
other rather than being reviewed separately. In 
a hospital setting, for example, a process inte- 
gration view would look at patient registra- 
tion and scheduling as an integrated workflow 
rather than as separate task areas. The goal 
is to achieve greater efficiency and effective- 
ness by focusing on how tasks can better work 
together rather than optimizing specific areas. 


Prodrug A chemical that requires transfor- 
mation in vivo (typically by enzymes) to pro- 
duce its active drug. 


Product An object that goes through the pro- 
cesses of design, manufacture, distribution, 
and sale. 


Prognostic scoring system An approach to 
prediction of patient outcomes based on for- 
mal analysis of current variables, generally 
through methods that compare the patient 
in some way with large numbers of similar 
patients from the past. 


Progressive caution The idea that reason, 
caution and attention to ethical issues must 
govern research and expanding applications 
in the field of biomedical informatics. 


Propositions An expression, generally in lan- 
guage or other symbolic form, that can be 
believed, doubted, or denied or is either true 
or false. 


Prospective study An experiment in which 
researchers, before collecting data for analy- 
sis, define study questions and hypotheses, the 
study population, and data to be collected. 


Prosthesis A device that replaces a body 
part—e.g., artificial hip or heart. 


Protected memory A segment of computer 
memory that cannot be over-written by the 
usual means. 


Protein Data Bank (PDB) A centralized repository 
of experimentally determined three-dimensional 
protein and nucleic acid structures. 


Proteomics The study of the protein products 
produced by genes in the genome. 


Protocol A standardized method or app- 
roach. 


Protocol analysis In cognitive psychology, 
methods for gathering and interpreting data 
that are presumed to reveal the mental pro- 
cesses used during problem solving (e.g., anal- 
ysis of “think-aloud” protocols). 


Protocol authoring tools A software product 
used by researchers to construct a description 
of a study’s rationale, guidelines, endpoints, 
and the like. Such descriptions may be struc- 
tured formally so that they can be manipu- 
lated by trial management software. 


Protocol management Protocol management 
refers to the capability of a CRMS to support 
the preparatory phase of a clinical study. 


Provider-profiling system Software that uti- 
lizes available data sources to report on pat- 
terns of care by one or several providers. 


Pseudo-identifier A unique identifier substituted 
for the real identifier to mask the identity, 
but which can under certain circumstances allow 
linking back to the original person identifier 
if needed. 
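
One way to realize this idea is a linkage table kept by a trusted party: each real identifier is assigned a random token, and only the holder of the table can map a token back. The class and the identifier format below are hypothetical and serve only as a sketch.

```python
import secrets


class Pseudonymizer:
    """Toy pseudonymization service; the linkage table stays with a trusted party."""

    def __init__(self):
        self._to_pseudo = {}
        self._to_real = {}

    def pseudonymize(self, real_id: str) -> str:
        # Reuse the same pseudo-identifier for the same person so that
        # records belonging to that person can still be linked together.
        if real_id not in self._to_pseudo:
            pseudo = secrets.token_hex(8)  # random; reveals nothing about real_id
            self._to_pseudo[real_id] = pseudo
            self._to_real[pseudo] = real_id
        return self._to_pseudo[real_id]

    def reidentify(self, pseudo_id: str) -> str:
        # Linking back is possible only for whoever holds the table.
        return self._to_real[pseudo_id]


service = Pseudonymizer()
token = service.pseudonymize("MRN-001")  # "MRN-001" is a made-up identifier
print(token, "->", service.reidentify(token))
```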


Public health The field that deals with moni- 
toring and influencing trends in habits and 
disease in an effort to protect or enhance the 
health of a population, from small communi- 
ties to entire countries. 


Public health informatics An application area 
of biomedical informatics in which the field’s 
methods and techniques are applied to prob- 
lems drawn from the domain of public health. 


Public health informatics The systematic 
application of informatics methods and tools 
to support public health goals and outcomes, 
regardless of the setting. 


Public Health Surveillance The ongoing sys- 
tematic collection, analysis, and interpreta- 
tion of data (e.g., regarding agent/hazard, risk 
factor, exposure, health event) essential to the 
planning, implementation, and evaluation 
of public health practice, closely integrated 
with the timely dissemination of these data to 
those responsible for prevention and control. 
http://www.aphl.org/Pages/default.aspx. 
Also see Biosurveillance and Surveillance. 


Public Library of Science (PLoS) A family of 
scientific journals that is published under the 
open-access model. 


Publication type One of several classes of 
articles or books into which a new publica- 
tion will fall (e.g., review articles, case reports, 
original research, textbook, etc.). 


Public-key cryptography In data encryption, 
a method whereby two keys are used, 
one to encrypt the information and a second 
to decrypt it. Because two keys are involved, 
only one needs be kept secret. 


Public-private keys A pair of sequences of 
characters or digits used in data encryption 
in which one is kept private and the other is 
made public. A message encrypted with the 
public key can only be opened by the holder 
of the private key, and a message signed with 
the private key can be verified as authentic by 
anyone with the public key. 
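
The two uses described above, encryption and signing, can be sketched with textbook RSA on very small numbers. This is an assumption-laden toy: real systems use much larger keys, padding schemes, and vetted libraries, and the primes and message here are chosen only for illustration.

```python
# Textbook RSA with tiny primes; for illustration only, never for real use.
p, q = 61, 53
n = p * q                  # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent; the public key is (e, n)
d = pow(e, -1, phi)        # private exponent; the private key is (d, n). Needs Python 3.8+

message = 65               # a message encoded as an integer smaller than n

# Encrypt with the public key; only the private-key holder can decrypt.
ciphertext = pow(message, e, n)
decrypted = pow(ciphertext, d, n)

# Sign with the private key; anyone with the public key can verify.
signature = pow(message, d, n)
verified = pow(signature, e, n) == message

print(decrypted == message, verified)
```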


PubMed A software environment for search- 
ing the Medline database, developed as part 
of the suite of search packages, known as 
Entrez, by the NLM’s National Center for 
Biotechnology Information (NCBI). 


PubMed Central (PMC) An effort by the 
National Library of Medicine to gather the 
full-text of scientific articles in a freely accessi- 
ble database, enhancing the value of Medline 
by providing the full articles in addition to 
titles, authors, and abstracts. 


QRS wave In an electrocardiogram (ECG), 
the portion of the wave form that represents 
the time it takes for depolarization of the ven- 
tricles. 


Quality assurance A means for monitoring 
and maintaining the goodness of a service, 
product, or process. 


Quality Data Model An information model 
that describes the relationships between patient 
data and clinical concepts in a standardized 
format. The model was originally proposed 
to enable electronic quality-performance 
measurement and it is now aligned with CDS 
standards. 


Quality management A specific effort to let 
quality of care be the goal that determines 
changes in processes, staffing, or investments. 


Quality measurements Numeric metrics 
that assess the quality of health care ser- 
vices. Examples of quality measures include 
the portion of a physician’s patients who are 
screened for breast cancer and 30-day hospital 
readmission rates. These measurements have 
traditionally been derived from administrative 
claims data or paper charts, but there is 
increasing interest in using clinical data from 
electronic sources. 


Quality-adjusted life year (QALY) A mea- 
sure of the value of a health outcome that 
reflects both longevity and morbidity; it is 
the expected length of life in years, adjusted 
to account for diminished quality of life due 
to physical or mental disability, pain, and 
so on. 
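
A minimal sketch of the arithmetic, assuming each period of life is described by its duration in years and a utility weight between 0 (death) and 1 (full health). The weights below are hypothetical, and discounting of future years is deliberately left out.

```python
def quality_adjusted_life_years(periods):
    """Sum the years lived in each health state, weighted by that state's utility."""
    return sum(years * utility for years, utility in periods)


# Hypothetical outcome: 10 years at utility 0.9, then 5 years at utility 0.6.
qalys = quality_adjusted_life_years([(10, 0.9), (5, 0.6)])
print(f"{qalys:.1f} QALYs over 15 life years")
```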


Quasi-experiments A quasi-experiment is a 
non-randomized, observational study design 
in which conclusions are drawn from the 
evaluation of naturally occurring and non- 
controlled events or cases. 


Query The ability to extract information from 
an EHR based on a set of criteria; e.g., one 
could query for all patients with diabetes who 
have missed their follow-up appointments. 


Query and Reporting Tool Software that sup- 
ports both the planned and ad-hoc extraction 
and aggregation of data sets from multiple 
data forms or equivalent data capture instru- 
ments used within a clinical trials manage- 
ment system. 


Query-response cycle For a database system, 
the process of submitting a single request for 
information and receiving the results. 


Question answering (QA) A computer-based 
process whereby a user submits a natural 
language question that is then automatically 
answered by returning a specific response (as 
opposed to returning documents). 


Question understanding A form of natural 
language understanding that supports com- 
puter-based question answering. 


Radiology The medical field that deals with 
the definition of health conditions through 
the use of visual images that reflect informa- 
tion from within the human body. 


Radiology Information System (RIS) Com- 
puter-based information system that supports 
radiology department operations; includes 
management of the film library, scheduling 
of patient examinations, reporting of results, 
and billing. 


Random-access memory (RAM) The portion 
of a computer’s working memory that can 
be both read and written into. It is used to 
store the results of intermediate computation, 
and the programs and data that are currently 
in use (also called variable memory or core 
memory). 


Randomized clinical trial (RCT) A prospective 
experiment in which subjects are randomly 
assigned to study subgroups to compare the 
effects of alternate treatments. 


Randomly Without bias. 


Range check A procedure applied to entered 
data that detects or prevents entry of values 
that are out of range; e.g., a serum potassium 
level of 50.0 mmol/L—the normal range for 
healthy individuals is 3.5-5.0 mmol/L. 
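
A range check of this kind might be sketched as follows. The plausibility limits are an assumption made for the example; they are set wider than the normal range so that abnormal but physiologically possible values are still accepted.

```python
def range_check(value: float, low: float, high: float) -> bool:
    """Return True when the entered value lies within the accepted limits."""
    return low <= value <= high


# Assumed entry limits for serum potassium, in mmol/L (not a clinical standard).
POTASSIUM_LIMITS = (1.0, 10.0)

for entered in (4.2, 50.0):
    ok = range_check(entered, *POTASSIUM_LIMITS)
    print(entered, "accepted" if ok else "rejected: out of range")
```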


Ransomware Malicious software that blocks 
access to a computer system or its data until a 
sum of money is paid to the perpetrators. 


Read-only memory (ROM) The portion of a 
computer’s working memory that can be read, 
but not written into. 


Really simple syndication (RSS) A form of 
XML that publishes a list of headlines, article 
titles or events encoded in a way that can be 
easily read by another program called a news 
aggregator or news reader. 


Real-time acquisition The continuous measure- 
ment and recording of electronic signals through 
a direct connection with the signal source. 


Real-time feedback This is feedback to the 
learner in response to each action taken by the 
learner. Real time feedback is particularly use- 
ful in the initial steps of learning a topic. As 
the learner becomes more experienced with a 
topic, real time feedback is often withdrawn 
and summative feedback is provided at the 
end of a session. 


Recall In information retrieval, the ability 
of a system to retrieve relevant information 
(expressed as the ratio of relevant records 
retrieved to all relevant records in the data- 
base). 


Receiver In data interchange, the program 
or system that receives a transmitted mes- 
sage. 


Receiver operating characteristic (ROC) A 
graphical plot that depicts the performance of 
a binary classifier system as its discrimination 
threshold is varied. 
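
The sketch below traces such a curve by lowering the threshold through every distinct score and recording the false positive and true positive rates at each step. It assumes that higher scores indicate the positive class and that both classes are present; the scores and labels are made up.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step through every distinct score.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


scores = [0.9, 0.8, 0.6, 0.4, 0.2]   # hypothetical classifier outputs
labels = [1, 1, 0, 1, 0]             # 1 = condition present, 0 = absent
curve = roc_points(scores, labels)
auc = area_under_curve(curve)
print(curve, round(auc, 3))
```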


Records In a data file, a group of data fields 
that collectively represent information about 
a single entity. 


Reductionist approaches An attempt to 
explain phenomena by reducing them to com- 
mon, and often simple, first principles. 


Reductionist biomedical model A model of 
medical care that emphasizes pathophysiology 
and biological principles. The model assumes 
that diseases can be understood purely in 
terms of the component biological processes 
that are altered as a consequence of illness. 


Reference Information Model (RIM) The 
data model for HL7 Version 3.0. The RIM 
describes the kinds of information that may be 
transmitted within health-care organizations, 
and includes acts that may take place (proce- 
dures, observations, interventions, and so on), 
relationships among acts, the manner in which 
health-care personnel, patients, and other 
entities may participate in such acts, and the 
roles that can be assumed by the participants 
(patient, provider, specimen, and so on). 


Reference resolution In NLP, recognizing 
that two mentions in two different textual 
locations refer to the same entity. 


Reference standard See gold standard test. 


Referential expression A sequence of one or 
more words that refers to a particular person, 
object or event, e.g., “she,” “Dr. Jones,” or 
“that procedure”. 


Referral bias In evaluation studies, a bias 
that is introduced when the patients entering 
a study are in some way atypical of the total 
population, generally because they have been 
referred to the study based on criteria that 
reflect some kind of bias by the referring phy- 
sicians. 


Region of interest (ROI) A selected subset of 
pixels within an image identified for a particu- 
lar purpose. 


Regional Extension Centers (RECs) In the 
context of health information technology, 
the 60+ state and local organizations (initially 
funded by ONC) that help primary care 
providers in their designated area adopt and 
use EHRs through outreach, education, and 
technical assistance. 


Regional Health Information Organization 
(RHIO) A community-wide, multi-stakeholder 
organization that utilizes information tech- 
nology to make more complete patient infor- 
mation and decision support available to 
authorized users when and where needed. 


Regional network A network that provides 
regional access from local organizations and 
individuals to the major backbone networks 
that interconnect regions. 


Registers In a computer, a group of elec- 
tronic switches used to store and manipulate 
numbers or text. 


Registry A data system designed to record 
and store information about the health status 
of patients, often including the care that they 
receive. Such collections are typically organized 
to include patients with a specific disease 
or class of diseases. 


Regular expression A mathematical model 
of a set of strings, defined using characters of 
an alphabet and the operators concatenation, 
union and closure (zero or more occurrences 
of an expression). 
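
The three operators can be seen in a short example using Python's standard `re` module. The pattern and test strings are arbitrary choices for illustration.

```python
import re

# Concatenation: "ab" means "a" followed by "b".
# Union:         "ab|c" means either "ab" or "c".
# Closure:       "(...)*" means zero or more occurrences of the group.
pattern = re.compile(r"(ab|c)*d")

for text in ["d", "abd", "cabcd", "abab", "ad"]:
    # fullmatch succeeds only if the whole string belongs to the set.
    print(text, bool(pattern.fullmatch(text)))
```

The first three strings belong to the set defined by the pattern; "abab" lacks the final "d", and "ad" contains an "a" that is not followed by "b".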


Regulated Clinical Research Information 
Management (RCRIM) An HL7 workgroup 
that is developing standards to improve infor- 
mation management for preclinical and clini- 
cal research. 


Relations among named entities The charac- 
terization of two entities in NLP with respect 
to the semantic nature of the relationship 
between them. 


Relative recall An approach to measuring 
recall when it is unrealistic to enumerate all 
the relevant documents in a database. Thus 
the denominator in the calculation of recall is 
redefined to represent the number of relevant 
documents identified by multiple searches on 
the query topic. 
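
A sketch of the calculation, assuming the relevant documents found by each of several searches are available as sets of identifiers; the pooled set stands in for the unknown set of all relevant documents. Names and data are hypothetical.

```python
def relative_recall(retrieved: set, relevant_by_search: list) -> float:
    """Recall measured against the pool of relevant documents found by several searches."""
    pooled_relevant = set().union(*relevant_by_search)
    if not pooled_relevant:
        return 0.0  # convention chosen here when no relevant documents were found
    return len(retrieved & pooled_relevant) / len(pooled_relevant)


# Relevant documents identified by three different searches on the same topic.
relevant_found = [{"d1", "d2"}, {"d2", "d3"}, {"d4"}]
# Documents retrieved by the search being evaluated.
evaluated = {"d1", "d2", "d9"}
print(relative_recall(evaluated, relevant_found))
```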


Relevance judgment In the context of infor- 
mation retrieval, a judgment of which docu- 
ments should be retrieved by which topics in a 
test collection. 


Relevance ranking The degree to which the 
results are relevant to the information need 
specified in a query. 


Reminder message A computer-generated 
warning that is generated when a record meets 
prespecified criteria, often referring to an 
action that is expected but is frequently for- 
gotten; e.g., a message that a patient is due for 
an immunization. 


Remote access Access to a system or to infor- 
mation therein, typically by telephone or 
communications network, by a user who is 
physically removed from the system. 


Remote Intensive Care Use of networked 
communications methods to monitor patients 
in an intensive care unit from a distance far 
removed from the patients themselves. See 
remote monitoring. 


Remote interpretation Evaluating tests (espe- 
cially imaging studies) by having them deliv- 
ered digitally to a location that may be far 
removed from the patient. 


Remote monitoring The use of electronic 
devices to monitor the condition of a patient 
from a distant location. Typically used to refer 
to the ability to record and review patient data 
(such as vital signs) by a physician located in his/ 
her office or a hospital while the patient remains 
at home. See also remote intensive care. 


Remote-presence health care The use of 
video teleconferencing, image transmission, 
and other technologies that allow clinicians to 
evaluate and treat patients in other than face- 
to-face situations. 


Report generation A mechanism by which 
users specify their data requests on the input 
screen of a program that then produces the 
actual query, using information stored in a data- 
base schema, often at predetermined intervals. 


Representation A level of medical data 
encoding, the process by which as much detail 
as possible is coded. 


Representational effect The phenomenon by 
which different representations of a common 
abstract structure can have a significant effect 
on reasoning and decision making. 


Representational state A particular configu- 
ration of an information-bearing structure, 
such as a monitor display, a verbal utterance, 
or a printed label, that plays some functional 
role in a process within the system. 


Representativeness A heuristic by which a 
person judges the chance that a condition 
is true based on the degree of similarity 
between the current situation and the stereotypical 
situation in which the condition is 
true. For example, a physician might estimate 
the probability that a patient has a particular 
disease based on the degree to which the 
patient’s symptoms match the classic disease 
profile. 


Request for Proposals A formal notification 
of a funding opportunity, requiring applica- 
tion through submission of a grant proposal. 


Research protocol In clinical research, a 
prescribed plan for managing subjects that 
describes what actions to take under specific 
conditions. 


Resource Description Framework (RDF) An 
emerging standard for cataloging metadata 
about information resources (such as Web 
pages) using the Extensible Markup Language 
(XML). 


RESTful API A “lightweight” application pro- 
gramming interface that enables the transfer 
of data between two Web-based software sys- 
tems. 


Results reporting A software system or sub- 
system used to allow clinicians to access the 
results of laboratory, radiology, and other 
tests for a patient. 


Retrieval A process by which queries are 
compared against an index to create 
results for the user who specified the query. 


Retrospective chart review The use of past 
data from clinical charts (classically paper 
records) of selected patients in order to per- 
form research regarding a clinical question. 
See also retrospective study. 


Retrospective study A research study per- 
formed by analyzing data that were previously 
gathered for another purpose, such as patient 
care. See also retrospective chart review. 


Return on investment A metric for the ben- 
efits of an investment, equal to the net benefits 
of an investment divided by its cost. 


Review of systems The component of a 
typical history and physical examination in 
which the physician asks general questions 
about each of the body’s major organ sys- 
tems to discover problems that may not have 
been suggested by the patient’s chief com- 
plaint. 


RFP See: Request for Proposals. 


Ribonucleic acid (RNA) Ribonucleic acid, a 
nucleic acid present in all living cells. Its prin- 
cipal role is to act as a messenger carrying 
instructions from DNA in the production of 
proteins. 


Rich text format (RTF) A format developed to 
allow the transfer of graphics and formatted 
text between different applications and oper- 
ating systems. 


RIM See Reference Information Model. 


Risk attitude A person’s willingness to take 
risks. 


Risk-neutral Having the characteristic of 
being indifferent between the expected value 
of a gamble and the gamble itself. 


Role-limited access The mechanism by which 
an individual’s access to information in a 
database, such as a medical record, is limited 
depending upon that user’s job characteristics 
and their need to have access to the informa- 
tion. 
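
A simple way to picture this mechanism is a table that maps each role to the record sections it may see. The roles, section names, and record below are invented for the sketch and do not reflect any particular system.

```python
# Hypothetical roles and record sections, for illustration only.
ROLE_PERMISSIONS = {
    "physician": {"demographics", "medications", "lab_results", "clinical_notes"},
    "billing_clerk": {"demographics", "insurance"},
    "lab_technician": {"demographics", "lab_results"},
}


def can_access(role: str, section: str) -> bool:
    """Allow access only when the user's role includes the requested section."""
    return section in ROLE_PERMISSIONS.get(role, set())


def visible_record(role: str, record: dict) -> dict:
    """Return only the parts of a record that the role is permitted to see."""
    return {section: data for section, data in record.items()
            if can_access(role, section)}


record = {
    "demographics": "name, date of birth",
    "insurance": "payer and plan",
    "clinical_notes": "progress note text",
}
print(visible_record("billing_clerk", record))
```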


Router/switch In networking, a device that 
sits on the network, receives messages, and 
forwards them accordingly to their intended 
destination. 


RS-232 A commonly used standard for serial 
data communication that defines the number 
and type of the wire connections, the volt- 
age, and the characteristics of the signal, 
and thus allows data communication among 
electronic devices produced by different 
manufacturers. 


RSS feed A bibliographic message stream that 
provides content from Internet sources. 


Rule engine A software component that 
implements an inference engine that operates 
on production rules. 


Rule-based system A kind of knowledge- 
based system that performs inference using 
production rules. 
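
Inference over production rules is often done by forward chaining: rules whose conditions are all satisfied fire and add their conclusions, and the cycle repeats until nothing new can be derived. The sketch below assumes that strategy; the rules are invented for illustration and are not clinical guidance.

```python
# Each production rule: if all conditions are known, add the conclusion.
RULES = [
    ({"fever", "cough"}, "possible_respiratory_infection"),
    ({"possible_respiratory_infection", "abnormal_chest_xray"}, "suspect_pneumonia"),
]


def forward_chain(facts: set, rules=RULES) -> set:
    """Fire rules repeatedly until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


result = forward_chain({"fever", "cough", "abnormal_chest_xray"})
print(sorted(result))
```

The second rule can fire only after the first has added its conclusion, which shows how conclusions chain from one rule to the next.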


Sampling rate The rate at which the continu- 
ously varying values of an analog signal are 
measured and recorded. 


Scenario A method of teaching that presents 
a clinical problem in a story format. 


Schema In a database-management system, 
a machine-readable definition of the contents 
and organization of a database. 


Schemata Higher-level kinds of knowledge 
structures. 


SCORM Shareable Content Object Reference 
Model, a standard for interoperability 
between learning content objects. 


Script In software systems, a keystroke-by- 
keystroke record of the actions performed for 
later reuse. 


SDO See: Standards development organiza- 
tions. 


Search A synonym for information retrieval. 
See Information retrieval. 


Search engine A computer system that 
returns content from a search statement 
entered by a user. 


Secondary knowledge-based informa- 
tion Writing that reviews, condenses, and/ 
or synthesizes the primary literature (see pri- 
mary knowledge-based information). 


Secret-key cryptography In data encryption, a 
method whereby the same key is used to encrypt 
and to decrypt information. Thus, the key must 
be kept secret, known to only the sender and 
intended receiver of information. 


Secure Sockets Layer (SSL) A protocol for 
transmitting private documents via the 
Internet. It has been replaced by Transport 
Layer Security. By convention, URLs that 
require an SSL connection start with https: 
instead of http: 


Security The process of protecting informa- 
tion from destruction or misuse, including 
both physical and computer-based mecha- 
nisms. 


Segmentation In image processing, the 
extraction of selected regions of interest from 
an image using automated or manual tech- 
niques. 


Selectivity In data collection and recording, 
the process that accounts for individual styles, 
reflecting an ongoing decision-making pro- 
cess, and often reflecting marked distinctions 
among clinicians. 


Self-experimentation Experiments in which 
experimenters themselves are subjects of their 
research. 


Semantic analysis The study of how symbols 
or signs are used to designate the meaning of 
words and the study of how words combine to 
form or fail to form meaning. 


Semantic class In NLP, a broad class that is 
associated with a specific domain and includes 
many instances. 


Semantic grammar A mathematical model of 
a set of sentences based on patterns of seman- 
tic categories, e.g., patient, doctor, medica- 
tion, treatment, and diagnosis. 


Semantic network A knowledge source in 
the UMLS that provides a consistent cat- 
egorization of all concepts represented in 
the Metathesaurus in which each concept is 
assigned at least one semantic type. 


Semantic patterns The study of the patterns 
formed by the co-occurrence of individual 
words in a phrase or the co-occurrence of the 
associated semantic types of the words. 


Semantic relations A classification of the 
meaning of a linguistic relationship, e.g., 
“treated in 1995” signifies time while “treated 
in ER” signifies location. 


Semantic sense In NLP, the distinction 
between individual word meaning of terms 
that may be in the same semantic class. 


Semantic types The categorization of words 
into semantic classes according to meaning. 
Usually, the classes that are formed are rel- 
evant to specific domains. 


Semantic Web A future view which envi- 
sions the Internet not only as a source of 
content but also as a source of intelligently 
linked, agent-driven, structured collections of 
machine-readable information. 


Semantics The meaning of individual words 
and the meaning of phrases or sentences con- 
sisting of combinations of words. 


Semi-structured interview An interview in which 
the investigator specifies in advance a set of 
topics to address but is flexible about 
the order in which these topics are addressed, 
and is open to discussion of topics not on the 
pre-specified list. 


Sender In data interchange, the program or 
system that sends a transmitted message. 


Sensitivity (of a test) The probability of a 
positive result, given that the condition under 
consideration is present—for example, the 
probability of a positive test result in a person 
who has the disease under consideration (also 
called the true-positive rate). 


Sentence boundary In NLP, distinguishing 
the end of one sentence and the beginning of 
the next. 


Sentiment analysis The use of computational 
methods to identify and classify the opinions, 
attitudes, or emotions expressed in text. 


Sequence alignment An arrangement of two 
or more sequences (usually of DNA or RNA), 
highlighting their similarity. The sequences 
are padded with gaps (usually denoted by 
dashes) so that wherever possible, columns 
contain identical or similar characters from 
the sequences involved. 
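
As a rough sketch of how gap-padded alignments like the one described above can be computed, the following Python function implements Needleman-Wunsch global alignment, a standard dynamic-programming method that the entry itself does not name. The scoring values (match, mismatch, gap) and the example sequences are assumptions chosen for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gap-padded strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


# Hypothetical sequences; the shorter one is padded with a dash
aligned_a, aligned_b = global_align("GATTACA", "GATACA")
```

Printing `aligned_a` above `aligned_b` shows the columns of matching characters, with a dash marking the position where the shorter sequence was padded.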


Sequence database A database that stores the 
nucleotide or amino acid sequences of genes 
(or genetic markers) and proteins respectively. 


Sequence information Information from a 
database that captures the sequence of com- 
ponent elements in a biological structure (e.g., 
the sequence of amino acids in a protein or of 
nucleotides in a DNA segment). 


Sequential Bayes A reasoning method based 
on a naive Bayesian model, where Bayes’ 
rule is applied sequentially for each new 
piece of evidence that is provided to the sys- 
tem. With each application of Bayes’ rule, 
the posterior probability of each diagnostic 
possibility is used as the new prior probabil- 
ity for that diagnosis the next time Bayes’ 
rule is invoked. 
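
The sequential update described above can be sketched in a few lines of Python. The function applies Bayes' rule once per finding and feeds each posterior back in as the next prior. The diagnoses, priors, and likelihoods are invented for illustration, and findings are assumed to be conditionally independent given the diagnosis (the naive Bayes assumption).

```python
def sequential_bayes(priors, likelihoods, findings):
    """Update diagnosis probabilities one finding at a time.

    priors:      {diagnosis: P(diagnosis)}
    likelihoods: {finding: {diagnosis: P(finding | diagnosis)}}
    findings:    observed findings, in the order they arrive
    """
    posterior = dict(priors)
    for finding in findings:
        # Bayes' rule: unnormalized posterior = prior * likelihood
        unnormalized = {d: posterior[d] * likelihoods[finding][d] for d in posterior}
        total = sum(unnormalized.values())
        # The normalized posterior becomes the prior for the next finding
        posterior = {d: p / total for d, p in unnormalized.items()}
    return posterior


# Hypothetical numbers, for illustration only
priors = {"flu": 0.1, "cold": 0.9}
likelihoods = {
    "fever": {"flu": 0.9, "cold": 0.2},
    "aches": {"flu": 0.8, "cold": 0.3},
}
result = sequential_bayes(priors, likelihoods, ["fever", "aches"])
```

With these made-up inputs the probability of "flu" rises from 0.1 to 1/3 after the first finding and to 4/7 after the second.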


Server A computer that shares its resources 
with other computers and supports the activi- 
ties of many users simultaneously within an 
enterprise. 


Service An intangible activity provided to 
consumers, generally at a price, by a (presum- 
ably) qualified individual or system. 


Service oriented architectures (SOA) A software 
design framework that allows specific 
processing or information functions (services) 
to run on an independent computing platform 
and to be called by simple messages from 
another computer application. Often considered 
to be more flexible and efficient than 
more traditional database architectures. The 
best-known example is the Internet, which is 
based largely on SOA design principles. 


Set-based searching Constraining a search to 
include only documents in a given class or set 
(e.g., from a given institution or journal). 


Set-top box A device, such as a cable box, 
that converts video content to analog or digi- 
tal television signals. 


Shallow parsing See partial parsing. 


Shielding In cabling, refers to an outer layer 
of insulation covering an inner layer of con- 
ducting material. Shielded cable is used to 
reduce electronic noise and voltage spikes. 


Short-term/working memory An emergent 
property of interaction with the environment; 
refers to the resources needed to maintain 
information active during cognitive activity. 


Signal processing An area of systems engi- 
neering, electrical engineering and applied 
mathematics that deals with operations on or 
analysis of signals, or measurements of time- 
varying or spatially-varying physical quanti- 
ties. 


Simple Mail Transport Protocol (SMTP) The 
standard protocol used by networked systems, 
including the Internet, for packaging and dis- 
tributing email so that it can be processed by a 
wide variety of software systems. 


Simple Object Access Protocol (SOAP) A pro- 
tocol for information exchange through the 
HTTP/HTTPS or SMTP transport protocol 
using web services and utilizing Extensible 
Markup Language (XML) as the format for 
messages. 


Simulation A system that behaves according 
to a model of a process or another system; for 
example, simulation of a patient’s response to 
therapeutic interventions allows a student to 
learn which techniques are effective without 
risking human life. 


Simulation center Specialized type of learning 
center, though its governance may reside 
in an academic department such as anesthesiology 
or surgery depending on the center’s 
origin and history. 


Simultaneous access Access to shared, com- 
puter-stored information by multiple con- 
current users. 


Simultaneous controls Use of participants in 
a comparative study who are not exposed to 
the information resource. They can be ran- 
domly allocated to access to the information 
resource or in some other way. 


Single nucleotide polymorphism (SNP) A 
DNA sequence variation, occurring when a 
single nucleotide in the genome is altered. For 
example, a SNP might change the nucleotide 
sequence AAGCCTA to AAGCTTA. A vari- 
ation must occur in at least 1% of the popula- 
tion to be considered a SNP. 


Single-photon emission computed tomogra- 
phy A nuclear medicine tomographic imag- 
ing technique using gamma rays. It is very 
similar to conventional nuclear medicine pla- 
nar imaging using a gamma camera. However, 
it is able to provide true 3D information. This 
information is typically presented as 
cross-sectional slices through the patient, but 
can be freely reformatted or manipulated as 
required. 


Single-user systems Computers designed for 
use by single individuals, such as personal 
computers, as opposed to servers or other 
resources that are designed to be shared by 
multiple people at the same time. 


Six sigma A management strategy that seeks 
to improve the quality of work processes by 
identifying and removing the causes of defects 
and minimizing the variability of those processes. 
Statistically, a six sigma process is one 
that is free of defects or errors 99.99966% of 
the time, which equates to a process that fits 
six standard deviations between the mean 
value of the process and the specification limit 
of that process. 
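
The 99.99966% figure corresponds to roughly 3.4 defects per million opportunities. A short Python check is sketched below. It relies on an assumption the entry does not state: the customary allowance for a 1.5 standard deviation drift in the process mean, which leaves 6 - 1.5 = 4.5 standard deviations between the shifted mean and the specification limit.

```python
import math


def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Assumption: the customary 1.5-sigma long-term shift of the process mean,
# which leaves 6 - 1.5 = 4.5 standard deviations to the specification limit.
defect_rate = normal_upper_tail(6 - 1.5)
defects_per_million = defect_rate * 1_000_000
defect_free_percent = (1 - defect_rate) * 100
```

Without the shift, the tail beyond a full six standard deviations would be far smaller, which is why the quoted percentage only matches under this convention.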


Slip A type of medical error that occurs when 
the actor selects the appropriate course of 
action but executes it inappropriately. 


Slots In a frame-based representation, the 
elements that are used to define the semantic 
characteristics of the frame. 


SMART See: Substitutable Medical Appli- 
cations and Reusable Technologies. 


SMART on FHIR An open, standards-based 
platform for medical apps to access patients’ 
data from electronic medical records. SMART 
on FHIR builds on two technology efforts: the 
Substitutable Medical Applications, Reusable 
Technologies (SMART) Platforms Project and 
Fast Healthcare Interoperability Resources (FHIR). 


Smart phones Mobile telephones that typically 
integrate voice calls with access to the 
Internet to enable both access to web sites and 
the ability to download email and applications 
that then reside on the device. 


Smartwatch A type of wearable computer 
in the form of a wristwatch. Typically pro- 
vides health monitoring features, ability to 
run simple third-party apps, and WiFi or 
Bluetooth connectivity, in addition to tell- 
ing time. 


SMS messaging The sending of messages 
using the Short Message Service (SMS), the text 
communication service component of phone, 
web, or mobile communication systems. 


SNOMED Systematized Nomenclature of 
Medicine—A set of standardized medical 
terms that can be processed electronically; 
useful for enhancing the standardized use of 
medical terms in clinical systems. 


SNOMED-CT The result of the merger of an 
earlier version of SNOMED with the Read 
Clinical Terms. 


SNP See Single nucleotide polymorphism. 


Social determinants of health Conditions 
in which people live, learn, work, and play. 
Negative examples include: poverty, poor 
access to healthy foods, substandard educa- 
tion, unsafe neighborhoods. 


Social networking The use of a dedicated 
Web site to communicate informally (on the 
site, by email, or via SMS messages) with 
other members of the site, typically by post- 
ing messages, photographs, etc. 


Sociotechnical systems An approach to the 
study of work in complex settings that empha- 
sizes the interaction between people and tech- 
nology in workplaces. 


Software Computer programs that direct the 
hardware how to carry out specific automated 
processes. 


Software development life cycle (SDLC) or 
software development process A framework 
imposed over software development in order 
to better ensure a repeatable, predictable pro- 
cess that controls cost and improves quality of 
a software product. 


Software oversight committee A group 
within an organization that is constituted to 
oversee computer programs and to assess 
their safety and efficacy in the local setting. 


Software psychology A behavioral approach 
to understanding and furthering software 
design, specifically studying human beings’ 
interactions with systems and software. It is 
the intellectual predecessor to the discipline 
of human-computer interaction. 


Solid state drive (SSD) A data storage device 
using integrated circuit assemblies as memory 
to store data persistently. SSDs have no moving 
mechanical components, which distinguishes 
them from traditional electromechanical magnetic 
disks such as hard disk drives (HDDs) 
or floppy disks, which contain spinning disks 
and movable read/write heads. 


Spamming The process of sending unsolicited 
email to large numbers of unwilling 
recipients, typically to sell a product or make 
a political statement. 


Spatial resolution A measure of the ability 
to distinguish among points that are close to 
each other (indicated in a digital image by the 
number of pixels per square inch). 


Specialist Lexicon One of three UMLS 
Knowledge Sources, this lexicon is intended 
to be a general English lexicon that includes 
many biomedical terms and supports natural 
language processing. 


Specificity (of a test) The probability of a 
negative result, given that the condition under 
consideration is absent—for example, the 
probability of a negative test result in a person 
who does not have a disease under consider- 
ation (also called the true-negative rate). 
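
As a small illustration of this definition and the companion definition of sensitivity, both rates can be computed from the four cells of a 2x2 table. The counts below are hypothetical.

```python
def sensitivity(tp, fn):
    """True-positive rate: P(positive test | disease present)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True-negative rate: P(negative test | disease absent)."""
    return tn / (tn + fp)


# Hypothetical 2x2 table: 100 diseased and 200 non-diseased subjects
tp, fn = 90, 10   # diseased: test positive / test negative
tn, fp = 160, 40  # non-diseased: test negative / test positive
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
```

Here sensitivity is 90/100 and specificity is 160/200; note that neither value depends on how common the disease is in the sample.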


Spectrum bias Systematic error in the esti- 
mate of a study parameter that results when 
the study population includes only selected 
subgroups of the clinically relevant popula- 
tion—for example, the systematic error in the 
estimates of sensitivity and specificity that 
results when test performance is measured in 
a study population consisting of only healthy 
volunteers and patients with advanced dis- 
ease. 


Speech recognition Translation by computer 
of voice input, spoken using a natural vocab- 
ulary and cadence, into appropriate natural 
language text, codes, and commands. 


Spelling check A procedure that checks 
the spelling of individual words in entered 
data. 


Spirometer An instrument for measuring the 
air capacity of the lungs. 


Standard of care The community-accepted 
norm for management of a specified clinical 
problem. 


Standard order sets Predefined lists of steps 
that should be taken to deal with certain 
recurring situations in the care of patients, 
typically in hospitals; e.g., orders to be fol- 
lowed routinely when a patient is in the post- 
surgical recovery room. 


Standard-gamble A technique for utility 
assessment that enables an analyst to determine 
the utility of an outcome by comparing 
an individual’s preference for a chance event 
(a gamble) with a situation of certain 
outcome. 


Standards development organizations 
Organizations charged with developing standards 
that are accepted by the community of 
affected individuals. 


Static In patient simulations, a program that 
presents a predefined case in detail but which 
does not vary in its response depending on the 
actions taken by the learner. 


Stemming The process of converting a word 
to its root form by removing common suffixes 
from the end. 
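
A minimal sketch of suffix stripping follows, assuming a short illustrative suffix list and a minimum stem length. Production stemmers such as the Porter stemmer apply many more rules, so this is only meant to show the idea.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # illustrative list, checked in this order


def naive_stem(word, min_stem_length=3):
    """Strip the first matching suffix, keeping at least min_stem_length letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_length:
            return word[: -len(suffix)]
    return word


stems = [naive_stem(w) for w in ["treated", "treating", "treats", "treat"]]
```

All four example words reduce to the same root, which is what lets a retrieval system match a query on one form against documents containing another.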


Stop words In full-text indexing, a list of words 
that are low in semantic content (e.g., “the”, 
“a”, “an”) and are generally not useful as 
mechanisms for retrieving documents. 


Storage devices A piece of computer equip- 
ment on which information can be stored. 


Store-and-forward A telecommunications 
technique in which information is sent to an 
intermediate station where it is kept and sent 
at a later time to the final destination or to 
another intermediate station. 


Strict product liability The principle that 
states that a product must not be harmful. 


Structural alignment The alignment of two or 
more molecular structures, such as proteins, 
based on their three-dimensional shape rather 
than on their sequences alone. 


Structural informatics The study of methods 
for organizing and managing diverse sources 
of information about the physical organiza- 
tion of the body and other physical structures. 
Often used synonymously with “imaging 
informatics”. 


Structure validation A study carried out to 
help understand the needs for an information 
resource, and demonstrate that its proposed 
structure makes sense to key stakeholders. 


Structured data entry A method of human- 
computer interaction in which users fill in 
missing values by making selections from pre- 
defined menus. The approach discretizes user 
input and makes it possible for a computer 
system to reason directly with the data that 
are provided. 


Structured encounter form A form for col- 
lecting and recording specific information 
during a patient visit. 


Structured interview An interview with a 
schedule of questions that are always pre- 
sented in the same words and in the same 
order. 


Structured Query Language (SQL) A com- 
monly used syntax for retrieving information 
from relational databases. 
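
For a concrete sense of the syntax, the sketch below runs one SQL query through Python's built-in sqlite3 module. The table name, columns, values, and the cutoff of 5.0 are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE lab_result (patient_id INTEGER, test TEXT, value REAL)")
conn.executemany(
    "INSERT INTO lab_result VALUES (?, ?, ?)",
    [(1, "potassium", 4.1), (1, "sodium", 139.0), (2, "potassium", 5.9)],
)
# Retrieve the patients whose potassium exceeds a hypothetical cutoff of 5.0
rows = conn.execute(
    "SELECT patient_id, value FROM lab_result "
    "WHERE test = ? AND value > ? ORDER BY patient_id",
    ("potassium", 5.0),
).fetchall()
conn.close()
```

The SELECT statement names the columns wanted, the table to read, and the conditions rows must meet; the database decides how to find them.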


Structured reports A report where the con- 
tent of the report has coded values for the key 
information in each pre-specified part of the 
report, enabling efficient and reliable compu- 
tation on the report. 


Study arm In the context of clinical research, 
a study arm represents a specific modality of 
an experimental intervention to which a participant 
is assigned, usually through a process 
of randomization (e.g., randomly assigned in a 
balanced manner to such an arm). Arms are 
used in clinical study designs where multiple 
variants of a given experimental intervention 
are under study, for example, varying the tim- 
ing or dose of a given medication between 
arms to determine an optimal therapeutic 
strategy. 


Study population The population of subjects 
(usually a subset of the clinically relevant 
population) in whom experimental 
outcomes (for example, the performance of a 
diagnostic test) are measured. 


Subheadings In MeSH, qualifiers of subject 
headings that narrow the focus of a term. 


Subjectivist approaches Class of approaches 
to evaluation that rely primarily on qualitative 
data derived from observation, interview, and 
analysis of documents and other artifacts. 
Studies under this rubric focus on description 
and explanation; they tend to evolve rather 
than be prescribed in advance. 


Sublanguage Language of a specialized 
domain, such as medicine, biology, or law. 


Substitutable Medical Applications and 
Reusable Technologies (SMART) A technical 
platform that enables EHR systems to behave as 
“iPhone-like platforms” through an application 
programming interface (API) and a set 
of core services that support easy addition 
and deletion of third-party apps, such that 
the core system is stable and the apps are 
substitutable. 


Summarization A computer system that 
attempts to automatically summarize a larger 
body of content. 


Summary ROC curve A composite ROC curve 
developed by using estimates from many 
studies. 


Summative evaluation Evaluation carried out 
after the product is in use; it is valuable both 
to justify the completed project and to learn 
from one’s mistakes. 


Supervised learning An approach to machine 
learning in which an algorithm uses a set of 
inputs and corresponding outputs to try to 
learn a model that will enable prediction of an 
output when faced with a previously unseen 
input. 


Supervised learning technique A method 
for determining how data values may suggest 
classifications, where the possible classifications 
are enumerated in advance, and the 
performance of a system is enhanced by evaluating 
how well the system classifies a training 
set of data. Statistical regression, neural networks, 
and support vector machines are forms 
of supervised learning. 


Supervised machine learning A machine 
learning approach that uses a gold standard 
set as input to learn classifiers. 


Surveillance The ongoing collection, analy- 
sis, interpretation, and dissemination of 
data on health conditions (e.g., breast can- 
cer) and threats to health (e.g., smoking 
prevalence). In a computer-based medical 
record system, systematic review of patients’ 
clinical data to detect and flag conditions 
that merit attention. Also see public health 
surveillance and biosurveillance. 


Symbolic-programming language A pro- 
gramming language in which the program 
can treat itself, or material like itself, as data. 
Such programs can write programs (not just 
as character strings or texts, but as the actual 
data structures that the program is made of). 
The best known and most influential of these 
languages is LISP. 


Syndromic surveillance A particular type of 
public health surveillance. It is an ongoing 
process of monitoring clinical data, generally 
from public health, hospital, or outpatient 
resources, or surrogate data indicating early 
illness (e.g., school or work absenteeism) with 
a goal of early identification of outbreaks, 
new conditions, health threats, or bioterrorist 
events. 


Synonyms Multiple ways of expressing the 
same concept. 


Syntax The grammatical structure of lan- 
guage describing the relations among words 
in a sentence. 


System programs The operating system, com- 
pilers, and other software that are included 


with a computer system and that allow users 
to operate the hardware. 


Systematic review A type of journal article 
that reviews the literature related to a specific 
clinical question, analyzing the data in accor- 
dance with formal methods to assure that 
data are suitably compared and pooled. 


Systems biology Research on biological 
networks or biochemical pathways. Often, 
systems biology analyses take a comprehen- 
sive approach to model biological function 
by taking the interactions (physical, regula- 
tory, similarity, etc.) of a set of genes as a 
whole. 


Tablet Generally refers to a personal com- 
puting device that resembles a paper tablet in 
size and incorporates features such as a touch 
screen to facilitate data entry. 


Tactile feedback In virtual or telepres- 
ence environments, the process of providing 
(through technology) a sensation of touch- 
ing an object that is imaginary or otherwise 
beyond the user’s reach (see also haptic feed- 
back). 


TCP/IP Transmission Control Protocol/Internet 
Protocol, a set of standard communications 
protocols used for the Internet and 
for networks within organizations as well. 


Teleconsultation The use of telemedicine 
techniques to support the interaction between 
two (or more) clinicians where one is provid- 
ing advice to the other, typically about a spe- 
cific patient’s care. 


Telegraphic In NLP, describes language that 
does not follow the usual rules of grammar 
but is compact and efficient. Clinical notes 
written by hand often demonstrate a “tele- 
graphic style”. 


Telehealth The use of electronic information 
and telecommunications technologies to sup- 
port long-distance clinical health care, patient 
and professional health-related education, 
public health and health administration. See 
telemedicine. 


Telehome care The use of communications 
and information technology to deliver health 
services and to exchange health information 
to and from the home (or community) when 
distance separates the participants. 


Tele-ICU See remote intensive care. 


Telemedicine A broad term used to describe 
the delivery of health care at a distance, 
increasingly but not exclusively by means of 
the Internet. 


Teleophthalmology The use of telemedicine 
methods to deliver ophthalmology services. 


Telepresence A technique of telemedicine 
in which a viewer can be physically removed 
from an actual surgery, viewing the abnormal- 
ity through a video monitor that displays the 
operative field and allows the observer to par- 
ticipate in the procedure. 


Telepsychiatry The use of telemedicine meth- 
ods to deliver psychiatric services. 


Teleradiology The provision of remote 
interpretations of radiological images, an 
increasingly common mode of delivery of 
radiology services. 


Telesurgery The use of advanced telemedi- 
cine methods to allow a doctor to perform 
surgery on a patient even though he or she is 
not physically in the operating room. 


Temporal resolution A metric for how well 
an imaging modality can distinguish points in 
time that are very close together. 


Terabyte A unit of information equal to one 
million million (10^12) or, strictly, 2^40 bytes. 


Term A word or phrase. 


Term Designation of a defined concept by a 
linguistic expression in a special language. In 
information retrieval, a word or phrase which 
forms part of the basis for a search request. 


Term frequency (TF) In information retrieval, 
a measurement of how frequently a term 
occurs in a document. 


Term weighting The assignment of metrics 
to terms so as to help specify their utility in 
retrieving documents well matched to a query. 


Terminal A simple device that has no process- 
ing capability of its own but allows a user to 
access a server. 


Terminology A set of terms representing the 
system of concepts of a particular subject 
field. 


Terminology authority An entity or mecha- 
nism that determines the acceptable term to 
use for a specific entity, descriptor, or other 
concept. 


Terminology services Software methods, typ- 
ically based on computer-based dictionaries 
or language systems, that allow other systems 
to determine the locally acceptable term to 
use for a given purpose. 


Test collection In the context of information 
retrieval, a collection of real-world content, a 
sampling of user queries, and relevance judg- 
ments that allow system-based evaluation of 
search systems. 


Test-interpretation bias Systematic error in 
the estimates of sensitivity and specificity that 
results when the index and gold-standard test 
are not interpreted independently. 


Test-referral bias Systematic error in the esti- 
mates of sensitivity and specificity that results 
when subjects with a positive index test are 
more likely to receive the gold-standard test. 


Tethered personal health record An EHR 
portal that is provided to patients by an insti- 
tution and can typically be used to manage 
information only from that provider 
organization. 


Text generation Methods that create coher- 
ent natural language text from structured data 
or from textual documents in order to satisfy 
a communication goal. 


Text mining The use of large text collec- 
tions (e.g., medical histories, consultation 
reports, articles from the literature, web-based 
resources) and natural language processing 
to allow inferences to be drawn, often in the 
form of associations or knowledge that were 
not previously apparent. See also data mining. 


Text processing The analysis of text by com- 
puter. 


Text readability assessment and simplifica- 
tion An application of NLP in which compu- 
tational methods are used to assess the clarity 
of writing for a certain audience or to revise 
the exposition using simpler terminology and 
sentence construction. 


Text REtrieval Conference (TREC) Organized 
by NIST, an annual conference on text 
retrieval that has provided a testbed for 
evaluation and a forum for presentation of results 
(see trec.nist.gov). 


Text summarization Takes one or several doc- 
uments as input and produces a single, coher- 
ent text that synthesizes the main points of 
the input documents. 


Text-comprehension A process in which text 
can be described at multiple levels of realiza- 
tion from surface codes (e.g., words and syn- 
tax) to deeper level of semantics. 


TF*IDF weighting A specific approach to term 
weighting which combines the inverse docu- 
ment frequency (IDF) and term frequency 
(TF). 
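
One common way to combine the two quantities is to multiply them. The sketch below assumes raw term counts for TF and log(N / document frequency) for IDF; several other variants are in use, and the example documents are invented.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    TF is the raw count of the term in the document; IDF is
    log(N / document frequency). Both choices are assumptions,
    since several variants are in common use.
    """
    n_docs = len(documents)
    tokenized = [doc.lower().split() for doc in documents]
    # Document frequency: number of documents containing each term
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf})
    return weights


docs = ["chest pain and fever", "chest pain", "fever and cough and chills"]
weights = tf_idf(docs)
```

In the third document, "cough" appears once but in no other document, so it outweighs "and", which appears twice there but is also found elsewhere in the collection.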


Thesaurus A set of subject headings or 
descriptors, usually with a cross-reference sys- 
tem for use in the organization of a collection 
of documents for reference and retrieval. 


Thick-client A computer node in a network 
or client-server architecture that provides 
rich functionality independent of the central 
server. See also thin client. 


Thin client A program on a local computer 
system that mostly provides connectivity to 
a larger resource over a computer network, 
thereby providing access to computational 
power that is not provided by the user’s 
local machine. 


Think-aloud protocol In cognitive science, 
the generation of a description of what a per- 
son is thinking or considering as they solve a 
problem. 


Thread The smallest sequence of pro- 
grammed instructions that can be managed 
independently by an operating system sched- 
uler. 


Three-dimensional printing Construction of 
a physical model of anatomy or other object 
by laying down plastic versions of a stack of 
cross-sectional slices through the object. 


Three-dimensional structure information In 
a biological database, information regarding 
the three-dimensional relationships among 
elements in a molecular structure. 


Time-sharing networks An historical term 
describing some of the earliest computer net- 
works allowing remote access to systems. 


Time-trade-off A common approach to utility 
assessment, comparing a better state of 
health lasting a shorter time, with a lesser state 
of health lasting a longer time. The time-trade-off 
technique provides a convenient method 
for valuing outcomes that accounts for gains 
(or losses) in both length and quality of life. 


Tokenization The process of breaking an 
unstructured sequence of characters into 
larger units called “tokens,” e.g., words, numbers, 
dates, and punctuation. 


Tokens In language processing, the composite 
entities constructed from individual characters, 
typically words, numbers, dates, or 
punctuation. 
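
A minimal regular-expression tokenizer illustrates how characters are grouped into these units. The pattern is an assumption made for this example; it handles only simple slash-separated dates, plain numbers, and unaccented words, and real clinical text needs considerably more care.

```python
import re

# Illustrative pattern: dates like 12/05/2019, numbers (with optional decimals),
# words (with internal hyphens or apostrophes), then any single punctuation mark.
TOKEN_PATTERN = re.compile(
    r"\d{1,2}/\d{1,2}/\d{2,4}"          # dates
    r"|\d+(?:\.\d+)?"                   # numbers
    r"|[A-Za-z]+(?:[-'][A-Za-z]+)*"     # words
    r"|[^\w\s]"                         # punctuation
)


def tokenize(text):
    """Split text into word, number, date, and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


tokens = tokenize("Pt seen 12/05/2019, temp 38.5 C.")
```

The date alternative is listed first so that "12/05/2019" is kept as one token rather than being split into three numbers and two slashes.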


Top-down In search or analysis, the breaking 
down of a system to gain insight into its com- 
positional subsystems. 


Topology In networking, the overall connec- 
tivity of the nodes in a network. 


Touch screen A display screen that allows 
users to select items by touching them on the 
screen. 


Track pad A computer input device for con- 
trolling the pointer on a display screen by slid- 
ing the finger along a touch-sensitive surface: 
used chiefly in laptop computers. Also called 
a touchpad. 


Transaction set In data transfer, the full set of 
information exchanged between a sender and 
a receiver. 


Transcription The conversion of a recording 
of dictated notes into electronic text by a typ- 
ist. 


Transcriptomics The study of the set of RNA 
transcripts that are produced by the genome 
and the context (specific cells or circum- 
stances) in which transcription occurs. 


Transition matrix A table of numbers giving 
the probability of moving from one state in 
a Markov model into another state or the 
state that is reached in a finite-state machine 
depending on the current character of the 
alphabet. 


Transition probability The probability that a 
person will transit from one health state to 
another during a specified time period. 
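
The two preceding entries can be illustrated together. In the sketch below, each row of a transition matrix holds the transition probabilities out of one health state, and one multiplication advances a cohort by one cycle. The three states and all probabilities are hypothetical values chosen for illustration.

```python
def step(distribution, transition_matrix):
    """Advance a state distribution by one cycle of a Markov model.

    transition_matrix[i][j] is the probability of moving from state i
    to state j during one cycle; each row must sum to 1.
    """
    n = len(transition_matrix)
    for row in transition_matrix:
        if len(row) != n or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row must have one entry per state and sum to 1")
    return [
        sum(distribution[i] * transition_matrix[i][j] for i in range(n))
        for j in range(n)
    ]


# Hypothetical three-state model: well, sick, dead (one cycle = one year).
STATES = ["well", "sick", "dead"]
MATRIX = [
    [0.80, 0.15, 0.05],  # from well
    [0.00, 0.70, 0.30],  # from sick
    [0.00, 0.00, 1.00],  # dead is an absorbing state
]

cohort = [1.0, 0.0, 0.0]  # everyone starts in the well state
for year in (1, 2):
    cohort = step(cohort, MATRIX)
    print(year, dict(zip(STATES, (round(p, 4) for p in cohort))))
```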


Translational Bioinformatics (TBI) According 
to the AMIA: the development of storage, 
analytic, and interpretive methods to optimize 
the transformation of increasingly voluminous 
biomedical data, and genomic data, 
into proactive, predictive, preventive, and 
participatory health. 


Translational medicine The process of 
transferring scientific discoveries into 
preventive practice and clinical care. 


Transmission control protocol/internet proto- 
col (TCP/IP) The standard protocols used for 
data transmission on the Internet and other 
common local and wide-area networks. 


Transport Layer Security (TLS) A protocol 
that ensures the privacy of data transmit- 
ted over the Internet. It grew out of Secure 
Sockets Layer. 


Treatment threshold probability The prob- 
ability of disease at which the expected val- 
ues of withholding or giving treatment are 
equal. Above the threshold treatment is rec- 
ommended; below the threshold, treatment is 
not recommended and further testing may be 
warranted. 
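
The threshold follows from setting the two expected values equal. If B is the net benefit of treating a patient who has the disease and H is the net harm of treating a patient who does not, the threshold is H / (H + B). The sketch below assumes outcomes are expressed as utilities; the function name and the utility values are hypothetical.

```python
def treatment_threshold(u_treat_disease, u_treat_no_disease,
                        u_no_treat_disease, u_no_treat_no_disease):
    """Probability of disease at which treating and withholding are equal.

    Derived from p*U(T,D) + (1-p)*U(T,noD) = p*U(noT,D) + (1-p)*U(noT,noD).
    """
    benefit = u_treat_disease - u_no_treat_disease    # gain from treating the diseased
    harm = u_no_treat_no_disease - u_treat_no_disease  # loss from treating the healthy
    if benefit <= 0 or harm < 0:
        raise ValueError("expected a positive benefit and a non-negative harm")
    return harm / (harm + benefit)


# Hypothetical utilities for the four outcomes.
p_star = treatment_threshold(
    u_treat_disease=0.90,
    u_treat_no_disease=0.95,
    u_no_treat_disease=0.50,
    u_no_treat_no_disease=1.00,
)
print(round(p_star, 3))
```

With these assumed values the benefit of treatment is large relative to its harm, so the threshold is low and treatment is favored even at a modest probability of disease.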


Trigger event In monitoring, an event that 
causes a set of transactions to be generated. 


True negative In assessing a situation, an 
instance that is classified negatively and is 
subsequently shown to have been correctly 
classified. 


True positive In assessing a situation, an 
instance that is classified positively and is 
subsequently shown to have been correctly 
classified. 


True-negative rate (TNR) The probability of a 
negative result, given that the condition under 
consideration is false—for example, the probability 
of a negative test result in a patient who 
does not have the disease under consideration 
(also called specificity). 


True-negative result (TN) A negative result 
when the condition under consideration is 
false—for example, a negative test result in a 
patient who does not have the disease under 
consideration. 


True-positive rate (TPR) The probability of a 
positive result, given that the condition under 
consideration is true—for example, the prob- 
ability of a positive test result in a patient 
who has the disease under consideration (also 
called sensitivity). 


True-positive result (TP) A positive result 
when the condition under consideration is 
true—for example, a positive test result in a 
patient who has the disease under consider- 
ation. 
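
The four preceding entries are related by simple ratios of counts. The sketch below computes the true-positive rate (sensitivity) and true-negative rate (specificity) from the cells of a 2 × 2 table; the counts are hypothetical, and FN and FP denote false-negative and false-positive results.

```python
def true_positive_rate(tp, fn):
    """Sensitivity: positive results among those who have the condition."""
    return tp / (tp + fn)


def true_negative_rate(tn, fp):
    """Specificity: negative results among those who do not have the condition."""
    return tn / (tn + fp)


# Hypothetical study: 100 patients with the disease, 200 without.
tp, fn = 90, 10    # diseased patients: test positive / test negative
tn, fp = 160, 40   # non-diseased patients: test negative / test positive

print("TPR (sensitivity):", true_positive_rate(tp, fn))
print("TNR (specificity):", true_negative_rate(tn, fp))
```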


Turn-around-time The period for completing 
a process cycle, commonly expressed as an 
average of previous such periods. 


Tutoring A computer program designed to 
provide self-directed education to a student or 
trainee. 


Tutoring system A computer program 
designed to provide self-directed education to 
a student or trainee. (Also Intelligent Tutoring 
System). 


Twisted-pair wires The typical copper wiring 
used for routine telephone service but adapt- 
able for newer communication technologies. 


Type-checking In computer programming, 
the act of checking that the types of values, 
such as integers, decimal numbers, and strings 
of characters, match throughout their use. 


Typology A way of classifying things to make 
sense of them, for a certain purpose. 


Ubiquitous computing A form of computing 
and human-computer interaction that seeks 
to embed computing power invisibly in all 
facets of life. 


Ultrasound A common energy source derived 
from high-frequency sound waves. 


UMLS See: Unified Medical Language 
System. 


UMLS Knowledge Sources Components of 
the Unified Medical Language System that 
support its use and semantic breadth. 


UMLS Semantic Network A knowledge source 
in the UMLS that provides a consistent cat- 
egorization of all concepts represented in the 
Metathesaurus. Each Metathesaurus concept 
is assigned at least one semantic type from the 
Semantic Network. 


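Unicode A character-encoding standard that 
represents the characters of the world's writing 
systems. It was originally designed around 
16-bit codes; current versions also define 
characters beyond the 16-bit range. 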


Unified Medical Language System (UMLS) 
Project A terminology system, developed 
under the direction of the National Library 
of Medicine, to produce a common structure 
that ties together the various vocabularies that 
have been created for biomedical domains. 


Unified Modeling Language (UML) A stan- 
dardized general-purpose modeling language 
developed for object-oriented software engi- 
neering that provides a set of graphic notation 
techniques to create visual models that depict 
the relationships between actors and activities 
in the program or process being modeled. 


Uniform resource identifier (URI) The combi- 
nation of a URN and URL, intended to pro- 
vide persistent access to digital objects. 


Uniform resource locator (URL) The address 
of an information resource on the World 
Wide Web. 


Uniform resource name (URN) A name for 
a Web page, intended to be more persistent 
than a URL, which often changes over time as 
domains evolve or Web sites are reorganized. 


Unique health identifier (UHI) A government- 
provided number that is assigned to an indi- 
vidual for purposes of keeping track of their 
health information. 


Universal Serial Bus (USB) A connection technology 
for attaching peripheral devices to a 
computer, providing fast data exchange. 


Unobtrusive measures Measures made using 
the records accrued as part of the routine use 
of the information resource, including, for 
example, user log files. 


Unstructured interview An interview where 
there are no predetermined questions. 


Unsupervised machine learning A machine 
learning approach that learns patterns from 
the data without labeled training sets. 


URAC An organization that accredits the 
quality of information from various sources, 
including health-related Web sites. 


Usability The quality of being able to provide 
good service to one who wishes to use a prod- 
uct. 


Usability testing A class of methods for col- 
lecting empirical data of representative users 
performing representative tasks; considered 
the gold standard in usability evaluation 
methods. 


User authentication The process of identify- 
ing a user of an information resource, and 
verifying that the user is allowed to access 
the services of that resource. A standard user 
authentication method is to collect and verify 
a username and password. 


User-centered design An iterative process in 
which designers focus on the users and their 
needs in each phase of the design process. 
UCD calls for involving users throughout the 
design process via a variety of research and 
design techniques to increase the likelihood 
that the product will be highly usable by its 
intended users. 


User-interface layer The architectural layer 
of a software environment that handles the 
interface with users. 


Utility In decision making, a number that 
represents the value of a specific outcome to 
a decision maker (see, for example, quality- 
adjusted life-years). 


Validity check A set of procedures applied to 
data entered into an EHR intended to detect 
or prevent the entry of erroneous data; e.g., 
range checks and pattern checks. 
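
A minimal sketch of the two kinds of check named above follows. The limits, the date format, and the function names are hypothetical and chosen only for illustration; they are not taken from any particular EHR.

```python
import re

# Hypothetical limits and formats, for illustration only.
HEART_RATE_RANGE = (20, 300)                      # beats per minute
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")   # e.g., 2020-03-14


def range_check(value, low, high):
    """Return True if the value lies within the permitted range."""
    return low <= value <= high


def pattern_check(text, pattern):
    """Return True if the whole string matches the expected pattern."""
    return pattern.fullmatch(text) is not None


def validate_entry(heart_rate, visit_date):
    """Collect error messages for one data-entry record."""
    errors = []
    if not range_check(heart_rate, *HEART_RATE_RANGE):
        errors.append("heart rate out of range")
    if not pattern_check(visit_date, DATE_PATTERN):
        errors.append("visit date not in YYYY-MM-DD form")
    return errors


print(validate_entry(72, "2020-03-14"))
print(validate_entry(720, "3/14/2020"))
```

Note that the pattern check only tests the shape of the string; it does not confirm that the date exists on the calendar.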


Value-based reimbursement In health care, 
an alternative to traditional fee-for-service 
reimbursement, aimed at rewarding quality 
rather than quantity of services. 


Variable Quantity measured in a study. 
Variables can be measured at the nominal, 
ordinal, interval, or ratio levels. 


Vector mathematics In the context of infor- 
mation retrieval, mathematical systems for 
measuring and comparing vector representa- 
tions of documents and their contents. 


Vector-space model A method of full-text 
indexing in which documents can be conceptualized 
as vectors of terms, with retrieval 
based on the cosine of the angle between 
the query and document vectors (cosine similarity). 
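
The following is a minimal sketch of retrieval by cosine similarity. It assumes raw term counts as vector components and whitespace tokenization; practical systems typically apply term weighting as well. The documents and query are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a text as a vector of term counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents and query.
documents = [
    "acute chest pain in adults",
    "knee pain after surgery",
]
query = term_vector("chest pain")

# Rank documents by decreasing similarity to the query.
ranked = sorted(
    documents,
    key=lambda d: cosine_similarity(query, term_vector(d)),
    reverse=True,
)
print(ranked)
```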


Vendor-neutral archives (VNA) A technology 
in which images (and potentially any file 
of clinical relevance) are stored (archived) in a 
standard format with a standard interface (e.g., 
DICOM), such that they can be accessed in a 
vendor-neutral manner by other systems. 


Vertically integrated Refers to an organiza- 
tional structure in which a variety of products 
or services are offered within a single chain of 
command; contrasted with horizontal inte- 
gration in which a single type of product is 
offered in different geographical markets. A 
hospital that offers a variety of services from 
obstetrics to geriatrics would be “vertically 
integrated.” A diagnostic imaging organization 
with multiple sites would be “horizontally 
integrated”. 


Veterinary informatics The application of 
biomedical informatics methods and tech- 
niques to problems derived from the field of 
veterinary medicine. Viewed as a subarea of 
clinical informatics. 


Video-display terminal (VDT) A device for 
displaying input signals as characters on a 
screen, typically a computer monitor. 


View In a database-management system, a 
logical submodel of the contents and struc- 
ture of a database used to support one or a 
subset of applications. 


View schemas An application-specific 
description of a view that supports that pro- 
gram’s activities with respect to some general 
database for which there are multiple views. 


Virtual address A technique in memory man- 
agement such that each address referenced by 
the CPU goes through an address mapping 
from the virtual address of the program to a 
physical address in main memory. 


Virtual medical record A standard model 
of the data elements found in EHR systems. 
The virtual medical record approach assumes 
that, even if particular EHR implementations 
adopt nonstandard data dictionaries and dis- 
parate ways for storing clinical data, map- 
ping the contents of each EHR to a canonical 
model greatly simplifies interoperability with 
CDS Systems and other applications that may 
need to access the data. 


Virtual memory A scheme by which users can 
access information stored in auxiliary mem- 
ory as though it were in main memory. Virtual 
memory addresses are automatically trans- 
lated into actual addresses by the hardware. 


Virtual patient A digital representation of 
a patient encounter that can range from a 
simple review of clinical findings to a realis- 
tic graphical view of a person who can con- 
verse and can be examined for various clinical 
symptoms and laboratory tests. 


Virtual Private Network (VPN) A private com- 
munications network, usually used within a 
company or organization, or by several dif- 
ferent companies or organizations, commu- 
nicating over a public network. VPN message 
traffic is carried on public networking infra- 
structure (e.g., the Internet) using standard 
(often insecure) protocols. 


Virtual reality A collection of interface meth- 
ods that simulate reality more closely than 
does the standard display monitor, gener- 
ally with a response to user maneuvers that 
heighten the sense of being connected to the 
simulation. 


Virtual world A three-dimensional represen- 
tation of an environment such as a hospital, a 
clinic or a home-care location. The represented 
space usually includes a virtual patient, and 
interactive equipment and supplies that can 
be used to examine and care for the patient. 
Some virtual worlds are multi-user and allow 
multiple learners to manifest themselves as 
characters in the virtual world for interaction 
with each other and the patient. 


Virus/worm A software program that is writ- 
ten for malicious purposes to spread from one 
machine to another and to do some kind of 
damage. Such programs are generally self- 
replicating, which has led to the comparison 
with biological viruses. 


Visual-analog scale A method for valuing 
health outcomes, wherein a person simply 
rates the quality of life with a health outcome 
on a scale from 0 to 100. 


Vocabulary A dictionary containing the ter- 
minology of a subject field. 


Volatile A characteristic of a computer’s 
memory, in that contents are changed when 
the next program runs and are not retained 
when power is turned off. 


Volume rendering A method whereby a computer 
program projects a two-dimensional 
image directly from a three-dimensional 
voxel array by casting rays from the eye of 
the observer through the volume array to the 
image plane. 


von Neumann machine A computer architecture 
that comprises a single processing unit, 
computer memory, and a memory bus. 


Voxel A volume element, or small cubic area 
of a three-dimensional digital image (see 
pixel). 


Washington DC Principles for Free Access to 
Science An organization of non-profit pub- 
lishers that aims to balance wide access with 
the need to maintain sustainable revenue 
models. 


Wearables In the context of mobile health, 
wearables refer to a range of electronic devices 
that can be incorporated into clothing or worn 
on the body, such as smartwatches, activity 
trackers, and physiological sensors, that are 
used to collect health-related data and provide 
health interventions. Also referred to as wear- 
able devices or wearable technologies. 


Web browser A computer program used to 
access and display information resources on 
the World Wide Web. 


Web catalog Web pages containing mainly 
links to other Web pages and sites. 


Web Services Description Language (WSDL) An 
XML-based language used to describe the 
attributes of a web service, such as a SOAP 
service. 


Web-based technologies Computer capabili- 
ties that rely on the architecture principles of 
the Internet for accessing data from remote 
servers. 


Weblogs/blogs A type of Web site that pro- 
vides discussion or information on various 
topics. 


WebMD An American company that provides 
web-based health information services. 


Whole Slide Digitization The process of cap- 
turing an entire specimen on a slide into a dig- 
ital image. Compared with capturing images 
of a single field of view from a microscope, 
this captures the entire specimen, and can be 
millions of pixels on a side. This allows subse- 
quent or remote review of the specimen with- 
out requiring capture of individual fields. 


Wide-area networks (WANs) A network that 
connects computers owned by independent 
institutions and distributed over long dis- 
tances. 


Wi-Fi A common wireless networking technology 
(IEEE 802.11x) that uses radio waves 
to provide high-speed connections to the 
Internet and local networks. 


Word In computer memory, a sequence of 
bits that can be accessed as a unit. 


Word sense disambiguation (WSD) The pro- 
cess of determining the correct sense of a 
word in a given context. 


Word senses The possible meanings of a 
term. 


Word size The number of bits that define a 
word in a given computer. 


Workstation A powerful desktop computer 
system designed to support a single user. 
Workstations provide specialized hardware 
and software to facilitate the problem-solving 
and information-processing tasks of profes- 
sionals in their domains of expertise. 


World Intellectual Property Organization 
(WIPO) An international organization, headquartered 
in Geneva and dedicated to promoting 
the use and protection of intellectual 
property. 


World Wide Web (WWW or Web) An application 
implemented on the Internet in which 
multimedia information resources are made 
accessible by any of a number of protocols, 
the most common of which is the HyperText 
Transfer Protocol (HTTP). 


Worm A self-replicating computer program, 
similar to a computer virus; a worm is self- 
contained and does not need to be part of 
another program to propagate itself. 


xAPI The Experience Application Programming 
Interface goes beyond interoperability standards, 
such as SCORM, and supports collection 
of data about the learner’s experience 
while using the learning object. 


XML A metalanguage that allows users to 
define their own customized markup lan- 
guages. See Extensible Markup Language. 


X-ray crystallography A technique in crystal- 
lography in which the pattern produced by 
the diffraction of x-rays through the closely 
spaced lattice of atoms in a crystal is recorded 
and then analyzed to reveal the nature of that 
lattice, generally leading to an understanding 
of the material and molecular structure of a 
substance. 